Note: This post was originally posted on my personal blog. I have copied the content to this blog.
(Links to code for my environment at the bottom)
I didn’t like how my main unRAID server was a single point of failure, so I set out to build a Kubernetes cluster. Initially I used Rancher to create it, and it was very easy: install Docker, run a single command on the first master node, and run a different command on each worker node to join.
This worked very well, and it was very useful to see cluster resources in a GUI. Creating and managing secrets and configmaps, and restarting pods, were all very simple.
I ran with a 2-node setup for quite a while (a VM running on my unRAID host as the master, and an old laptop given a second life as a worker node).
Eventually, the fragility of the setup and my need to keep messing with things got me thinking about how I could improve. I brought in my old desktop that I had stopped playing games on (which would make younger versions of me very sad to hear) (I do still game on the Nintendo Switch) as another worker node. I configured GPU passthrough to the pods so both nodes could transcode in Plex more efficiently. This was where things started getting interesting, as I shuffled pods around between the nodes. At this point I was running primarily Plex and Bitcoin Core.
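For reference, GPU passthrough to a pod mostly comes down to requesting the GPU resource that the node’s device plugin exposes. Here’s a rough sketch assuming the NVIDIA device plugin; the image, names, and resource key are illustrative, not my exact manifest:

```yaml
# Minimal sketch of a Plex deployment requesting a GPU for transcoding.
# Assumes the NVIDIA device plugin is running on the node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plex   # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: plex
  template:
    metadata:
      labels:
        app: plex
    spec:
      containers:
      - name: plex
        image: plexinc/pms-docker:latest
        resources:
          limits:
            nvidia.com/gpu: 1   # resource exposed by the NVIDIA device plugin
```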
I also started using MetalLB. It’s very straightforward, and really cool to see IPs get picked out of the pool and just work on my network.
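For the curious, the setup is basically just handing MetalLB a pool of addresses on your LAN. Something like this sketch, assuming the older ConfigMap-based configuration (the IP range here is made up):

```yaml
# Layer 2 address pool for MetalLB (ConfigMap-style config).
apiVersion: v1
kind: ConfigMap
metadata:
  name: config
  namespace: metallb-system
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250   # placeholder range on the local network
```

Any Service of type LoadBalancer then gets handed an IP from that range.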
Running those, it became clear that I needed better storage management. I had to confine pods with storage to specific nodes, because that’s where the storage was. I had a NAS, but it was already being used for a lot on my network, and its access times and transfer speeds paled in comparison to local storage.
I happened upon Longhorn and I thought it was a cool idea. I had seen other similar technologies (ones that were more mature even) but I felt experimental. So I installed it in my cluster (thanks Rancher app installer!).
Longhorn was deceptively easy to set up; just install and throw some storage at it. I ensured I had backups of my data and configured my workloads to use it.
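Pointing a workload at Longhorn is just a matter of using its StorageClass in a PVC. A minimal sketch (the name and size are placeholders):

```yaml
# PersistentVolumeClaim backed by Longhorn.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: plex-config        # placeholder name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: longhorn   # the StorageClass Longhorn installs
  resources:
    requests:
      storage: 20Gi            # placeholder size
```

Longhorn replicates the volume across nodes, which is what freed me from pinning storage-heavy pods to specific machines.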
It proved to be a little brittle, but overall reliable enough for me to trust it with my data (though there are some rules you NEED to know about Longhorn):
It was around this time that I started experimenting with other apps, such as ElectrumX and btc-rpc-explorer. They had storage requirements too, which gave me experience with managing multiple apps and volumes in Rancher.
While I was playing with all of that, I set up NewRelic and Terraform for my cluster as well. I had experience with NewRelic, and wanted to get to know Terraform better. A friend recommended Atlantis, so I set that up too. It was a little tricky, but now I can plan and apply Terraform right from GitHub pull requests whenever I want. I also started dabbling in PagerDuty. NewRelic and PagerDuty were both configured via Terraform as much as possible (I even ended up building a dashboard in Terraform code!).
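The Atlantis side is mostly a repo-level config telling it which directory holds the Terraform. Roughly something like this, assuming the standard atlantis.yaml repo config (the project name and directory are placeholders, not my actual layout):

```yaml
# atlantis.yaml at the root of the Terraform repo.
version: 3
projects:
- name: cluster        # placeholder project name
  dir: terraform       # placeholder directory containing the .tf files
  autoplan:
    when_modified:
    - "*.tf"           # re-plan the PR whenever Terraform files change
```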
I started feeling the restrictiveness of setting a cluster up with Rancher. I wrestled with kube-proxy and kube-dns (I could only run one instance, or else my cluster’s DNS would have issues). I also had trouble managing Rancher upgrades, and the Rancher documentation generally assumed you didn’t do what I did, since my setup was really only meant for development and demonstration purposes.
So I set out to remake my whole cluster, lol.
I decided to go the kubeadm route. The only real thinking I ended up having to do was deciding what my subnets should be (half of the whole 192.168 space should be good, right?).
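If you’re wondering where those subnets actually go, kubeadm takes them in its ClusterConfiguration. A sketch of the idea, assuming kubeadm’s v1beta2 config API; the CIDRs are illustrative, not necessarily what I picked:

```yaml
# Passed to `kubeadm init --config <file>`.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  podSubnet: 192.168.128.0/17   # illustrative: "half of the 192.168 space" for pods
  serviceSubnet: 10.96.0.0/12   # kubeadm's default service CIDR
```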
Since I had a lot of what I did defined in code, it was pretty easy to set everything back up; I think it took me about a day. At the same time, I added a new server so I could have 3 worker nodes (though the master is still a VM, which I’ll address at the end).
I also decided to set up Flux, which was mostly straightforward. I have a repo with a few Kustomization manifests and one Helm chart that get automatically re-applied any time there are changes.
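For a rough idea of what the Flux side looks like: each Kustomization points at a path in the Git repo and gets reconciled on an interval. A sketch assuming Flux v2’s GitRepository and Kustomization CRDs (the repo URL and path are placeholders):

```yaml
# Source: where Flux pulls manifests from.
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: cluster-repo
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/example/homelab   # placeholder repo URL
  ref:
    branch: main
---
# Reconciliation: apply everything under a path in that repo.
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m
  path: ./apps        # placeholder path within the repo
  prune: true         # remove resources that are deleted from Git
  sourceRef:
    kind: GitRepository
    name: cluster-repo
```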
After a good amount of tweaking and playing around with things, here’s what my cluster looks like now:
I also run the following applications:
With the master being a VM, I decided I will make it physical and also have three of them so I can have a more robust cluster. I currently have a Dell Wyse 5040 on order, which is a small quad-core Atom computer with 2GB of RAM, normally meant to be a thin client or terminal rather than a full-fledged computer. I’m hoping I can get away with spending about $120 each for those plus some storage for each k8s master. The RAM won’t be an issue, because the current master only uses 800MB of RAM (plus my cluster won’t expand much more after this), but it remains to be seen whether a quad-core Atom processor can keep up with the demands of a Kubernetes cluster.
My Servers: