Hitting for the cycle…

Well, I may not be hitting for the cycle, but I am certainly cycling Kubernetes nodes like it is my job. The recent OpenSSL security patches got me thinking that I need to cycle my cluster nodes.

A Quick Primer

In Kubernetes, a “node” is, well, a machine performing some bit of work. It could be a VM or a physical machine, but, at its core, it is a machine with a container runtime of some kind that runs Kubernetes workloads.

Using the Rancher Kubernetes Engine (RKE), each node can have one or more of the following roles

worker – The node can host user workloads
etcd – The node is a member of the etcd storage cluster
controlplane – The node is a member of the control plane

Cycle? What and why

When I say “cycling” the nodes, I am actually referring to the process of provisioning a new node, adding the new node to the cluster, and removing an old node from the cluster. I use the term “cycling” because, when the process is complete, my cluster is “back where it started” in terms of resourcing, but with a fresh and updated node.

But, why cycle? Why not just run the necessary security patches on my existing nodes? In my view, even at the home lab level, there a few reasons for this method.

A Clean and Consistent Node

As nodes get older, they incur the costs of age. They may have some old container images from previous workloads, or some cached copies of various system packages. In general, they collect stuff, and a fresh node has none of that cost. By provisioning new nodes, we can ensure that the latest provisioning is run and all the necessary updates are installed.

By using newly provisioned nodes each time, it prevents me from applying special configuration to nodes. If I need a particular configuration on a node, well, it has to be done in the provisioning scripts. All my cluster nodes are provisioned the same way, which makes them much more like cattle than pets.

No Downtime or Capacity Loss

Running node maintenance can potentially require a system reboot or service restarts, which can trigger downtime. In order to prevent downtime, it is recommended that nodes be “cordoned” (prevent new workload from being scheduled on that node) and “drained” (remove workloads from the node).

Kubernetes is very good at scheduling and moving workloads between nodes, and there is built-in functionality for cordon and draining of nodes. However, to prevent downtime, I have to remove a node for maintenance, which means I lose some cluster capacity during maintenance.

When you think about it, to do a zero-downtime update, your choices are:

Take a node out of the cluster, upgrade it, then add it back
Add a new node to the cluster (already upgraded) then remove the old node.

So, applying the “cattle” mentality to our infrastructure, it is preferred to have disposable assets rather than precisely manicured assets.

RKE performs this process for you when you remove a node from the cluster. That means, running rke up will remove old nodes if they have been removed from your cluster.yml.

To Do: Can I automate this?

As of right now, this process is pretty manual, and goes something like this:

Provision a new node – This part is automated with Packer, but I really only like doing 1 at a time. I am already running 20+ VMs on my hypervisor, I do not like the thought of spiking to 25+ just to cycle a node.
Remove nodes, one role at a time – I have found that RKE is most stable when you only remove one role at a time. What does that mean? It means, if I have a node that is running all three nodes, I need to remove the control plane role, then run rke up. Then remove the etcd role, and run rke up again. Then remove the node completely. I have not had good luck simply removing a node with all three roles.
Ingress Changes – I need to change the IPs on my cluster in two places:
- In my external load balancer, which is a Raspberry Pi running nginx.
- In my nginx-ingress installation on the cluster. This is done via my GitOps repository.
DNS Changes – I have aliases setup for the control plan nodes so that I can swap them in and out easily without changing other configurations. When I cycle a control plane node, I need to update the DNS.
Delete Old Node – I have a small Powershell script for this, but it is another step.

It would be wonderful if I could automate this into an Azure DevOps pipeline, but there are some problems with that approach.

Packer’s Hyper-V builder has to run on the host machine, which means I need to be able to execute the Packer build commands on my server. I’m not about about put the Azure DevOps agent directly on my server.
I have not found a good way to automate DNS changes, outside of using the Powershell module.
I need to automate the IP changes in the external proxy and the cluster ingress. Both of which are possible but would require some research on my end.
I would need to automate the RKE actions, specifically, adding new nodes and deleting roles/nodes from the cluster, and then running rke up as needed.

None of the above is impossible, however, it would require some effort on my part to research some techniques and roll them up into a proper pipeline. For the time being, though, I have set my “age limit” for nodes at 60 days, and will continue the manual process. Perhaps, after a round or two, I will get frustrated enough to start the automation process.