Tag: Kubernetes

  • Creating a simple Nginx-based web server image

    One of the hardest parts of blogging is identifying topics. I sometimes struggle with identifying things that I have done that would be interesting or helpful to others. In trying to establish a “rule of thumb” for such decisions, I think things that I have done at least twice qualify as potential topics. As it so happens, I have had to construct simple web server containers twice in the last few weeks.

    The Problem

    Very simply, I wanted to be able to build a quick and painless container to host some static web sites. They are mostly demo sites for some of the UI libraries that we have been building. One is raw HTML, the other is built using Storybook.js, but both end up being a set of HTML/CSS/JS files to be hosted.

    Requirements

    The requirements for this one are pretty easy:

    • Host a static website
    • Do not run as root

    There was no requirement to be able to change the content outside of the image: changes would be handled by building a new image.

    My Solution

    I have become generally familiar with Nginx for a variety of uses. It serves as a reverse proxy for my home lab and is my go-to ingress controller for Kubernetes. Since I am familiar with its configuration, I figured it would be a good place to start.

    Quick But Partial Success

    The “do not run as root” requirement led me to the Nginx unprivileged image. With that as a base, I tried something pretty quick and easy:

    # Dockerfile
    FROM nginxinc/nginx-unprivileged:1.20 as runtime
    
    
    COPY output/ /usr/share/nginx/html

    Where output contains the generated HTML files that I wanted to host.

    This worked great for the first page that loaded. However, links to other pages within the site kept coming back from Nginx with :8080 as the port. Our networking configuration offloads SSL outside of the cluster and uses ingress within the cluster, so I did not want any port forwarding at all.

    Custom Configuration Completes the Set

    At that point, I realized that I needed to configure Nginx to disable the port redirects, and then include the new configuration in my container. So I traipsed through the documentation for the Nginx containers. As it turns out, the easiest way to configure these images is to replace the default.conf file in the /etc/nginx/conf.d folder.

    So I went about creating a new Nginx config file with the appropriate settings:

    server {
      listen 8080;
      server_name localhost;
      port_in_redirect off;

      location / {
        root /usr/share/nginx/html;
        index index.html index.htm;
      }

      error_page 500 502 503 504 /50x.html;
      location = /50x.html {
        root /usr/share/nginx/html;
      }
    }

    From there, my Dockerfile changed only slightly:

    # Dockerfile
    FROM nginxinc/nginx-unprivileged:1.20 as runtime
    COPY nginx/default.conf /etc/nginx/conf.d/default.conf
    COPY output/ /usr/share/nginx/html

    Success!

    With those changes, the image built with the appropriate files and the links no longer had the port redirect. Additionally, my containers are not running as root, so I do not run afoul of our cluster’s policy management rules.

    Hope this helps!

  • Breaking an RKE cluster in one easy step

    With Ubuntu’s latest LTS release (22.04, or “Jammy Jellyfish”), I wanted to upgrade my Kubernetes nodes from 20.04 to 22.04. What I had hoped would be an easy endeavor turned out to be a weeks-long process with destroyed clusters and, ultimately, an etcd issue.

    The Hypothesis

    As I viewed it, I had two paths to this upgrade: in-place upgrade on the nodes, or bring up new nodes and decommission the old ones. As the latter represents the “Yellowstone” approach, I chose that one. My plan seemed simple:

    • Spin up new Ubuntu 22.04 nodes using Packer.
    • Add the new nodes to the existing clusters, assigning the new nodes all the necessary roles (I usually have one controlplane node, three etcd nodes, and all nodes as workers; see the cluster.yml sketch after this list)
    • Remove the controlplane role from the old node and verify connectivity
    • Remove the old nodes (cordon, drain, and remove)
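
    For anyone unfamiliar with how RKE describes nodes, here is a minimal, hypothetical sketch of the nodes section of an RKE cluster.yml reflecting the role layout above; the addresses, SSH user, and key path are placeholders rather than my actual configuration:

    # cluster.yml (sketch): node roles for a small RKE cluster
    nodes:
      - address: 10.0.0.11                  # new 22.04 node
        user: ubuntu                        # placeholder SSH user
        ssh_key_path: ~/.ssh/id_rsa
        role: [controlplane, etcd, worker]
      - address: 10.0.0.12
        user: ubuntu
        ssh_key_path: ~/.ssh/id_rsa
        role: [etcd, worker]
      - address: 10.0.0.13
        user: ubuntu
        ssh_key_path: ~/.ssh/id_rsa
        role: [etcd, worker]

    Adding or removing a node (or a role) is then a matter of editing this list and running rke up again, which is exactly where my trouble started.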

    Internal Cluster

    After updating my Packer scripts for 22.04, I spun up new nodes for my internal cluster, which has an ELK stack for log collection. I added the new nodes without a problem, and thought that maybe I could combine the last two steps and just remove all the nodes at the same time.

    That ended up with the RKE CLI getting stuck checking etcd health. I may have gotten a little impatient and just killed the process mid-job. This left me with, well, a dead internal cluster. So I recovered the cluster (see my note on cluster recovery below) and thought I’d try again with my non-production cluster.

    Non-production Cluster

    Some say that the definition of insanity is doing the same thing and expecting a different result. My logic, however, was that I made two mistakes the first time through:

    • Trying to remove the controlplane alongside the etcd nodes in the same step
    • Killing the RKE CLI command mid-stream

    So I spun up a few new non-production nodes, added them to the cluster, and simply removed controlplane from the old node.

    Success! My new controlplane node took over, and cluster health seemed good. And, in the interest of only changing one variable at a time, I decided to try and remove just one old node from the cluster.

    Kaboom….

    Same etcd issue as before. So I recovered the cluster and returned to the drawing board.

    Testing

    At this point, I only had my Rancher/Argo cluster and my production cluster, which houses, among other things, this site. I had no desire for wanton destruction of these clusters, so I set up a test cluster to see if I could replicate my results. I was able to, at which point I turned to the RKE project on GitHub for help.

    After a few days, someone pointed me to a relatively new Rancher issue describing my predicament. If you read through those various issues, you’ll find that etcd 3.5 has an issue where node removal can corrupt its database, causing issues such as mine. The issue was corrected in 3.5.3.

    I upgraded my RKE CLI and ran another test with the latest Rancher Kubernetes version. This time, finally, success! I was able to remove etcd nodes without crashing the cluster.

    Finishing up / Lessons Learned

    Before doing anything, I upgraded all of my clusters to the latest supported Kubernetes version. In my case, this is v1.23.6-rancher1-1. Following the steps above, I was, in fact, able to progress through upgrading both my Rancher/Argo cluster and my production cluster without bringing down the clusters.

    Lessons learned? Well, patience is key (don’t kill cluster management processes mid-effort), but it is also sometimes worth running a test before trying things for real. Had any of these clusters NOT been my home lab clusters, this seemingly simple process would have incurred downtime in more important systems.

    A note on Cluster Recovery

    For both the internal and non-production clusters, I could have scrambled to recover the etcd volume for that cluster and brought it back to life. But I realized that there was no truly valuable data in either cluster. The ELK logs are useful in real time, but I have not started down the path of analyzing history, so I didn’t mind losing them. And even those are on my SAN, and the PVCs get archived when no longer in use.

    Instead of a long, drawn-out recovery process, I simply stood up brand new clusters, pointed my instance of Argo at them, and updated my Argo applications to deploy to the new clusters. Inside of an hour, my apps were back up and running. This is something of a testament to the benefits of storing a cluster’s state in a repository: recreation was nearly automatic.
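
    As an illustration of why the repointing was so quick: an Argo application only references its target cluster through its destination block, so switching clusters is essentially a one-line change. A minimal sketch (the app name, repo URL, and server address below are hypothetical):

    # application.yaml (sketch): only the destination changes when repointing
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: my-blog                     # hypothetical application
      namespace: argocd
    spec:
      project: default
      source:
        repoURL: https://git.example.com/cluster-production.git   # placeholder repo
        targetRevision: main
        path: manifests/my-blog
      destination:
        server: https://new-cluster.example.com:6443   # point at the new cluster here
        namespace: my-blog
      syncPolicy:
        automated: {}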

  • GitOps: Moving from Octopus to … Octopus?

    Well, not really, but I find it a bit cheeky that ArgoCD’s icon is, in fact, an orange octopus.

    There are so many different ways to provision and run Kubernetes clusters that, without some sense of standardization across the organization, Kubernetes can become an operations nightmare. And while a well-run operations environment can allow application developers to focus on their applications, a lack of standards and best practices for container orchestration can divert resources from application development to figuring out how a particular cluster was provisioned or deployed.

    I have spent a good bit of time at work trying to consolidate our company approach to Kubernetes. Part of this is working with our infrastructure teams to level set on what a “bare” cluster looks like. The other part is standardizing how the clusters are configured and managed. For this, a colleague turned me onto GitOps.

    GitOps – Declarative Management

    I spent a LOT of time just trying to understand how all of this works. For me, the thought of “well, in order to deploy I have to change a Git repo” seemed, well, icky. I like the idea that code changes, things are built, tested, and deployed. I had a difficult time separating application code from declarative state. Once I removed that mental block, however, I was bound and determined to move my home environments from Octopus Deploy pipelines to ArgoCD.

    I did not go on this journey alone. My colleague Oleg introduced me to ArgoCD and the External Secrets operator. I found a very detailed article with flow details by Florian Dambrine on Medium.com. Likewise, Burak Kurt showed me how to have Argo manage itself. Armed with all of this information, I started my journey.

    The “short short” version

    In an effort to retain the few readers I have, I will do my best to consolidate all of this into a few short paragraphs. I have two types of applications in my clusters: external tools and home-grown apps.

    External Tools

    Prior to all of this, my external tools were managed by running Helm upgrades periodically with values files saved to my PC. These tools included “cluster tools” which are installed on every cluster, and then things like WordPress and Proget for running the components of this site. Needless to say, this is highly dangerous: had I lost my PC, all those values files would be gone.

    I now have tool definitions stored either in the cluster’s specific Git repo, or in a special “cluster-tools” repository that allows me to define applications that I want installed to all (or some) of my clusters, based on labels in the cluster’s Argo secret. This allows me to update tools by updating the version in my Git repository and committing the change.
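
    As a minimal sketch of how that label-based targeting can be expressed (the repo URL, chart path, and label are placeholders, not my actual setup), an Argo CD ApplicationSet using the cluster generator looks roughly like this:

    # cluster-tools ApplicationSet (sketch): install a tool on every cluster labeled tools=enabled
    apiVersion: argoproj.io/v1alpha1
    kind: ApplicationSet
    metadata:
      name: cluster-tools-ingress
      namespace: argocd
    spec:
      generators:
        - clusters:
            selector:
              matchLabels:
                tools: enabled            # label on the cluster's Argo secret
      template:
        metadata:
          name: '{{name}}-ingress-nginx'
        spec:
          project: default
          source:
            repoURL: https://git.example.com/cluster-tools.git   # placeholder repository
            targetRevision: main
            path: charts/ingress-nginx
          destination:
            server: '{{server}}'
            namespace: ingress-nginx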

    It should be noted that, for these tools, Helm is still used to install/upgrade. More on this later.

    Home Grown Applications

    The home-grown apps had more of a development feel: feature branches push to a test environment, while builds from main get pushed to staging and then, upon my approval in Azure DevOps, pushed to production.

    Prior to the conversion, every build produced a new container image and Helm chart, both of which were published to ProGet. From there, Octopus Deploy deployed feature builds to the test environment, and deployed to staging and production based on nudges from Azure DevOps tasks.

    Using Florian’s described flow, I created a Helmfile repo for each of my projects, which allowed me to consolidate the application charts into a single repository. Using Helmfile and Helm, I generate manifests that are then committed into the appropriate cluster’s Git repository. Each Helmfile repository has its own pipeline for generating manifests and committing them to the cluster repository, but my individual project pipelines have gotten very simple: build a new version, then update the Helmfile repository to reflect the new image version.
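
    For reference, a Helmfile repository boils down to something like the following sketch (the chart feed, names, and versions are made up for illustration). Running helmfile template against it produces the plain manifests that get committed to the cluster repository, and the project pipeline only has to bump the image tag:

    # helmfile.yaml (sketch)
    repositories:
      - name: my-proget                       # placeholder chart feed
        url: https://proget.example.com/helm/my-feed

    releases:
      - name: my-api
        namespace: my-api
        chart: my-proget/my-api               # hypothetical chart
        version: 1.4.2
        values:
          - image:
              tag: "1.4.2"                    # the pipeline bumps this on each build
      - name: my-web
        namespace: my-web
        chart: my-proget/my-web
        version: 2.0.1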

    Once the cluster’s repository is updated, Argo takes care of syncing the resources.

    Helm versus Manifests

    I noted that I’m currently using Helm for external tools versus generating manifests (albeit from Helm charts, but still generating manifests) for internal applications. As long as you never actually run a helm install command, Argo will manage the application using manifests generated from the Helm chart. However, from what I have seen, if you have previously run helm install, that application hangs around in the cluster, and its Helm history doesn’t change with new versions. So you can get into an odd state where helm list shows older versions than what is actually installed.

    When using a tool like ArgoCD, you want to let it manage your applications from beginning to end. It will keep your cluster cleaner. For the time being, I am defining external tools using Helm templates, but using Helmfile to expand my home-built applications into manifest files.

  • Tech Tip – Turn on forwarded headers in Nginx

    I have been using Nginx as a reverse proxy for some time. In the very first iteration of my home lab, it lived on a VM and allowed me to point my firewall rules to a single target, and then route traffic from there. It has since been promoted to a dedicated Raspberry Pi in my fight with the network gnomes.

    My foray into Kubernetes in the home lab has brought Nginx in as an ingress controller. While there are many options for ingress, Nginx seems the most prevalent and, in my experience, the easiest to standardize on across a multitude of Kubernetes providers. As we drive to define what a standard K8s cluster looks like across our data centers and public cloud providers, Nginx seemed like a natural choice for our ingress provider.

    Configurable to a fault

    The Nginx Ingress controller is HIGHLY configurable. There are cluster-wide configuration settings that can be controlled through ConfigMap entries. Additionally, annotations can be used on specific ingress objects to control behavior for individual ingresses.
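
    As a generic illustration of a per-ingress tweak (not one of our actual ingresses; the host, service, and chosen annotation are placeholders):

    # ingress.yaml (sketch): per-ingress behavior via annotation
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: identity
      annotations:
        nginx.ingress.kubernetes.io/proxy-body-size: "10m"   # example per-ingress setting
    spec:
      ingressClassName: nginx
      rules:
        - host: identity.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: identity-server
                    port:
                      number: 8080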

    As I worked with one team to set up Duende’s Identity Server, we started running into issues with the identity server using http instead of https in its discovery endpoints (such as /.well-known/openid-configuration). Most of our research suggested that the X-Forwarded-* headers needed to be configured (which we did), but we were still seeing the wrong scheme in those endpoints.

    It was a weird problem: I had never run into this issue in my own Identity Server instance, which is running in my home Kubernetes environment. I figured it had to do with an Nginx setting, but had a hard time figuring out which one.

    One blog post pointed me in the right direction. Our Nginx ingress install did not have the use-forwarded-headers setting configured in the ConfigMap, which meant the X-Forwarded-* headers were not being passed to the pod. A quick change of our deployment project, and the openid-configuration endpoint returned the appropriate schemes.

    For reference, we are using the ingress-nginx helm chart. Adding the following to our values file solved the issue:

    controller:
      replicaCount: 2
      
      service:
        ... several service settings
      config:
        use-forwarded-headers: "true"

    Investigation required

    What I do not yet know is whether I configured this at home at some point and simply forgot about it, or whether it is a default of the Rancher Kubernetes Engine (RKE) installer. I use RKE at home to stand up my clusters, and one of the add-ons I have it configure is the Nginx ingress. Either I have settings in my RKE configuration to forward headers, or it’s a default of RKE…. Unfortunately, I am at a soccer tournament this weekend, so the investigation will have to wait until I get home.

    Update:

    Apparently I did know about use-forwarded-headers earlier: it was part of the options I had set in my home Kubernetes clusters. One of many things I have forgotten.
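
    For the curious, RKE lets you pass nginx ingress options straight through cluster.yml. Assuming that is how I set it (I have not pulled the file back up to verify), the relevant excerpt looks roughly like this:

    # cluster.yml excerpt (sketch): RKE ingress add-on options
    ingress:
      provider: nginx
      options:
        use-forwarded-headers: "true"   # same setting as the ConfigMap entry above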

  • Home Lab: Disaster Recovery and time for an Upgrade!

    I had the opportunity to travel to the U.S. Virgin Islands last week on a “COVID-delayed” honeymoon. I have absolutely no complaints: we had amazing weather, explored beautiful beaches, and got a chance to snorkel and scuba in some of the clearest water I have seen outside of a pool.

    Trunk Bay, St. John, US VI

    While I was relaxing on the beach, Hurricane Ida was wrecking the Gulf Coast and dropping rain across the East, including at home. This led to power outages, which led to my home server having something of a meltdown. And in this, I learned the value of good neighbors who can reboot my server and the cost of not setting up proper disaster recovery in Kubernetes.

    The Fall

    I was relaxing in my hotel room when I got text messages from my monitoring solution. At first, I figured, “The power went out; things will come back up in 30 minutes or so.” But after about an hour, nothing. So I texted my neighbor and asked if he could reset the server. After he did, most of my sites came back up, with the exception of some of my SQL-dependent sites. I’ve had some issues with SQL Server not starting its service correctly, so some sites were down… but that’s for another day.

    A few days later, I got the same monitoring alerts. My parents were house-sitting, so I had my mom reset the server. Again, most of my sites came up. Being in America’s Caribbean Paradise, I promptly forgot all about it, figuring things were fine.

    The Realization

    Boy, was I wrong. When I sat down at the computer on Sunday, I randomly checked my Rancher instance. Dead. My other clusters were running, but all of the clusters were reporting issues with etcd on one of the nodes. And, quite frankly, I was at a loss. Why?

    • While I have taken some Pluralsight courses on Kubernetes, I was a little overly dependent on the Rancher UI to manage Kubernetes. With it down, I was struggling a bit.
    • I did not take the time to find and read the etcd troubleshooting steps for RKE. Looking back, I could most likely have restored the etcd snapshots and been ok. Live and learn, as it were.

    Long story short, my hacking attempts to get things running again pretty much killed my Rancher cluster and made a mess of my internal cluster. Thankfully, the stage and production clusters were still running, but with a bad etcd node.

    The Rebuild

    At this point, I decided to throw in the towel and rebuild the Rancher cluster. So I deleted the existing rancher machines and provisioned a new cluster. Before I installed Rancher, though, I realized that my old clusters might still try to connect to the new Rancher instance and cause more issues. I took the time to remove the Rancher components from the stage and production clusters using the documented scripts.

    When I did this with my internal tools cluster… well, it made me realize there was a lot of unnecessary junk on that cluster. It was only running ELK (which was not even fully configured) and my Unifi Controller, which I moved to my production box. So, since I was already rebuilding, I decided to decommission my internal tools box and rebuild that as well.

    With two brand new clusters and two RKE clusters clean of Rancher components, I installed Rancher and got all the management running.

    The Lesson

    From this little meltdown and reconstruction I have learned a few important lessons.

    • Save your etcd snapshots off machine – RKE takes snapshots of your etcd regularly, and there is a process for restoring from a snapshot (see the cluster.yml sketch after this list). Had I known that those snapshots were there, I would have tried this before killing my cluster.
    • etcd is disk-heavy and “touchy” when it comes to latency – My current setup has my hypervisor using my Synology as an iSCSI disk for all my VMs. With 12 VMs running as Kubernetes nodes, any network disruption or I/O lag can cause leader changes. This is minimal during normal operation, but performing deployments or updates can sometimes cause issues. I have a small upgrade planned for the Synology to add a 1 TB SSD read/write cache, which will hopefully improve things, but I may end up creating a new subnet for iSCSI traffic to alleviate network hiccups.
    • Slow and steady wins the race – In my haste to get everything working again, I tried some things that did more harm than good. Had I done a bit more research and found the appropriate articles earlier, I probably could have recovered without rebuilding.
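
    On the snapshot front, RKE exposes recurring etcd snapshots, including shipping them off the node to S3-compatible storage, directly in cluster.yml. A sketch of what I believe the relevant keys look like (bucket, endpoint, and credentials are placeholders):

    # cluster.yml excerpt (sketch): recurring etcd snapshots stored off machine
    services:
      etcd:
        backup_config:
          enabled: true
          interval_hours: 6          # how often to snapshot
          retention: 24              # how many snapshots to keep
          s3backupconfig:            # optional: copy snapshots to S3-compatible storage
            endpoint: s3.example.com
            bucket_name: etcd-snapshots
            access_key: <access-key>
            secret_key: <secret-key>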

  • Hardening your Kubernetes Cluster: Don’t run as root!

    People sometimes ask me why I do not read for pleasure. As my career entails ingesting the NSA/CISA technical report on Kubernetes Hardening Guidance and translating it into actionable material, I ask that you let me enjoy hobbies that do not involve the written word.

    The NSA/CISA technical report came out on Tuesday, August 3. Quite coincidentally, my colleagues and I had started asking questions about our internal standards for securing Kubernetes clusters for production. Coupled with my current home lab experimentation, I figured it was a good idea to read through this document and see what I could do to secure my lab.

    Hopefully, I will eventually work through the whole document and describe how I’ve applied it to my home lab (or, at least, to the production cluster in my home lab). For now, though, I thought it prudent to look at the Pod Security section. And, as one might expect, the first recommendation is…

    Don’t run as root!

    For as long as I can remember working in Linux, not running as root was literally “step one.” So it amazes me that, by default, containers are configured to run as root. All of my home-built containers are pretty much the same: simple Dockerfiles that copy the results of an external dotnet publish command into the container and then run the dotnet entry point.

    My original Dockerfiles looked like this:

    FROM mcr.microsoft.com/dotnet/aspnet:5.0-focal AS base
    WORKDIR /app
    COPY . /app
    
    EXPOSE 80
    ENTRYPOINT ["dotnet", "my.dotnet.dll"]

    With some StackOverflow assistance, it now looks something like this:

    FROM mcr.microsoft.com/dotnet/aspnet:5.0-focal AS base
    WORKDIR /app
    COPY . /app

    # Create a group and user so we are not running our container and application as root (user 0), which is a security issue.
    RUN addgroup --system --gid 1000 mygroup \
        && adduser --system --uid 1000 --ingroup mygroup --shell /bin/sh myuser

    # Serve on port 8080; we cannot serve on port 80 with a custom user that is not root.
    ENV ASPNETCORE_URLS=http://+:8080
    EXPOSE 8080

    # Run all subsequent commands (and the container) as myuser; the numeric UID must be used here.
    USER 1000

    ENTRYPOINT ["dotnet", "my.dotnet.dll"]

    What’s with the port change?

    The Dockerfile changes are pretty simple: add the commands to create a new group and user and, using the USER command, tell Docker to execute as that user. But why change ASPNETCORE_URLS and the port? When running as a non-root user, you cannot bind to ports below 1024, so I had to change the exposed port. This necessitated some changes to my Helm charts and their service definitions, but, overall, the process was straightforward.
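
    On the Kubernetes side, the cluster can also enforce this at the pod level. A minimal, hypothetical pod spec excerpt (the image name is a placeholder) that pairs with the Dockerfile above might look like:

    # pod spec excerpt (sketch): enforce non-root at the cluster level
    spec:
      securityContext:
        runAsNonRoot: true            # kubelet refuses to start the container as UID 0
        runAsUser: 1000               # matches the UID created in the Dockerfile
      containers:
        - name: my-dotnet-app
          image: proget.example.com/my.dotnet:latest   # placeholder image
          ports:
            - containerPort: 8080     # the non-privileged port from ASPNETCORE_URLS
          securityContext:
            allowPrivilegeEscalation: false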

    My next steps

    When I find some spare time in the next few months, I’m going to revisit Pod Security Policies and, more specifically, their upcoming replacement: the PSP Replacement Policy. I find it amusing that the NSA/CISA released guidelines that specify usage of what is now a deprecated feature. I also find it typical of our field that the current name for the new version is, quite simply, “Pod Security Policy Replacement Policy.” I really hope they come up with a better name for that…

  • Moving the home lab to Kubernetes

    Kubernetes has become something of the standard for the orchestration of containers. While there are certainly other options, the Kubernetes platform remains the most prevalent. With that in mind, I decided to migrate my home lab from docker servers to Kubernetes clusters.

    Before: Docker Servers

    Long story short: my home lab has transitioned from Windows servers running IIS to a mix of Linux and Windows containers to Linux only containers. The apps are containerized, but the DBs still run on some SQL servers.

    Build and deployment are automated: builds run through Azure DevOps Pipelines and self-hosted agents (TeamCity before that), and deployment runs through Octopus Deploy. Container images for my projects live on a ProGet feed.

    The Plan

    “Consolidate” (and I’ll tell you later why that is in quotes) my servers into Kubernetes Clusters. It seemed an easy plan.

    • Internal K8s Cluster – Runs Rancher and any internal tooling (including Elastic/Kibana) that I want available, but not exposed externally
    • Non-production K8s Cluster – Runs my *.net and *.org sites, used for the test and staging environments
    • Production K8s Cluster – Runs my *.com sites (external), including any external tooling.

    I spent some time learning Packer to provision Hyper-V VMs for my clusters. The clusters all ended up with a control plane node (4 vCPU, 8 GB RAM) and two workers (2 vCPU, 6 GB RAM).

    The Results

    The Kubernetes Clusters

    There was a LOT of trial and error in getting Kubernetes going, particularly with Rancher. So much, in fact, that I probably provisioned the clusters 3 or 4 times each because I felt like I messed up and wanted to do it over again.

    Initially, I tried to manually provision the K8s cluster. Yes, it worked… but RKE is nicer. And, after my manually provisioned cluster went down, I provisioned the internal cluster with RKE. That makes updates easier, as I have the config file.

    I provisioned the non-production and production clusters using the Rancher GUI. However, Rancher itself was running on the manually provisioned cluster, so, when that went down, I lost the cluster configurations. I currently have two clusters that look like “imported” clusters in Rancher, so they are harder to manage through the Rancher GUI.

    Storage

    In order to utilize persistent volume claims, I configured NFS on my Synology and installed the nfs-subdir-external-provisioner in all of my clusters. It installs a storage class which can be used in persistent volume claims, and it provisions directories on my NFS share.
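
    As a quick illustration, claiming storage from that provisioner is just a matter of referencing its storage class in a PVC. The class name below is, I believe, the chart’s default; mine may differ:

    # pvc.yaml (sketch): dynamic NFS-backed volume
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: wordpress-data            # hypothetical claim
    spec:
      accessModes:
        - ReadWriteMany               # NFS allows shared read/write across pods
      storageClassName: nfs-client    # class created by nfs-subdir-external-provisioner
      resources:
        requests:
          storage: 5Gi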

    Ingress

    Right now, I’m using the Nginx Ingress controller from Rancher. I haven’t played with it much, other than the basics. Perhaps more on that when I dig in.

    Current Impressions

    Rancher

    It works… but mine is flaky. I think it may be due to some resource starvation. I may try to provision a new internal cluster with better VMs and see how that works.

    I do like the deployment of clusters using RKE; however, I can see how it would be difficult to manage when there is more than one person involved.

    Kubernetes

    Once it is running, it’s great: creating new APIs or apps and getting them running in a scalable fashion is easy. Helm charts make deployment and updating a snap.

    That said, I would not trust myself to run this in production without a LOT more training.
