Tag: Kubernetes

  • From nginx to HAProxy: A Zero-Downtime Migration That Found a Dead Server Along the Way

    In Part 4, I moved m5proxy—my BananaPi M5 reverse proxy—off the ProCurve switch and directly onto the UCG Max. Edge service at the network edge. Clean topology. Fewer hops. Good.

    What I didn't fix was the underlying problem: nginx has no idea if the servers behind it are actually alive.

    The Problem With nginx (For This Use Case)

    nginx is great. I'm not here to bash it. But my setup exposed one of its real limitations: DNS-based backends with no health checking.

    My Kubernetes clusters are accessed via DNS round-robin. tfx-production.gerega.net resolves to multiple worker node IPs. nginx uses those IPs… once, at startup (or reload). After that, it holds them in memory. If a node goes down, nginx keeps routing to it until the next config reload. There's no "hey, this backend isn't responding" logic.
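    For context, the nginx side of one of these backends looked roughly like this (a sketch; the real config may differ). The hostname in the upstream block is resolved exactly once, when the configuration loads:

    upstream k8s_production {
        # Resolved at config load; open-source nginx never re-queries DNS
        # here and performs no active health checks on these servers.
        server tfx-production.gerega.net:30080;
    }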

    The failure mode is unpleasant: requests to the dead node time out. nginx eventually gives up and returns a 502. Your service looks down even though the cluster is fine—you're just unlucky enough to land on the dead node.

    I'd lived with this because the clusters were stable. But the right fix was obvious: something that actually checks whether backends are alive.

    Enter HAProxy

    HAProxy solves this with three features I actually wanted:

    Active health checks. TCP probe every 10 seconds per node. Two failures → node is marked down. Two recoveries → node is marked up. No human intervention, no reload required.

    Dynamic DNS re-resolution. The server-template + resolvers combination means HAProxy re-queries DNS periodically. When node IPs change (cycling nodes, adding capacity), HAProxy picks them up automatically.

    Stats dashboard. A real-time view of every backend: how many slots are allocated, how many are up, request rates, response times. Port 9000, built in, no Grafana required.

    The Migration Plan: Parallel First, Cutover Later

    I wasn't going to just swap nginx for HAProxy and hope for the best. The plan was:

    1. Install HAProxy, run it on port 8443 while nginx stays on 443
    2. Validate every single backend through HAProxy before touching nginx
    3. Flip ports, stop nginx, done

    This is the "strangler fig" approach for proxies: the new thing runs alongside the old thing, gets tested under real conditions, and only takes over when you're confident.

    Phase 1: Install and Configure

    HAProxy 2.8.16 is in the Ubuntu 24.04 repos. Installation is one command.

    The trickier part was certificates. nginx uses separate ssl_certificate and ssl_certificate_key directives. HAProxy wants them concatenated into a single PEM file: fullchain first, then private key.

    mkdir -p /etc/haproxy/certs
    for domain in mattgerega.net mattgerega.com mattgerega.org; do
        cat "/etc/letsencrypt/live/$domain/fullchain.pem" \
            "/etc/letsencrypt/live/$domain/privkey.pem" \
            > "/etc/haproxy/certs/$domain.pem"
        chmod 600 "/etc/haproxy/certs/$domain.pem"
    done

    Then a deploy hook in /etc/letsencrypt/renewal-hooks/deploy/ so certbot rebuilds the PEMs and reloads HAProxy automatically on every renewal. The hook is five lines of bash. Certbot's DNS-01 challenge via Cloudflare has zero dependency on port 80, so swapping the web server is invisible to the renewal process.
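    Expanded slightly with comments, the hook amounts to this (the filename is hypothetical; the paths match the setup above):

    #!/bin/bash
    # /etc/letsencrypt/renewal-hooks/deploy/haproxy-rebuild-pems.sh
    # Rebuild the combined PEMs after each renewal, then reload HAProxy.
    for domain in mattgerega.net mattgerega.com mattgerega.org; do
        cat "/etc/letsencrypt/live/$domain/fullchain.pem" \
            "/etc/letsencrypt/live/$domain/privkey.pem" \
            > "/etc/haproxy/certs/$domain.pem"
        chmod 600 "/etc/haproxy/certs/$domain.pem"
    done
    systemctl reload haproxy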

    The backend config for the Kubernetes clusters uses server-template, which is HAProxy's way of saying "allocate N server slots, resolve this DNS name, and fill them dynamically":

    backend be_prod_envoy
        mode http
        balance roundrobin
        option tcp-check
        server-template tfx-prod 5 tfx-production.gerega.net:30080 \
            check inter 10s fall 2 rise 2 \
            resolvers localns resolve-prefer ipv4 init-addr none

    init-addr none tells HAProxy not to panic if DNS doesn't resolve at startup. It'll retry. The resolvers localns block points directly at my UCG Max DNS (192.168.60.1:53), bypassing systemd-resolved entirely.
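    The resolvers localns section itself is short; mine is roughly this (the timeout and hold values here are illustrative defaults, not tuned numbers):

    resolvers localns
        nameserver ucg 192.168.60.1:53
        resolve_retries 3
        timeout resolve 1s
        timeout retry   1s
        hold valid 10s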

    Phase 2: Validating Backends

    With HAProxy running on 8443, I curled every vhost:

    200  argo.mattgerega.net
    302  grafana.mattgerega.net
    200  cloud.mattgerega.net
    200  home.mattgerega.net
    ...

    A few things showed up immediately.

    The dead server. be_ha_test was showing down in the stats dashboard. This was a backend pointing at 192.168.1.115:8123—a Home Assistant test instance. 100% packet loss. Completely unreachable. The nginx config for ha-test.mattgerega.net had its IP restrictions commented out, so it was silently routing to a dead host this whole time.

    nginx had no idea. It was happily accepting requests for ha-test.mattgerega.net and forwarding them into the void.

    HAProxy told me in about 20 seconds of running.

    The fix: remove the backend entirely. The test instance is gone for good.

    The Garage backends. The Garage S3 API (port 3900) and web UI (port 3902) had no health checks at all—I'd just copied the structure from nginx without thinking. Added TCP checks to both. Garage S3 returns 403 on unauthenticated requests, and the web UI returns 404 at root, so HTTP checks would require widening the acceptable status range. TCP checks are simpler and answer the actual question: is the port open?
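    The Garage S3 backend ended up looking something like this (the server address is a placeholder, not the actual Garage host):

    backend be_garage_s3
        mode http
        # TCP connect only: sidesteps the 403/404 status-code question
        option tcp-check
        server garage 192.168.60.50:3900 check inter 10s fall 2 rise 2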

    The IP restrictions. My config denies certain backends to everything except specific subnets—internal cluster access is LAN-only, for example. Testing from the proxy server itself (192.168.60.x, the Services VLAN) correctly returned 403 for backends that don't allow the Services network. The ACLs were firing as designed.

    Phase 3: Cutover

    Three changes to the config:

    • bind *:8443 → bind *:443
    • bind *:8444 → bind *:5001 (Synology DSM passthrough)
    • Add back the port 80 frontend for HTTP→HTTPS redirects
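    The restored port 80 frontend is nothing more than a redirect shim, roughly:

    frontend fe_http
        mode http
        bind *:80
        http-request redirect scheme https code 301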

    Then:

    sudo systemctl stop nginx
    sudo systemctl disable nginx
    sudo systemctl reload haproxy

    Total downtime: zero. HAProxy's reload is graceful—it forks a new worker with the new config while the old worker finishes its in-flight connections, then exits. No dropped requests.

    Smoke test from the proxy server itself confirmed HTTPS backends responding and HTTP redirect returning 301.

    What I Actually Got

    Immediate visibility. The stats dashboard at :9000 shows every backend, every server slot, whether each node is up or down, and live request rates. I can see at a glance that production has 2 active nodes, internal has 4, nonprod has 3. Before, I had no idea.

    Automatic failover. If a K8s worker node goes down, HAProxy stops sending it traffic within 20 seconds (2 failed checks at 10s intervals) without any intervention. nginx would have kept routing to it until I noticed and reloaded.

    The dead server caught. I had a backend pointing at a dead host for an unknown amount of time. Only found it because HAProxy's health check turned it red immediately.

    Dynamic node discovery. DNS re-resolution means adding a worker node to the cluster automatically makes it available to HAProxy. Remove a node gracefully, and it disappears from DNS, then from HAProxy's active pool.

    The Part That Surprised Me

    I expected the migration to be technically interesting but operationally boring. It was the opposite.

    The most valuable thing wasn't the health checking or the stats dashboard. It was running the parallel validation and discovering that ha-test.mattgerega.net had been routing to a dead host for who knows how long. nginx had never complained. Every request to that vhost was just… silently failing.

    That's the real argument for active health checking: it doesn't just prevent future failures. It surfaces the failures that are already happening that you don't know about yet.

    What's Next

    Phase 4 of the HAProxy migration plan covers routing K8s API server traffic (port 6443) through HAProxy via TCP passthrough with SNI-based cluster routing. Right now, kubeconfigs point directly at control plane nodes. Routing through HAProxy means active health checking for the API server too—if a control plane node goes down, kubectl automatically routes to a healthy one.

    That involves certificate SANs, RKE2 certificate rotation, and the usual amount of "this should be simple" complexity. Future post.


    Part 5 of the home network rebuild series. Read Part 4: Network Hops and Reverse Proxy Placement

  • The Best Infrastructure Migration Is the One You Forget About

    In December, I migrated all of my home lab clusters from NGINX Ingress to Envoy Gateway. I wrote about the process in Modernizing the Gateway, covering the phased rollout, the HTTPRoute conversions, and the cleanup of years of Traefik and NGINX configuration.

    Three months later, the most honest thing I can say about the migration is this: I forgot it happened.

    No incidents. No late-night debugging sessions. No "why is this service unreachable" Slack messages to myself. The traffic flows, TLS terminates, and routes resolve. Envoy Gateway has been completely invisible — and if you've ever operated a reverse proxy, you know that invisible is the highest compliment you can pay one.

    The Clock Ran Out

    This month — March 2026 — NGINX Ingress Controller is officially retired. No more releases, no security patches, no bug fixes. The Kubernetes SIG Network announcement from November 2024 made the recommendation clear: migrate to the Gateway API.

    If you're still running NGINX Ingress, you're now running unsupported software in your ingress path. That's the component handling every request into your cluster. It's not the place to carry technical debt.

    I got ahead of the deadline by about three months. Not because I'm unusually disciplined — I just happened to be in the middle of a network rebuild and the timing lined up. But even if it hadn't, the migration turned out to be straightforward enough that I'd have been comfortable doing it under pressure.

    Why This Migration Was Boring (In the Best Way)

    Not every migration I've written about on this blog has gone smoothly. I've had my share of "what do you mean the cluster won't come back up" moments. So what made this one different?

    The ecosystem was ready. Most of the Helm charts I use already supported Gateway API resources natively. I wasn't hand-rolling HTTPRoute manifests from scratch or writing custom templates — the charts had gateway configuration sections waiting to be enabled. When the tooling meets you where you are, migrations shrink from projects to tasks.

    The Gateway API is genuinely better. HTTPRoute is more expressive than Ingress. The explicit parentRefs model makes it immediately clear which gateway handles which route. No more guessing which ingress class annotation you need, or whether the controller will actually pick up your resource. The separation between infrastructure operators (who manage Gateways) and application developers (who manage Routes) maps cleanly to how I think about my own deployments, even as a team of one.

    Envoy is battle-tested. Envoy Gateway is the new part, but Envoy proxy has been handling production traffic at massive scale for years. I wasn't betting on an unproven proxy — I was adopting a new management layer for a proven one. The EnvoyProxy custom resource gives real control over proxy behavior without the annotation soup that NGINX Ingress required.

    What I'm Not Using

    Honesty requires mentioning the features I haven't touched. Envoy Gateway supports traffic splitting, request mirroring, and multi-cluster routing. I'm not using any of them.

    Multi-cluster traffic management is handled by Linkerd's multicluster extension in my setup, and it works well. I didn't migrate to Envoy Gateway to replace that — I migrated because my ingress controller was being retired and the Gateway API is the clear path forward.

    The advanced routing features are there if I need them. I don't yet. And I'd rather have capabilities on the shelf than be forced into a migration when I actually need them.

    The Honest Recommendation

    If you're still on NGINX Ingress: migrate now. Not next quarter, not when you "have time." The retirement is here, and the migration is genuinely not that bad.

    Here's what worked for me:

    1. Deploy Envoy Gateway alongside your existing ingress. Don't rip and replace. Run both, migrate routes one at a time.
    2. Start with your least critical service. Get comfortable with HTTPRoute syntax on something that won't page you.
    3. Check your Helm charts. You might be surprised how many already support Gateway API resources. The migration might be a values file change, not a manifest rewrite.
    4. Clean up after yourself. Once everything is migrated, remove the old ingress controller entirely. Don't leave it running "just in case." Dead configuration is how you end up with 124 devices on your network and no idea what half of them do.
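    As an illustration of point 3, a chart with native Gateway API support may need only a values change along these lines (the keys vary by chart; these names are hypothetical):

    ingress:
      enabled: false          # retire the legacy Ingress resource
    httpRoute:
      enabled: true           # let the chart emit an HTTPRoute instead
      parentRefs:
        - name: main
          namespace: envoy-gateway-system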

    The whole process took me about a week across three clusters, and most of that was methodical testing rather than actual problem-solving.

    The Takeaway

    I've written nearly 200 posts on this blog, and a surprising number of them are migration stories. Some were painful. Some were educational. Some were both. This one was neither — and that's the best possible outcome.

    The goal of infrastructure isn't to be interesting. It's to be invisible. Envoy Gateway got out of the way and let me focus on the things that actually matter. Three months in, I have nothing to report. And sometimes, that's the best review I can give.

  • Simplifying Internal Routing

    Centralizing Telemetry with Linkerd Multi-Cluster

    Running multiple Kubernetes clusters is great until you realize your telemetry traffic is taking an unnecessarily complicated path. Each cluster had its own Grafana Alloy instance dutifully collecting metrics, logs, and traces—and each one was routing through an internal Nginx reverse proxy to reach the centralized observability platform (Loki, Mimir, and Tempo) running in my internal cluster.

    This worked, but it had that distinct smell of “technically functional” rather than “actually good.” Traffic was staying on the internal network (thanks to a shortcut DNS entry that bypassed Cloudflare), but why route through an Nginx proxy when the clusters could talk directly to each other? Why maintain those external service URLs when all my clusters are part of the same infrastructure?

    Linkerd multi-cluster seemed like the obvious answer for establishing direct cluster-to-cluster connections, but the documentation leaves a lot unsaid when you’re dealing with on-premises clusters without fancy load balancers. Here’s how I made it work.

    The Problem: Telemetry Taking the Scenic Route

    My setup looked like this:

    Internal cluster: Running Loki, Mimir, and Tempo behind an Nginx gateway

    Production cluster: Grafana Alloy sending telemetry to loki.mattgerega.net, mimir.mattgerega.net, etc.

    Nonproduction cluster: Same deal, different tenant ID

    Every metric, log line, and trace span was leaving the cluster, hitting the Nginx reverse proxy, and finally making it to the monitoring services—which were running in a cluster on the same physical network. The inefficiency was bothering me more than it probably should have.

    This meant:

    – An unnecessary hop through the Nginx proxy layer

    – Extra TLS handshakes that didn’t add security value between internal services

    – DNS resolution for external service names when direct cluster DNS would suffice

    – One more component in the path that could cause issues

    The Solution: Hub-and-Spoke with Linkerd Multi-Cluster

    Linkerd’s multi-cluster feature does exactly what I needed: it mirrors services from one cluster into another, making them accessible as if they were local. The service mesh handles all the mTLS authentication, routing, and connection management behind the scenes. From the application’s perspective, you’re just calling a local Kubernetes service.

    For my setup, a hub-and-spoke topology made the most sense. The internal cluster acts as the hub—it runs the Linkerd gateway and hosts the actual observability services (Loki, Mimir, and Tempo). The production and nonproduction clusters are spokes—they link to the internal cluster and get mirror services that proxy requests back through the gateway.

    The beauty of this approach is that only the hub needs to run a gateway. The spoke clusters just run the service mirror controller, which watches for exported services in the hub and automatically creates corresponding proxy services locally. No complex mesh federation, no VPN tunnels, just straightforward service-to-service communication over mTLS.

    Gateway Mode vs. Flat Network

    (Spoiler: Gateway Mode Won)

    Linkerd offers two approaches for multi-cluster communication:

    Flat Network Mode: Assumes pod networks are directly routable between clusters. Great if you have that. I don’t. My three clusters each have their own pod CIDR ranges with no interconnect.

    Gateway Mode: Routes cross-cluster traffic through a gateway pod that handles the network translation. This is what I needed, but it comes with some quirks when you’re running on-premises without a cloud load balancer.

    The documentation assumes you’ll use a LoadBalancer service type, which automatically provisions an external IP. On-premises? Not so much. I went with NodePort instead, exposing the gateway on port 30143.

    The Configuration: Getting the Helm Values Right

    Here’s what the internal cluster’s Linkerd multi-cluster configuration looks like:

    linkerd-multicluster:
      gateway:
        enabled: true
        port: 4143
        serviceType: NodePort
        nodePort: 30143
        probe:
          port: 4191
          nodePort: 30191
    
      # Grant access to service accounts from other clusters
      remoteMirrorServiceAccountName: linkerd-service-mirror-remote-access-production,linkerd-service-mirror-remote-access-nonproduction

    And for the production/nonproduction clusters:

    linkerd-multicluster:
      gateway:
        enabled: false  # No gateway needed here
    
      remoteMirrorServiceAccountName: linkerd-service-mirror-remote-access-in-cluster-local

    The Link: Connecting Clusters Without Auto-Discovery

    Creating the cluster link was where things got interesting. The standard command assumes you want auto-discovery:

    linkerd multicluster link --cluster-name internal --gateway-addresses internal.example.com:30143

    But that command tries to do DNS lookups on the combined hostname+port string, which fails spectacularly. The fix was simple once I found it:

    linkerd multicluster link \
      --cluster-name internal \
      --gateway-addresses tfx-internal.gerega.net \
      --gateway-port 30143 \
      --gateway-probe-port 30191 \
      --api-server-address https://cp-internal.gerega.net:6443 \
      --context=internal | kubectl apply -f - --context=production

    Separating --gateway-addresses and --gateway-port made all the difference.

    I used DNS (tfx-internal.gerega.net) instead of hard-coded IPs for the gateway address. This is an internal DNS entry that round-robins across all agent node IPs in the internal cluster. The key advantage: when I cycle nodes (stand up new ones and destroy old ones), the DNS entry is maintained automatically. No manual updates to cluster links, no stale IP addresses, no coordination headaches—the round-robin DNS just picks up the new node IPs and drops the old ones.

    Service Export: Making Services Visible Across Clusters

    Linkerd doesn’t automatically mirror every service. You have to explicitly mark which services should be exported using the mirror.linkerd.io/exported: "true" label.

    For the Loki gateway (and similarly for Mimir and Tempo):

    gateway:
      service:
        labels:
          mirror.linkerd.io/exported: "true"

    Once the services were exported, they appeared in the production and nonproduction clusters with an `-internal` suffix:

    loki-gateway-internal.monitoring.svc.cluster.local

    mimir-gateway-internal.monitoring.svc.cluster.local

    tempo-gateway-internal.monitoring.svc.cluster.local

    Grafana Alloy: Switching to Mirrored Services

    The final piece was updating Grafana Alloy’s configuration to use the mirrored services instead of the external URLs. Here’s the before and after for Loki:

    Before:

    loki.write "default" {
      endpoint {
        url = "https://loki.mattgerega.net/loki/api/v1/push"
        tenant_id = "production"
      }
    }

    After:

    loki.write "default" {
      endpoint {
        url = "http://loki-gateway-internal.monitoring.svc.cluster.local/loki/api/v1/push"
        tenant_id = "production"
      }
    }

    No more TLS, no more public DNS, no more reverse proxy hops. Just a direct connection through the Linkerd gateway.

    But wait—there’s one more step.

    The Linkerd Injection Gotcha

    Grafana Alloy pods need to be part of the Linkerd mesh to communicate with the mirrored services. Without the Linkerd proxy sidecar, the pods can’t authenticate with the gateway’s mTLS requirements.

    This turned into a minor debugging adventure because I initially placed the `podAnnotations` at the wrong level in the Helm values. The Grafana Alloy chart is a wrapper around the official chart, which means the structure is:

    alloy:
      controller:  # Not alloy.alloy!
        podAnnotations:
          linkerd.io/inject: enabled
      alloy:
        # ... other config

    Once that was fixed and the pods restarted, they came up with 3 containers instead of 2:

    – `linkerd-proxy` (the magic sauce)

    – `alloy` (the telemetry collector)

    – `config-reloader` (for hot config reloads)

    Checking the gateway logs confirmed traffic was flowing:

    INFO inbound:server:gateway{dst=loki-gateway.monitoring.svc.cluster.local:80}: Adding endpoint addr=10.42.5.4:8080
    INFO inbound:server:gateway{dst=mimir-gateway.monitoring.svc.cluster.local:80}: Adding endpoint addr=10.42.9.18:8080
    INFO inbound:server:gateway{dst=tempo-gateway.monitoring.svc.cluster.local:4317}: Adding endpoint addr=10.42.10.13:4317

    Known Issues: Probe Health Checks

    There’s one quirk worth mentioning: the multi-cluster probe health checks don’t work in NodePort mode. The service mirror controller tries to check the gateway’s health endpoint and reports it as unreachable, even though service mirroring works perfectly.

    From what I can tell, this is because the health check endpoint expects to be accessed through the gateway service, but NodePort doesn’t provide the same service mesh integration as a LoadBalancer. The practical impact? None. Services mirror correctly, traffic routes successfully, mTLS works. The probe check just complains in the logs.

    What I Learned

    1. Gateway mode is essential for non-routable pod networks. If your clusters don’t have a CNI that supports cross-cluster routing, gateway mode is the way to go.

    2. NodePort works fine for on-premises gateways. You don’t need a LoadBalancer if you’re willing to manage DNS.

    3. DNS beats hard-coded IPs. Using `tfx-internal.gerega.net` means I can recreate nodes without updating cluster links.

    4. Service injection is non-negotiable. Pods must be part of the Linkerd mesh to access mirrored services. No injection, no mTLS, no connection.

    5. Helm values hierarchies are tricky. Always check the chart templates when podAnnotations aren’t applying. Wrapper charts add extra nesting.

    The Result

    Telemetry now flows directly from production and nonproduction clusters to the internal observability stack through Linkerd’s multi-cluster gateway—all authenticated via mTLS, bypassing the Nginx reverse proxy entirely.

    I didn’t reduce the number of monitoring stacks (each cluster still runs Grafana Alloy for collection), but I simplified the routing by using direct cluster-to-cluster connections instead of going through the Nginx proxy layer. No more proxy hops. No more external service DNS. Just three Kubernetes clusters talking to each other the way they should have been all along.

    The full configuration is in the ops-argo and ops-internal-cluster repositories, managed via ArgoCD ApplicationSets. Because if there’s one thing I’ve learned, it’s that GitOps beats manual kubectl every single time.

  • Modernizing the Gateway

    From NGINX Ingress to Envoy Gateway

    Like any good engineer, I cannot leave well enough alone. Over the past week, I’ve been working through a significant infrastructure modernization across my home lab clusters – migrating from NGINX Ingress to Envoy Gateway and implementing the Kubernetes Gateway API. This also involved some necessary housekeeping with chart updates and a shift to Server-Side Apply for all ArgoCD-managed resources.

    Why Change?

    The timing couldn’t have been better. In November 2024, the Kubernetes SIG Network and Security Response Committee announced that Ingress NGINX will be retired in March 2026. The project has struggled with insufficient maintainer support, security concerns around configuration snippets, and accumulated technical debt. After March 2026, there will be no further releases, security patches, or bug fixes.

    The announcement strongly recommends migrating to the Gateway API, described as “the modern replacement for Ingress.” This validated what I’d already been considering – the Gateway API provides a more standardized, vendor-neutral approach with better separation of concerns between infrastructure operators and application developers.

    Envoy Gateway, being a CNCF project built on the battle-tested Envoy proxy, seemed like a natural choice for this migration. Plus, it gave me an excuse to finally move off Traefik, which was… well, let’s just say it was time for a change.

    The Migration Journey

    The migration happened in phases across my ops-argo, ops-prod-cluster, and ops-nonprod-cluster repositories. Here’s what changed:

    Phase 1: Adding Envoy Gateway

    I started by adding Envoy Gateway as a cluster tool, complete with its own ApplicationSet that deploys to clusters labeled with spydersoft.io/envoy-gateway: "true". The deployment includes:

    • GatewayClass and Gateway resources: Defined a main gateway that handles traffic routing
    • EnvoyProxy configuration: Set up with a static NodePort service for consistent external access
    • ClientTrafficPolicy: Configured to properly handle forwarded headers – crucial for preserving client IP information through the proxy chain

    The Envoy Gateway deployment lives in the envoy-gateway-system namespace and exposes services via NodePort 30080 and 30443, making it easy to integrate with my existing network setup.

    Phase 2: Migrating Applications to HTTPRoute

    This was the bulk of the work. Each application needed its Ingress resource replaced with an HTTPRoute. The new Gateway API resources are much cleaner. For example, my blog (www.mattgerega.com) went from an Ingress definition to this:

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: wp-mattgerega
      namespace: sites
    spec:
      parentRefs:
        - name: main
          namespace: envoy-gateway-system
      hostnames:
        - www.mattgerega.com
      rules:
        - matches:
            - path:
                type: PathPrefix
                value: /
          backendRefs:
            - name: wp-mattgerega-wordpress
              port: 80
    

    Much more declarative and expressive than the old Ingress syntax.

    I migrated several applications across both production and non-production clusters:

    • Gravitee API Management
    • ProGet (my package management system)
    • n8n and Node-RED instances
    • Linkerd-viz dashboard
    • ArgoCD (which also got a GRPCRoute for its gRPC services)
    • Identity Server (across test and stage environments)
    • Tech Radar
    • Home automation services (UniFi client and IP manager)

    Phase 3: Removing the Old Guard

    Once everything was migrated and tested, I removed the old ingress controller configurations. This cleanup happened across all three repositories:

    ops-prod-cluster:

    • Removed all Traefik configuration files
    • Cleaned up traefik-gateway.yaml and traefik-middlewares.yaml

    ops-nonprod-cluster:

    • Removed Traefik configurations
    • Deleted the RKE2 ingress NGINX HelmChartConfig (rke2-ingress-nginx-config.yaml)

    The cluster-resources directories got significantly cleaner with this cleanup. Good riddance to configuration sprawl.

    Phase 4: Chart Maintenance and Server-Side Apply

    While I was in there making changes, I also:

    • Bumped several Helm charts to their latest versions:
      • ArgoCD: 9.1.5 → 9.1.7
      • External Secrets: 1.1.0 → 1.1.1
      • Linkerd components: 2025.11.3 → 2025.12.1
      • Grafana Alloy: 1.4.0 → 1.5.0
      • Common chart dependency: 4.4.0 → 4.5.0
      • Redis deployments updated across production and non-production
    • Migrated all clusters to use Server-Side Apply (ServerSideApply=true in the syncOptions):
      • All cluster tools in ops-argo
      • Production application sets (external-apps, production-apps, cluster-resources)
      • Non-production application sets (external-apps, cluster-resources)

    This is a better practice for ArgoCD as it allows Kubernetes to handle three-way merge patches instead of client-side strategic merge, reducing conflicts and improving sync reliability.
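    For reference, the flag sits in each Application’s syncPolicy:

    syncPolicy:
      syncOptions:
        - ServerSideApply=true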

    Lessons Learned

    Gateway API is ready for production: The migration was surprisingly smooth. The Gateway API resources are well-documented and intuitive. With NGINX Ingress being retired, now’s the time to make the jump.

    HTTPRoute vs. Ingress: HTTPRoute is more expressive and allows for more sophisticated routing rules. The explicit parentRefs concept makes it clear which gateway handles which routes.

    Server-Side Apply everywhere: Should have done this sooner. The improved conflict handling makes ArgoCD much more reliable, especially when multiple controllers touch the same resources.

    Envoy’s configurability: The EnvoyProxy custom resource gives incredible control over the proxy configuration without needing to edit ConfigMaps or deal with annotations.

    Multi-cluster consistency: Making these changes across production and non-production environments simultaneously kept everything aligned and reduced cognitive overhead when switching between environments.

    Current Status

    All applications across all clusters are now running through Envoy Gateway with the Gateway API. Traffic is flowing correctly, TLS is terminating properly, and I’ve removed all the old ingress-related configuration from both production and non-production environments.

    The clusters are more standardized, the configuration is cleaner, and I’m positioned to take advantage of future Gateway API features like traffic splitting and more advanced routing capabilities. More importantly, I’m ahead of the March 2026 retirement deadline with plenty of time to spare.

    Now, the real question: what am I going to tinker with next?

  • ArgoCD panicked a little…

    I ran into an odd situation last week with ArgoCD, and it took a bit of digging to figure it out. Hopefully this helps someone else along the way.

    Whatever you do, don’t panic!

    Well, unless of course you are ArgoCD.

    I have a small Azure DevOps job that runs nightly and attempts to upgrade some of the Helm charts that I use to deploy external tools. This includes things like Grafana, Loki, Mimir, Tempo, ArgoCD, External Secrets, and many more. This job deploys the changes to my GitOps repositories, and if there are changes, I can manually sync.

    Why not auto-sync, you might ask? Visibility, mostly. I like to see what changes are being applied, in case there is something bigger in the changes that needs my attention. I also like to “be there” if something breaks, so I can rollback quickly.

    Last week, while upgrading Grafana and Tempo, ArgoCD started throwing the following error on sync:

    Recovered from panic: runtime error: invalid memory address or nil pointer

    A quick trip to Google produced a few different results, but nothing immediately apparent. One particular issue mentioned a problem with out-of-date resources (an old apiVersion). Let’s put a pin in that.

    Nothing was jumping out, and my deployments were still working. I had a number of other things on my plate, so I let this slide for a few days.

    Versioning….

    When I finally got some time to dig into this, I figured I would pull at that apiVersion thread and see what shook loose. Unfortunately, since the error gives no real indication of which resource is causing it, finding the offender was luck of the draw. This time, I was lucky.

    My ExternalSecret resources were using an alpha apiVersion, so my first thought was to update them to v1. Lo and behold, that fixed the two charts that were failing.

    This, however, leads to a bigger issue: if ArgoCD is not going to inform me when I have out-of-date apiVersion values for a resource, I am going to have to figure out how to validate these resources sometime before I commit the changes. I’ll put this on my ever-growing to-do list.
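    Until I find a proper tool, even a crude pre-commit scan would have caught this. Here is a minimal sketch (a hypothetical helper, not anything ArgoCD provides) that flags alpha/beta apiVersions in raw manifest text:

    ```python
    import re

    # Flag pre-GA (alpha/beta) apiVersion lines in raw manifest text.
    # Hypothetical helper for a pre-commit check, not part of ArgoCD.
    PRE_GA = re.compile(r"^apiVersion:\s*\S*(?:alpha|beta)\S*\s*$", re.MULTILINE)

    def find_pre_ga_api_versions(manifest_text):
        """Return apiVersion lines that reference alpha or beta API versions."""
        return [m.group(0).strip() for m in PRE_GA.finditer(manifest_text)]

    manifest = """\
    apiVersion: external-secrets.io/v1alpha1
    kind: ExternalSecret
    ---
    apiVersion: apps/v1
    kind: Deployment
    """
    # Flags the v1alpha1 ExternalSecret, not the apps/v1 Deployment
    print(find_pre_ga_api_versions(manifest))
    ```

    Tools like kubeconform or Pluto reportedly do this properly against real deprecation data, so they are probably the better long-term answer.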

  • Moving to Ubuntu 24.04

    I have a small home lab running a few Kubernetes clusters, and a good bit of automation to deal with provisioning servers for the K8s clusters. All of my Linux VMs are based on Ubuntu 22.04. I prefer to stick with LTS for stability and compatibility.

    As April turns into July (missed some time there), I figured Ubuntu’s latest LTS (24.04) had matured to the point that I could start the process of updating my VMs to the new version.

    Easier than Expected

    In my previous move from 20.04 to 22.04, there were some changes to the automated installers for 22.04 that forced me down the path of testing my packer provisioning with the 22.04 ISOs. I expected similar changes with 24.04. I was pleasantly surprised when I realized that my existing scripts should work well with the 24.04 ISOs.

    I did spend a little time updating the Azure DevOps pipeline that builds a base image so that it supports building both 22.04 and 24.04 images. I want to keep the option of using the 22.04 images, should I find a problem with 24.04.

    Migrating Cluster Nodes

    With a base image provisioned, I followed my normal process for upgrading cluster nodes on my non-production cluster. There were a few hiccups, mostly around some of my automated scripts that needed to have the appropriate settings to set hostnames correctly.

    Again, other than some script debugging, the process worked with minimal changes to my automation scripts and my provisioning projects.

    Azure DevOps Build Agent?

    Perhaps in a few months. I use the GitHub runner images as a base for my self-hosted agents, but there are some changes that need manual review. I destroy my Azure DevOps build agent weekly and generate a new one, and that’s a process that I need to make sure continues to work through any changes.

    The issue is typically time: the build agents take a few hours to provision because of all the tools that are installed. Testing that takes time, so I have to plan ahead. Plus, well, it is summertime, and I’d much rather be in the pool than behind the desk.

  • Automating Grafana Backups

    After a few data loss events, I took the time to automate my Grafana backups.

    A bit of instability

    It has been almost a year since I moved to a MySQL backend for Grafana. In that year, I’ve gotten a corrupted MySQL database twice now, forcing me to restore from a backup. I’m not sure if it is due to my setup or bad luck, but twice in less than a year is too much.

    In my previous post, I mentioned the Grafana backup utility as a way to preserve this data. My short-sightedness prevented me from automating those backups, however, so I suffered some data loss. After the most recent event, I revisited the backup tool.

    Keep your friends close…

    My first thought was to simply write a quick Azure DevOps pipeline to pull the tool down, run a backup, and copy it to my SAN. I would also have had to include some scripting to clean up old backups.

    As I read through the grafana-backup-tool documentation, though, I came across examples of running the tool as a Job in Kubernetes via a CronJob. This presented a neat opportunity: configure the backup job as part of the Helm chart.

    What would that look like? Well, I do not install any external charts directly. They are configured as dependencies for charts of my own. Now, usually, that just means a simple values file that sets the properties on the dependency. In the case of Grafana, though, I’ve already used this functionality to add two dependent charts (Grafana and MySQL) to create one larger application.

    This setup also allows me to add additional templates to the Helm chart to create my own resources. I added two new resources to this chart:

    1. grafana-backup-cron – A definition for the cronjob, using the ysde/grafana-backup-tool image.
    2. grafana-backup-secret-es – An ExternalSecret definition to pull secrets from Hashicorp Vault and create a Secret for the job.

    Since this is all built as part of the Grafana application, the secrets for Grafana were already available. I went so far as to add a section in the values file for the backup. This allowed me to enable/disable the backup and update the image tag easily.
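    As a sketch, the grafana-backup-cron template boils down to something like this (the schedule, values key, and secret name here are placeholders from my own layout, not chart defaults):

    ```yaml
    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: grafana-backup-cron
    spec:
      schedule: "0 3 * * *"   # daily at 03:00
      jobTemplate:
        spec:
          template:
            spec:
              restartPolicy: OnFailure
              containers:
                - name: grafana-backup
                  image: ysde/grafana-backup-tool:{{ .Values.backup.imageTag }}
                  envFrom:
                    # Populated from Vault by the grafana-backup-secret-es ExternalSecret
                    - secretRef:
                        name: grafana-backup-secret
    ```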

    Where to store it?

    The other enhancement I noticed in the backup tool was the ability to store files in S3-compatible storage. In fact, their example showed how to connect to a MinIO instance. As fate would have it, I have a MinIO instance running on my SAN already.

    So I configured a new bucket in my MinIO instance, added a new access key, and configured those secrets in Vault. After committing those changes and synchronizing in ArgoCD, the new resources were there and ready.

    Can I test it?

    Yes I can. Google, once again, pointed me to a way to create a Job from a CronJob:

    kubectl create job --from=cronjob/<cronjob-name> <job-name> -n <namespace-name>

    I ran the above command to create a test job. And, voilà, I had backup files in MinIO!

    Cleaning up

    Unfortunately, there doesn’t seem to be a retention setting in the backup tool. It looks like I’m going to have to write some code to clean up my Grafana backups bucket, especially since I have daily backups scheduled. Either that, or look at this issue and see if I can add it to the tool. Maybe I’ll brush off my Python skills…
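    Until then, the retention logic itself is simple enough to sketch in Python. This assumes the backup object names embed a YYYYMMDD stamp (roughly what timestamped archive names look like; adjust the pattern to match yours), and leaves the actual bucket listing/deletion to whatever MinIO client you prefer:

    ```python
    import re
    from datetime import datetime, timedelta

    # Assumes a YYYYMMDD stamp somewhere in each backup object's name.
    DATE_RE = re.compile(r"(\d{8})")

    def backups_to_delete(names, keep_days, today):
        """Return the backup object names whose embedded date is older than keep_days."""
        cutoff = today - timedelta(days=keep_days)
        stale = []
        for name in names:
            m = DATE_RE.search(name)
            if m and datetime.strptime(m.group(1), "%Y%m%d") < cutoff:
                stale.append(name)
        return stale

    names = ["20240601-grafana.tar.gz", "20240628-grafana.tar.gz"]
    # With a 14-day window on 2024-06-30, only the June 1st backup is stale
    print(backups_to_delete(names, keep_days=14, today=datetime(2024, 6, 30)))
    ```

    MinIO’s bucket lifecycle rules (via `mc ilm`) may also be able to expire old objects without any code at all, which would make a script like this unnecessary.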

  • My Introduction to Kubernetes NetworkPolicy

    The Bitnami Redis Helm chart has thrown me a curve ball over the last week or so, and made me look at Kubernetes NetworkPolicy resources.

    Redis Chart Woes

    Bitnami seems to be updating their charts to include default NetworkPolicy resources. While I don’t mind this, a jaunt through their open issues suggests that it has not been a smooth transition.

    The redis chart’s initial release of NetworkPolicy objects broke the metrics container, since the default NetworkPolicy didn’t include the metrics port in the allowed ingress ports.

    So I sat on the old chart until the new Redis chart was available.

    And now, Connection Timeouts

    Once the update was released, I rolled out the new version of Redis. The containers came up, and I didn’t really think twice about it. Until, that is, I decided to do some updates to both my applications and my Kubernetes nodes.

    I upgraded some of my internal applications to .NET 8. This caused all of them to restart, and, in the process, get their linkerd-proxy sidecars running. I also started cycling the nodes on my internal cluster. When it came time to call my Unifi IP Manager API to delete an old assigned IP, I got an internal server error.

    A quick check of the logs showed that the pod’s Redis connection was failing. Odd, I thought, since most other connections have been working fine, at least through last week.

    After a few different Google searches, I came across this section in the Linkerd.io documentation. As it turns out, when you use NetworkPolicy resources and opaque ports (like Redis), you have to make sure that Linkerd’s inbound port (which defaults to 4143) is also allowed in the NetworkPolicy.

    Adding the Linkerd port to the extraIngress section in the Redis Helm chart worked wonders. With that section in place, connectivity was restored and I could go about my maintenance tasks.
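    For reference, the change amounted to a small addition to the Redis chart’s values, something like the following (4143 being Linkerd’s default inbound port; the surrounding keys reflect the Bitnami layout as I understand it):

    ```yaml
    networkPolicy:
      enabled: true
      extraIngress:
        # Allow traffic to the linkerd-proxy inbound port so meshed
        # clients can reach Redis through the sidecar.
        - ports:
            - port: 4143
              protocol: TCP
    ```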

    NetworkPolicy for all?

    Maybe. This is my first exposure to them, so I would like to understand how they operate and what best practices are for such things. In the meantime, I’ll be a little more wary when I see NetworkPolicy resources pop up in external charts.

  • A Tale of Two Proxies

    I am working on building a set of small reference applications to demonstrate some of the patterns and practices to help modernize cloud applications. In configuring all of this in my home lab, I spent at least 3 hours fighting a problem that turned out to be a configuration issue.

    Backend-for-Frontend Pattern

    I will get into more details when I post the full application, but I am trying to build out a SPA with a dedicated backend API that would host the SPA and take care of authentication. As is typically the case, I was able to get all of this working on my local machine, including the necessary proxying of calls via the SPA’s development server (again, more on this later).

    At some point, I had two containers ready to go: a BFF container hosting the SPA and the dedicated backend, and an API container hosting a data service. I felt ready to deploy to the Kubernetes cluster in my lab.

    Let the pain begin!

    I have enough samples within Helm/Helmfile that getting the items deployed was fairly simple. After fiddling with the settings of the containers, things were running well in the non-authenticated mode.

    However, when I clicked login, the following happened:

    1. I was redirected to my oAuth 2.0/OIDC provider.
    2. I entered my username/password
    3. I was redirected back to my application
    4. I got a 502 Bad Gateway screen

    502! But, why? I consulted Google and found any number of articles indicating that, in the authentication flow, Nginx’s default header size limits are too small for what might be coming back from the redirect. So, consulting the Nginx configuration documents, I changed the Nginx configuration in my reverse proxy to allow for larger headers.

    No luck. Weird. In the spirit of true experimentation (change one thing at a time), I backed those changes out and tried changing the configuration of my Nginx Ingress controller. No luck. So what’s going on?

    Too Many Cooks

    My current implementation looks like this:

    flowchart TB
        A[UI] --UI Request--> B(Nginx Reverse Proxy)
        B --> C("Kubernetes Ingress (Nginx)")
        C --> D[UI Pod]
    

    There are two Nginx instances between all of my traffic: an instance outside of the cluster that serves as my reverse proxy, and an Nginx ingress controller that serves as the reverse proxy within the cluster.

    I tried changing both separately. Then I tried changing both at the same time. And I was still seeing this error. As it turns out, I was also working from some bad data.

    Be careful what you read on the Internet

    In the end, the issue was the difference in configuration between the two Nginx instances, compounded by some bad configuration values I had picked up from old internet articles.

    Reverse Proxy Configuration

    For the Nginx instance running on Ubuntu, I added the following to my nginx.conf file under the http section:

            proxy_buffers 4 512k;
            proxy_buffer_size 256k;
            proxy_busy_buffers_size 512k;
            client_header_buffer_size 32k;
            large_client_header_buffers 4 32k;

    Nginx Ingress Configuration

    I am running RKE2 clusters, so configuring Nginx involves a HelmChartConfig resource being created in the kube-system namespace. My cluster configuration looks like this:

    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: rke2-ingress-nginx
      namespace: kube-system
    spec:
      valuesContent: |-
        controller:
          kind: DaemonSet
          daemonset:
            useHostPort: true
          config:
            use-forwarded-headers: "true"
            proxy-buffer-size: "256k"
            proxy-buffers-number: "4"
            client-header-buffer-size: "256k"
            large-client-header-buffers: "4 16k"
            proxy-body-size: "10m"

    The combination of both of these settings got my redirects to work without the 502 errors.

    Better living through logging

    One of the things I fought with on this was finding the appropriate logs to see where the errors were occurring. I’m exporting my reverse proxy logs into Loki using a Promtail instance that listens on a syslog port. So I am “getting” the logs into Loki, but I couldn’t FIND them.

    I forgot about the syslog facility: I had the access logs sending as local5, but I configured the error logs without pointing them to local5. I learned that, by default, they go to local7.
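    The fix is a one-line change in nginx.conf; error_log accepts the same syslog parameters as access_log, so the facility can be set explicitly (the server address here is a placeholder):

    ```nginx
    # Send nginx error logs to the same syslog facility as the access
    # logs, instead of the default local7.
    error_log syslog:server=192.168.1.10:514,facility=local5 warn;
    ```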

    Once I found the logs I was able to diagnose the issue, but I spent a lot of time browsing in Loki looking for those logs.

  • Re-configuring Grafana Secrets

    I recently fixed some synchronization issues that had been silently plaguing some of the monitoring applications I had installed, including my Loki/Grafana/Tempo/Mimir stack. Now that the applications are being updated, I ran into an issue with the latest Helm chart’s handling of secrets.

    Sync Error?

    After I made the change to fix synchronization of the Helm charts, I went to sync my Grafana chart, but received a sync error:

    Error: execution error at (grafana/charts/grafana/templates/deployment.yaml:36:28): Sensitive key 'database.password' should not be defined explicitly in values. Use variable expansion instead.

    I certainly didn’t change anything in those files, and I am already using variable expansion in the values.yaml file anyway. What does that mean? Basically, in the values.yaml file, I used ${ENV_NAME} in areas where I had a secret value, and told Grafana to expand environment variables into the configuration.

    The latest version of the chart doesn’t seem to like this. It treats ANY value in a sensitive field as bad. A search of the Grafana Helm chart repo’s issues list yielded someone with a similar issue and a comment with a link to another comment that is the recommended solution.

    Same Secret, New Name

    After reading through the comment’s suggestion and Grafana’s documentation on overriding configuration with environment variables, I realized the fix was pretty easy.

    I already had a Kubernetes secret being populated from Hashicorp Vault with my secret values. I also already had envFromSecret set in the values.yaml to instruct the chart to use my secret. And, through some dumb luck, two of the three values were already named using the standards in Grafana’s documentation.

    So the “fix” was to simply remove the secret expansions from the values.yaml file, and rename one of the secretKey values so that it matched Grafana’s environment variable template. You can see the diff of the change in my Github repository.
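    Concretely, the convention is GF_&lt;SECTION&gt;_&lt;KEY&gt;: an environment variable named that way overrides the matching grafana.ini setting. So the values file just points at the secret, and the secret’s key names do the rest (the secret name here mirrors my setup, not a requirement):

    ```yaml
    # values.yaml: inject every key of the Vault-backed secret as env vars
    envFromSecret: grafana-secrets

    # The Secret then carries keys following Grafana's naming convention:
    #   GF_DATABASE_PASSWORD        -> [database] password in grafana.ini
    #   GF_SECURITY_ADMIN_PASSWORD  -> [security] admin_password
    ```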

    With that change, the Helm chart generated correctly, and once Argo had the changes in place, everything was up and running.