• The trip to Windows 11

    The past five weeks have been a blur. Spring soccer is in full swing, and my time at the keyboard has been limited mostly to work. A PC upgrade at work started a trickle-down of hardware, and with things starting to settle, it’s probably worth a few notes.

    Swapping out the work laptop

    My old work laptop, a Dell Precision 7510, was about a year over our typical 4 year cycle on hardware. I am usually not one to swap right at 4 years: if the hardware is working for me, I’ll run it till it dies.

    Well, I was clearly killing the 7510: every so often I would get random shutdowns triggered by the thermal protection framework. In other words, to quote Ricky Bobby, “I’m on fire!!!” As the shutdowns became more frequent, I put in a request for a new laptop. Additionally, since my 7510 was so old, it couldn’t be redistributed, so I requested to buy it back. My personal laptop, an HP Envy 17, is approaching the 11-year mark, so I believe I’ve used it enough.

    I was provisioned a Dell Precision 7550. As it came to me clean with Windows 10, I figured I’d upgrade it to Windows 11 before I reinstalled my apps and tools. That way, if I broke it, I would know before I wasted my time. The upgrade was pretty easy, and outside of some random issues with MS Teams and audio setup, I was back at work with minimal disruption.

    On to the “new old” laptop

    My company approved the buyback of my old Precision 7510, but I did have to ship it back to have it wiped and prepped. Once I got it back, I turned it on and….. boot media failure.

    Well, crap. As it turns out, the M.2 SSD died on the 7510. So, off to Amazon to pick up a 1TB replacement. A day later, new M.2 in hand, I was off to the races.

    I put in my USB drive with Windows 11 media, booted, and got the wonderful message that my hardware did not meet requirements. As it turns out, my TPM module was on an old firmware version (1.2 instead of the required 2.0), and I was running an old processor that is not on the officially supported list for Windows 11.

    So I installed Windows 10 and tried to boot, but, well, that just failed. As it turns out, the BIOS was set up for legacy boot with Secure Boot disabled, both of which I needed to change for Windows 11. And, after changing the BIOS, my installed copy of Windows 10 wouldn’t boot. I suppose I could have taken the time to work on the boot loader to get it working… but I just re-installed Windows 10 again.

    So after the second install of Windows 10, I was able to update the TPM Firmware to 2.0, bypass the CPU check, and install Windows 11. I started installing my standard library of tools, and things seemed great.

    Still on fire

    Windows 11, however, seems to have exacerbated the overheating issue. It came to a head when I tried to install Jetbrains Rider: every install caused the machine to overheat.

    I found a few different solutions in separate forums, but the working solution was to disable the “Turbo Boost” setting in the BIOS. My assumption is that it is some sort of overclocking, but removing that setting has stabilized my system.

    Impressions

    So far, so good with Windows 11. Of all the changes, the ability to match the contents of each monitor’s taskbar to the windows on that monitor in a multi-monitor setup is great, but I am still getting used to it. Organizationally it makes sense, but muscle memory is hard to break.

  • GitOps: Moving from Octopus to … Octopus?

    Well, not really, but I find it a bit cheeky that ArgoCD’s icon is, in fact, an orange octopus.

    There are so many different ways to provision and run Kubernetes clusters that, without some sense of standardization across the organization, Kubernetes can become an operations nightmare. And while a well-run operations environment can allow application developers to focus on their applications, a lack of standards and best practices for container orchestration can divert resources from application development to figuring out how a particular cluster was provisioned or deployed.

    I have spent a good bit of time at work trying to consolidate our company approach to Kubernetes. Part of this is working with our infrastructure teams to level set on what a “bare” cluster looks like. The other part is standardizing how the clusters are configured and managed. For this, a colleague turned me onto GitOps.

    GitOps – Declarative Management

    I spent a LOT of time just trying to understand how all of this works. For me, the thought of “well, in order to deploy I have to change a Git repo” seemed, well, icky. I like the idea that when code changes, things get built, tested, and deployed. I had a difficult time separating application code from declarative state. Once I removed that mental block, however, I was bound and determined to move my home environments from Octopus Deploy pipelines to ArgoCD.

    I did not go on this journey alone. My colleague Oleg introduced me to ArgoCD and the External Secrets operator. I found a very detailed article with flow details by Florian Dambrine on Medium.com. Likewise, Burak Kurt showed me how to have Argo manage itself. Armed with all of this information, I started my journey.

    The “short short” version

    In an effort to retain the few readers I have, I will do my best to consolidate all of this into a few short paragraphs. I have two types of applications in my clusters: external tools and home-grown apps.

    External Tools

    Prior to all of this, my external tools were managed by running Helm upgrades periodically with values files saved to my PC. These tools included “cluster tools,” which are installed on every cluster, and then things like WordPress and Proget for running the components of this site. Needless to say, this was highly dangerous: had I lost my PC, all those values files would be gone.

    I now have tool definitions stored either in the cluster’s specific Git repo, or in a special “cluster-tools” repository that allows me to define applications that I want installed to all (or some) of my clusters, based on labels in the cluster’s Argo secret. This allows me to update tools by updating the version in my Git repository and committing the change.

    It should be noted that, for these tools, Helm is still used to install/upgrade. More on this later.
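    To make that a little more concrete, here is a rough sketch of how such a “cluster-tools” definition can be expressed with an ArgoCD ApplicationSet and its cluster generator. The repository URL, label, project name, and chart version below are placeholders for illustration, not my actual configuration:

    apiVersion: argoproj.io/v1alpha1
    kind: ApplicationSet
    metadata:
      name: ingress-nginx
      namespace: argocd
    spec:
      generators:
        # One Application is generated per registered cluster whose Argo
        # cluster secret carries the matching label.
        - clusters:
            selector:
              matchLabels:
                tools/ingress-nginx: "enabled"
      template:
        metadata:
          name: '{{name}}-ingress-nginx'
        spec:
          project: cluster-tools
          source:
            repoURL: https://kubernetes.github.io/ingress-nginx
            chart: ingress-nginx
            targetRevision: 4.2.3      # bumping this version in Git is the "upgrade"
          destination:
            server: '{{server}}'
            namespace: ingress-nginx
          syncPolicy:
            automated:
              prune: true
            syncOptions:
              - CreateNamespace=true

    With something like this in place, committing a new targetRevision is all it takes for Argo to roll the new chart version out to every matching cluster.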

    Home Grown Applications

    The home-grown apps had more of a development feel: feature branches push to a test environment, while builds from main get pushed to staging and then, upon my approval in Azure DevOps, pushed to production.

    Prior to the conversion, every build produced a new container image and Helm chart, both of which were published to Proget. From there, Octopus Deploy took care of deploying feature builds to the test environment only, and deployed to stage and production based on nudges from Azure DevOps tasks.

    Using Florian’s described flow, I created a Helmfile repo for each of my projects, which allowed me to consolidate the application charts into a single repository. Using Helmfile and Helm, I generate manifests that are then committed into the appropriate cluster’s Git repository. Each Helmfile repository has its own pipeline for generating manifests and committing them to the cluster repository, but my individual project pipelines have gotten very simple: build a new version, and change the Helmfile repository to reflect the new image version.
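    As a rough illustration of that flow, a per-project helmfile.yaml looks something like the sketch below; the release name, chart path, values files, and image tag are placeholders rather than my real projects:

    environments:
      test: {}
      stage: {}
      production: {}
    ---
    releases:
      - name: my-api
        namespace: my-api
        chart: ./charts/my-api                   # the application chart lives alongside the Helmfile
        values:
          - values/{{ .Environment.Name }}.yaml  # per-environment overrides
        set:
          - name: image.tag
            value: "1.4.2"                       # the build pipeline bumps this after each build

    The Helmfile repository’s pipeline then runs something along the lines of helmfile -e stage template to render the manifests and commits the output into the target cluster’s Git repository.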

    Once the cluster’s repository is updated, Argo takes care of syncing the resources.

    Helm versus Manifests

    I noted that I’m currently using Helm for external tools versus generating manifests (albeit from Helm charts, but still generating manifests) for internal applications. As long as you never actually run a helm install command, Argo will manage the application using manifests generated from the Helm chart. From what I have seen, though, if you have previously run helm install, that Helm release hangs around in the cluster, and its history doesn’t change as new versions are deployed. So you can get into an odd state where helm list shows older versions than what is actually installed.

    When using a tool like ArgoCD, you want to let it manage your applications from beginning to end. It will keep your cluster cleaner. For the time being, I am defining external tools using Helm templates, but using Helmfile to expand my home-built applications into manifest files.

  • Tech Tip – Turn on forwarded headers in Nginx

    I have been using Nginx as a reverse proxy for some time. In the very first iteration of my home lab, it lived on a VM and allowed me to point my firewall rules to a single target, and then route traffic from there. It has since been promoted to a dedicated Raspberry Pi in my fight with the network gnomes.

    My foray into Kubernetes in the home lab has brought Nginx in as an ingress controller. While there are many options for ingress, Nginx seems the most prevalent and, in my experience, the easiest to standardize on across a multitude of Kubernetes providers. As we drive to define what a standard K8s cluster looks like across our data centers and public cloud providers, Nginx seemed like a natural choice for our ingress provider.

    Configurable to a fault

    The Nginx Ingress controller is HIGHLY configurable. There are cluster-wide configuration settings that can be controlled through ConfigMap entries. Additionally, annotations can be used on specific ingress objects to control behavior for an individual ingress.

    As I worked with one team to set up Duende’s Identity Server, we started running into issues with the identity server using http instead of https in its discovery endpoints (such as /.well-known/openid-configuration). Most of our research suggested that the X-Forwarded-* headers needed to be configured (which we did), but we were still seeing the wrong scheme in those endpoints.

    It was a weird problem: I had never run into this issue in my own Identity Server instance, which is running in my home Kubernetes environment. I figured it had to do with an Nginx setting, but had a hard time figuring out which one.

    One blog post pointed me in the right direction. Our Nginx ingress install did not have the use-forwarded-headers setting configured in the ConfigMap, which meant the X-Forwarded-* headers were not being passed to the pod. A quick change of our deployment project, and the openid-configuration endpoint returned the appropriate schemes.

    For reference, we are using the ingress-nginx helm chart. Adding the following to our values file solved the issue:

    controller:
      replicaCount: 2
      
      service:
        ... several service settings
      config:
        use-forwarded-headers: "true"

    Investigation required

    What I do not yet know is whether I randomly configured this at home and just forgot about it, or whether it is a default of the Rancher Kubernetes Engine (RKE) installer. I use RKE at home to stand up my clusters, and one of the add-ons I have it configure is the Nginx ingress. Either I have settings in my RKE configuration to forward headers, or it’s a default of RKE… Unfortunately, I am at a soccer tournament this weekend, so the investigation will have to wait until I get home.

    Update:

    Apparently I did know about use-forwarded-headers earlier: it was part of the options I had set in my home Kubernetes clusters. One of many things I have forgotten.

  • A blast from the past and a new path forward

    Over the last few years, the pandemic has thrown my eldest son’s college search for a bit of a loop. It’s difficult to talk about visiting college campuses when colleges are just trying to figure out how to keep their current students in the classroom. With that in mind, much of his search has been virtual.

    As campuses open up and graduation looms, though, we have had the opportunity to set up some visits with his top choice schools. One of them, to my great pride, is my alma mater, Allegheny College. So I spent my President’s Day, an unseasonably warm and sunny February day, walking the paths and hallways of Allegheny.

    It was a weird experience.

    There is no way that I can say that my experience at Allegheny defined who I am today. I am a product of 40+ years of experience across a variety of schools, companies, organizations, and relationships, and to categorize the four years of my college experience as the defining years of my life would be unfair to those other experiences. But my four years at Allegheny were a unique chapter in my life, one that encouraged a level of self-awareness and helped me learn to interact with the world around me.

    For me, the college experience was less about the education and more about the life experience. That’s not to say I did not learn anything at Allegheny; far from it. But the situations and experiences I found myself in forced me to learn more than just what was in my textbooks.

    What kinds of experiences? Carrying a job in residence life for two years as an advisor and director taught me a lot about teamwork, leadership, and dealing with people on a day-to-day basis. Fraternity life and Panhellenic/Interfraternity Council taught me a good deal about small-group politics and the power of persuasion. Campus life, in general, gave me the opportunity to learn to be an adult in a much safer environment than the real world tends to offer an 18-year-old high school graduate.

    College is not for everyone. The fact that news stories like this one pop up in my LinkedIn feed pretty regularly is a testament to the changing perspective on a four-year degree. What caught my eye from that article, however, is that there is some research to suggest that some schools are, in fact, better than others. It raises the question: is college right for my son, a prospective computer science student?

    Being back on campus with a prospective Computer Science student allowed me to get a look at what Allegheny’s CS department is doing to prepare its students for the outside world. I was impressed. While the requisite knowledge of a CS degree remains, they have augmented the curriculum with more group work and assisted (pair) programming, and branched into new areas such as data analytics and software innovation. Additionally, they encourage responsible computer science practices with some assistance from the Mozilla Foundation’s Responsible Computer Science Challenge. This focus will certainly give students an advantage over graduates of more theory-heavy programs.

    As I got an overview of the CS curriculum, it occurred to me that I can, and should, be doing more to help guide the future of our industry. At work, I can do that through mentoring and knowledge sharing, but, as an alumnus, I can provide similar mentoring and knowledge sharing, as well as some much needed networking to young students. Why? I never want to be the smartest one in the room.

    I was never the smartest guy in the room. From the first person I hired, I was never the smartest guy in the room. And that’s a big deal. And if you’re going to be a leader – if you’re a leader and you’re the smartest guy in the world – in the room, you’ve got real problems.

    Jack Welch

  • Tech Tip – Markdown Linting in VS Code

    With the push toward better documentation, it is worth remembering that Visual Studio Code has a variety of extensions that can help with linting and formatting all types of files, including your README.md files.

    Markdown All in One and markdownlint are my current extensions of choice, and they have helped me clean up my README.md files in both personal and professional projects.
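    If you want the linter to follow your own conventions, markdownlint can also pick up a configuration file from the workspace. A minimal, purely hypothetical .markdownlint.yaml might look like this (the rule choices are examples, not the ones I actually use):

    default: true       # start with every rule enabled
    MD013: false        # don't enforce a maximum line length
    MD033:              # allow a few inline HTML elements
      allowed_elements:
        - br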

  • Not everything you read on the internet is true….

    I spend a lot of time searching the web for tutorials, walkthroughs, and examples. So much so, in fact, that “Google Search” could be listed as a top skill on my resume. With that in mind, though, it’s important to remember that not everything you read on the Internet is true, and to take care in how you approach things.

    A simple plan

    I was trying to create a new ASP.NET / .Net 6 application that I could use to test connectivity to various resources in some newly configured Kubernetes clusters. When I used the Visual Studio 2022 templates, I noticed that the new “minimal” styling was used in the template. This was my first opportunity to try the new minimal styling, so I looked to the web for help. I came across this tutorial on Medium.com.

    I followed the tutorial and, on my local machine, it worked like a charm. So I built a quick container image and started deploying into my Kubernetes cluster.

    Trouble brewing

    When I deployed into my cluster, I kept receiving SQL connection errors, specifically that the server could not be found. I added some logging, and the connection string seemed correct, but nothing was working.

    I thought it might be a DNS error within the cluster, so I spent at least two hours trying to determine if there was a DNS issue. I even tried another cluster to see if the problem had to do with our custom subnet settings in the original cluster.

    After a while, I figured out the problem, and was about ready to quit my job. The article has a step to override the OnConfiguring method in the DbContext, like this:

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // Builds a configuration from appsettings.json only; environment
        // variables and any other configuration sources are never consulted.
        var configuration = new ConfigurationBuilder()
            .SetBasePath(Directory.GetCurrentDirectory())
            .AddJsonFile("appsettings.json")
            .Build();

        var connectionString = configuration.GetConnectionString("AppDb");
        optionsBuilder.UseSqlServer(connectionString);
    }

    I took this as necessary and put it in there. But what I glossed over was that this override runs every time the context is configured, not just when migrations happen. And the configuration it builds is ONLY pulling from appsettings.json, not from environment variables or any other configuration sources.

    That “Oh……” moment

    At that point I realized that I had been troubleshooting the fact that this particular function was ignoring the connection string I was passing in using environment variables. To make matters worse, the logging I added that printed the connection string was using the full configuration (with environment variables), so it looked like the value was changing. In reality, it was the value from appSettings.json the whole time.

    The worst part: that override is completely unnecessary. I removed the entire function, and my code operates as normal.

    No Hard Feelings

    Let me be clear, though: as written, the article does a fine job of walking you through the process of setting up an ASP.NET / .Net 6 application in the minimal API styling. My issue was not recognizing how that OnConfiguring override would act in the container, and then bouncing around everywhere to figure it out.

    I will certainly be more careful about examining the tutorial code before traipsing around in Kubernetes command lines and DNS tools.

  • Can Yellowstone teach us about IT infrastructure management?

    It seems almost too fitting that, at a time when the popularity of gritty television like Yellowstone and 1883 is climbing, I write to encourage you to stop taking on new pets and to start running a cattle ranch.

    Pet versus Cattle – The IT Version

    The “pet versus cattle” analogy is often used to describe different methodologies for IT management. You treat a pet by giving it a name like Zeus or Apollo. You give it a nice home. You nurse it back to health when it’s sick. Cattle, on the other hand, get an ear tag with a number and are sent to roam the field until they are needed.

    Carrying this analogy into IT, I have seen my share of pet servers. We build them up, nurse them back to health after a virus, upgrade them when needed, and do all the things a good pet owner would do. And when these servers go down, people notice. “Cattle” servers, on the other hand, are quickly provisioned and just as quickly replaced, often without any maintenance or downtime.

    Infrastructure as Code

    In its most basic definition, infrastructure as code is the idea that an infrastructure can be described in a file (preferably source-controlled). Using various tools, we can take this definition and create the necessary infrastructure components automatically.

    Why do we care if we can provision infrastructure from code? Treating our servers as cattle requires a much better management structure than “have Joe stand this up for us.” Sure, Joe is more than capable of creating and configuring all of the necessary components, but if we want to do it again, it requires more of Joe’s time.

    With the ability to define an infrastructure one time and deploy it many times, we gain the capacity to worry more about what is running on the infrastructure than the infrastructure itself.

    Putting ideas to action

    Changing my mindset from “pet” to “cattle” with virtual machines on my home servers has been tremendously valuable for me. As I mentioned in my post about Packer.io, you become much more risk tolerant when you know a new VM is 20 unattended minutes away.

    I have started to encourage our infrastructure team to accept nothing less than infrastructure as code for new projects. In order to lead by example, I have been working through creating Azure resources for one of our new projects using Terraform.

    Terraform: Star Trek or Nerd Tech?

    You may be asking: “Aren’t they both nerd tech?” And, well, you are correct. But when you have a group of people who grew up on science fiction and are responsible for keeping computers running, well, things get mixed up.

    Terraform, the HashiCorp product, is one of a number of tools that allow infrastructure engineers to automatically provision environments to host their products using various providers. I have been using its Azure Resource Manager provider, although there are more than a few available.

    While I cannot share the project code, here is what I was able to accomplish with Terraform:

    • Create and configure an Azure Kubernetes Service (AKS) instance
    • Create and configure an Azure SQL Server with multiple databases
    • Attach these instances to an existing virtual network subnet
    • Create and configure an Azure Key Vault
    • Create and configure a public IP address from an existing prefix.

    Through more than a few rounds of destroy and recreate, the project is in a state where it is ready to be deployed in multiple environments.

    Run a ranch

    So, perhaps, Yellowstone can teach us about IT Infrastructure management. A cold and calculated approach to infrastructure will lead to a homogenous environment that is easy to manage, scale, and replace.

    For what it’s worth, 1883 is simply a live-action version of Oregon Trail…

  • A little open source contribution

    The last month has been chock full of things I cannot really post about publicly, namely, performance reviews and security remediations. And while the work front has not been kind to public posts, I have taken some time to contribute back a bit more to the Magic Mirror project.

    Making ToDo Better

    Thomas Bachmann created MMM-MicrosoftToDo, a plugin for the Magic Mirror that pulls tasks from Microsoft’s ToDo application. Since I use that app for my day to day tasks, it would be nice to see my tasks up on the big screen, as it were.

    Unfortunately, the plugin used the old beta version of the APIs, as well as the old request module, which has been deprecated. So I took the opportunity to fork the repo and make some changes. I submitted a pull request to the owner; hopefully it makes its way into the main plugin. But, for now, if you want my changes, check them out here.

    Making StatusPage better

    I also took the time to improve on my StatusPage plugin, adding the ability to ignore certain components and removing components from the list when they are a part of an incident. I also created a small enhancement list for some future use.

    With the holidays and the rest of my “non-public” work taking up my time, I would not expect too much from me for the rest of the year. But I’ve been wrong before…

  • Git, you are messing with my flow!

    The variety of “flows” for developing using Git makes choosing the right one for your team difficult. When you throw true continuous integration and delivery into that, and add a requirement for immutable build objects, well…. you get a heaping mess.

    Infrastructure As Code

    Some of my recent work to help one of our teams has been creating a Terraform project and Azure DevOps pipeline based on a colleague’s work in standardizing Kubernetes cluster deployments in AKS. This work will eventually get its own post, but suffice to say that I have a repository which is a mix of Terraform files, Kubernetes manifests, Helm value files, and Azure DevOps pipeline definitions to execute them all.

    As I started looking into this, it occurred to me that there really is no clear way to manage this project. For example, do I use a single pipeline definition with stages for each environment (dev/stage/production)? This would mean the Git repo would have one and only one branch (main), and each stage would need some checks (manual or automatic) to ensure the rollout is controlled.
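    For that first option, a sketch of what I have in mind is below, assuming a multi-stage Azure DevOps pipeline where approvals are configured as checks on the environments. The stage names, environment names, and Terraform steps are placeholders, not the actual project:

    trigger:
      branches:
        include:
          - main

    pool:
      vmImage: ubuntu-latest

    stages:
      - stage: Dev
        jobs:
          - deployment: ApplyDev
            environment: k8s-dev        # no checks, so this rolls out on every commit to main
            strategy:
              runOnce:
                deploy:
                  steps:
                    - script: terraform init && terraform apply -auto-approve -var-file=dev.tfvars

      - stage: Production
        dependsOn: Dev
        jobs:
          - deployment: ApplyProd
            environment: k8s-prod       # a manual approval check on this environment gates the rollout
            strategy:
              runOnce:
                deploy:
                  steps:
                    - script: terraform init && terraform apply -auto-approve -var-file=prod.tfvars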

    Alternatively, I can have an Azure Pipeline for each environment. This would mean that each pipeline could trigger on its own branch. However, it also means that no standard Git Flow seems to work well with this.

    Code Flows

    Quite separately, as the team dove into creating new repositories, the question again came up around branching strategy and subsequent environment strategies for CI/CD. I am a vocal proponent of immutable build objects, but how each team chooses to get there is up to them.

    In some MS Teams channel discussions, we found pros and cons to nearly all of our current methods, and the team seems stuck on the best way to build and develop.

    The Internet is not helping

    Although it is a few years old, Scott Shipp’s War of the Git Flows article highlights the ongoing “flow wars.” One of the problems with Git is the ease with which all of these variations can be implemented. I am not blaming Git, per se, but because it is easy for nearly anyone to suggest a branching strategy, things get muddled quickly.

    What to do?

    Unfortunately, as with many things in software, there is no one right answer. The branch strategy you use depends on the requirements of not just the team, but the individual repository’s purpose. Additional requirements may come from the desired testing, integration, and deployment process.

    With that in mind, I am going to make two slightly radical suggestions:

    1. Select a flow that is appropriate for the REPOSITORY, not your team.
    2. Start backwards: identify the requirements for your deployment process first. Answer questions like these:
    • How will artifacts, once created, be tested?
    • Will artifacts progress through lower environments?
    • Do you have to support old releases?
    • Where (what branch) can release candidate artifacts come from? Or, what code will ultimately be applied to production?

    That last one is vital: defining on paper which branches will either be applied directly to production (in the case of Infrastructure as Code) or generate artifacts that can end up in production (in the case of application builds) will help outline the requirements for the project’s branching strategy.

    Once you have identified that, the next step is defining the team’s requirements WITHIN the above. In other words, try not to have the team hammer a square peg into a round hole. They have a branch or two that will generate release code; how they get their code into that branch should be as quick and direct as possible while still supporting the necessary collaboration.

    What’s next

    What’s that mean for me? Well, for the infrastructure project, I am leaning towards a single pipeline, but I need to talk to the Infrastructure team to make sure they agree. As to the software team, I am going to encourage them to apply the process above for each repository in the project.

  • ISY and the magic network gnomes

    For nearly 2 years, I struggled mightily with communication issues between my ISY 994i and some of my docker images and servers. So much, in fact, that I had a fairly long running post in the Universal Devices forums dedicated to the topic.

    I figure it is worth a bit of a rehash here, if only to raise the issue in the hopes that some of my more network-experienced contacts can suggest a fix.

    The Beginning

    The initial post was essentially about my ASP.NET Core API (.NET Core 2.2 at the time) not being able to communicate with the ISY’s REST API. You can read through the initial post for details, but, basically, it would hit it once, then time out on subsequent requests.

    It would seem that some time between my original post and the administrator’s reply, I set the container’s networking to host and the problem went away.
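    For reference, the change amounted to something like the snippet below, assuming the API was running under docker-compose at the time; the service name and image are placeholders for my API container:

    services:
      home-api:
        image: registry.example.com/home-api:latest
        network_mode: host    # share the host's network stack instead of the Docker bridge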

    In retrospect, I had not been heavily using that API anyway, so it may have just been hidden a bit better by the host network. In any case, I ignored it for a year.

    The Return

    About twenty (that’s right, 20) months later, I started moving my stuff to Kubernetes, and the issue reared its ugly head. I spent a lot of time trying to get some debug information from the ISY, which only confused me more.

    As I dug more into when it was happening, it occurred to me that I could not reliably communicate with the ISY from any of the VMs on my HP ProLiant server. Also, and more puzzling, I could not do a port 80 retrieval from the server itself to the ISY. Oddly, though, I am able to communicate with other hardware devices on the network (such as my MiLight Gateway) from the server and its VMs. Additionally, the ISY responds to pings, so it is reachable.

    Time for a new proxy

    Now, one of the VMs on my server was an Ubuntu VM serving as an NGINX reverse proxy. For various reasons, I wanted to move that from a virtual machine to a physical box. This seemed like a good time to see if a new proxy would lead to different results.

    I had an old Raspberry Pi 3B+ lying around, and it seemed like the perfect candidate for a standalone proxy. So I imaged an SD card with Ubuntu 20, copied my Nginx configuration files from the VM to the Pi, and re-routed my firewall traffic to the new proxy.

    Not only did that work, but it solved the issue of ISY connectivity. Routing traffic through the Pi, I am able to communicate with the ISY reliably from my server, all of my VMs, and other PCs on the network.

    But, why?

    Well, that is the million dollar question, and, frankly, I have no clue. Perhaps it has to do with the NIC teaming on the server, or some oddity in the network configuration on the server. But I burned way too many hours on it to want to dig more into it.

    You may be asking, why a hardware proxy? I liked the reliability and smaller footprint of a dedicated Raspberry Pi proxy, external to the server and any VMs. It made the networking diagram much simpler, as traffic now flows neatly from my gateway to the proxy and then to the target machine. It also allows me to control traffic to the server in a more granular fashion, rather than having ALL traffic pointed to a VM on the server and then routed via proxy from there.