Author: Matt

  • Platform Engineering

    As I continue to build out some reference architecture applications, I realized that there was a great deal of boilerplate code that I add to my APIs to get things running. Time for a library!

    Enter the “Platform”

    I am generally terrible at naming things, but Spydersoft.Platform seemed like a good base namespace for this one. The intent is to put the majority of my boilerplate code into a set of libraries that can be referenced to make adding stuff easier.

    But what kind of “stuff”? Well, for starters:

    • Support for OpenTelemetry trace, metrics, and logging
    • Serilog for console logging
    • Simple JWT identity authentication (for my APIs)
    • Default Health Check endpoints

    Going deep with Health Checks

    The first three were pretty easy: just some POCOs for options and then startup extensions to add the necessary items with the proper configuration. With health checks, however, I went a little overboard.
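
    Before going overboard, though, here is a quick sketch of that easy pattern (the option and method names are illustrative, not the library's actual API): an options POCO bound from configuration, plus an extension method that registers the services.

    // Illustrative only -- the real Spydersoft.Platform API may differ.
    // Requires the OpenTelemetry.Extensions.Hosting and
    // OpenTelemetry.Instrumentation.AspNetCore packages.
    using Microsoft.AspNetCore.Builder;
    using Microsoft.Extensions.Configuration;
    using Microsoft.Extensions.DependencyInjection;
    using OpenTelemetry.Metrics;
    using OpenTelemetry.Resources;
    using OpenTelemetry.Trace;

    public class TelemetryOptions
    {
        public const string SectionName = "Telemetry";
        public string ServiceName { get; set; } = "reference-api";
    }

    public static class TelemetryStartupExtensions
    {
        public static WebApplicationBuilder AddPlatformTelemetry(this WebApplicationBuilder builder)
        {
            // Bind the options POCO from configuration.
            var options = new TelemetryOptions();
            builder.Configuration.GetSection(TelemetryOptions.SectionName).Bind(options);

            // Register OpenTelemetry tracing and metrics for the service.
            builder.Services.AddOpenTelemetry()
                .ConfigureResource(resource => resource.AddService(options.ServiceName))
                .WithTracing(tracing => tracing.AddAspNetCoreInstrumentation())
                .WithMetrics(metrics => metrics.AddAspNetCoreInstrumentation());

            return builder;
        }
    }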

    My goal was to be able to implement IHealthCheck anywhere and decorate it in such a way that it would be added to the health check framework and could be tagged. Furthermore, I wanted to use tags to drive standard endpoints.

    In the end, I used a custom attribute and some reflection to add the checks that are found in the loaded AppDomain. I won’t bore you: the documentation should do that just fine.
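
    The general shape, though, looks something like this hypothetical sketch (the attribute and extension method names are mine for illustration, not the library's actual API): an attribute carries a name and tags, and a startup extension scans the loaded assemblies for decorated IHealthCheck implementations.

    using System;
    using System.Linq;
    using Microsoft.Extensions.DependencyInjection;
    using Microsoft.Extensions.Diagnostics.HealthChecks;

    // Hypothetical attribute: marks an IHealthCheck and carries its name and tags.
    [AttributeUsage(AttributeTargets.Class)]
    public class PlatformHealthCheckAttribute : Attribute
    {
        public PlatformHealthCheckAttribute(string name, params string[] tags)
        {
            Name = name;
            Tags = tags;
        }

        public string Name { get; }
        public string[] Tags { get; }
    }

    public static class HealthCheckStartupExtensions
    {
        public static IServiceCollection AddPlatformHealthChecks(this IServiceCollection services)
        {
            var builder = services.AddHealthChecks();

            // Find decorated, concrete IHealthCheck types in the loaded AppDomain.
            // (Real code should guard against ReflectionTypeLoadException here.)
            var checkTypes = AppDomain.CurrentDomain.GetAssemblies()
                .SelectMany(assembly => assembly.GetTypes())
                .Where(type => typeof(IHealthCheck).IsAssignableFrom(type) && !type.IsAbstract);

            foreach (var type in checkTypes)
            {
                var attribute = type.GetCustomAttributes(typeof(PlatformHealthCheckAttribute), false)
                    .Cast<PlatformHealthCheckAttribute>()
                    .FirstOrDefault();

                if (attribute is null)
                {
                    continue;
                }

                // Register with a factory so each check can resolve its own dependencies,
                // and carry the tags so endpoints can filter on them later.
                builder.Add(new HealthCheckRegistration(
                    attribute.Name,
                    provider => (IHealthCheck)ActivatorUtilities.CreateInstance(provider, type),
                    failureStatus: null,
                    tags: attribute.Tags));
            }

            return services;
        }
    }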

    But can we test it?

    Testing startup extensions is, well, interesting. Technically, it is an integration test, but I did not want to set up Playwright tests to execute the API tests. Why? Well, usually API integration tests are run against a particular configuration, but in this case, I needed to run the reference application with a lot of different configurations in order to fully test the extensions. Enter WebApplicationFactory.

    With WebApplicationFactory, I was able to configure tests to stand up a copy of the reference application with different configurations. I could then verify the configuration using some custom health checks.
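
    A rough sketch of one such test (using xUnit here; the configuration key and endpoint are placeholders rather than the library's actual settings):

    using System.Collections.Generic;
    using System.Net;
    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Hosting;
    using Microsoft.AspNetCore.Mvc.Testing;
    using Microsoft.Extensions.Configuration;
    using Xunit;

    // Program is the reference application's entry point (exposed to the test
    // project, e.g. via a partial class declaration).
    public class HealthEndpointTests : IClassFixture<WebApplicationFactory<Program>>
    {
        private readonly WebApplicationFactory<Program> _factory;

        public HealthEndpointTests(WebApplicationFactory<Program> factory)
        {
            // Stand up the reference application with test-specific configuration.
            _factory = factory.WithWebHostBuilder(builder =>
                builder.ConfigureAppConfiguration((_, config) =>
                    config.AddInMemoryCollection(new Dictionary<string, string>
                    {
                        ["HealthChecks:Enabled"] = "true"
                    })));
        }

        [Fact]
        public async Task LivenessEndpoint_ReturnsOk()
        {
            var client = _factory.CreateClient();

            var response = await client.GetAsync("/health/live");

            Assert.Equal(HttpStatusCode.OK, response.StatusCode);
        }
    }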

    I am on the fence as to whether or not this is a “unit” test or an “integration” test. I’m not calling out to any other application, which is usually the definition of an integration test. But I did have to configure a reference application in order to get things tested.

    Whatever you call it, I have coverage on my startup extensions, and even caught a few bugs while I was writing the tests.

    Make it truly public?

    Right now, the build publishes the NuGet package to my private NuGet feed. I am debating moving it to NuGet.org (or maybe GitHub's package feed). The code is open source, and I would like the library itself to be openly available as well. But until I make the decision on where to put it, I will keep it in my private feed. If you have any interest in it, watch or star the repo on GitHub: it will help me gauge the level of interest.

  • Supporting a GitHub Release Flow with Azure DevOps Builds

    It has been a busy few months, and with the weather changing, I have a little more time in front of the computer for hobby work. Some of my public projects were in need of a few package updates, so I started down that road. Most of the updates were pretty simple: a few package updates and some Azure DevOps step template updates and I was ready to go. However, I had been delaying my upgrade to GitVersion 6, and in taking that leap, I changed my deployment process slightly.

    Original State

    My current development process supports three environments: test, stage, and production. Commits to feature/* branches are automatically deployed to the test environment, and any builds from main are first deployed to stage and then can be deployed to production.

    For me, this works: I am usually only working on one branch at a time, so publishing feature branches to the test environment works. When I am done with a branch, I merge it into main and get it deployed.

    New State

    As I have been working through some processes at work, it occurred to me that versions are about release, not necessarily commits. While commits can help us number releases, they shouldn’t be the driving force. GitVersion 6 and its new workflow defaults drive this home.

    So my new state would be pretty similar: feature/* branches get deployed to the test environment automatically. The difference lies in main: I no longer want to release with every commit to main. I want to be able to control releases through the use of tags (and GitHub releases, which generate tags).

    So I flipped over to GitVersion 6 and modified my GitVersion.yml file:

    workflow: GitHubFlow/v1
    merge-message-formats:
      pull-request: 'Merge pull request \#(?<PullRequestNumber>\d+) from'
    branches:
      feature:
        mode: ContinuousDelivery

    I modified my build pipeline to always build, but only trigger a release for feature/* branch builds and builds from a tag. I figured this would work fine, but Azure DevOps threw me a curveball.
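
    In pipeline terms, the release gate boils down to a condition along these lines (a simplified sketch, assuming dedicated Build and Release stages):

    # Sketch: the build stage always runs; the release stage only runs for
    # feature/* branch builds or builds triggered from a tag.
    - stage: Release
      dependsOn: Build
      condition: >
        and(succeeded(),
            or(startsWith(variables['Build.SourceBranch'], 'refs/heads/feature/'),
               startsWith(variables['Build.SourceBranch'], 'refs/tags/')))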

    Azure DevOps Checkouts

    When you build from a tag, Azure DevOps checks that tag out directly, using the /tags/<tagname> branch reference. When I tried to run GitVersion on this, I got a weird version number: a build on tag 1.3.0 resulted in 1.3.1-tags-1-3-0.1.

    I dug into GitVersion’s default configuration and noticed this corresponded with the unknown branch configuration. To work around the Azure DevOps behavior, I had to configure the tags/ branches:

    workflow: GitHubFlow/v1
    merge-message-formats:
      pull-request: 'Merge pull request \#(?<PullRequestNumber>\d+) from'
    branches:
      feature:
        mode: ContinuousDelivery
      tags:
        mode: ManualDeployment
        label: ''
        increment: Inherit
        prevent-increment:
          when-current-commit-tagged: true
        source-branches:
        - main
        track-merge-message: true
        regex: ^tags?[/-](?<BranchName>.+)
        is-main-branch: true

    This treats tags as main branches when calculating the version.

    Caveat Emptor

    This works if you ONLY tag your main branch. If you are in the habit of tagging other branches, this will not work for you. However, I only ever release from main branches, and I am in a fix-forward scenario, so this works for me. If you use release/* branches and need builds from there, you may need additional GitVersion configuration to get the correct version numbers to generate.

  • When “as code” makes a difference

    I spent a considerable amount of time setting up my home lab with a high degree of infrastructure and deployment “as code.” Googling “Infrastructure as Code” or “Declarative GitOps” will highlight the breadth of this topic, and I have no less than 10 different posts on my current setup. So what did all this effort get me?

    Effortless Updates

    A quick PowerShell script lets me update my GitOps repositories with the latest versions of the applications I am running. With the configurability of ArgoCD, however, those updates are not immediately rolled out. My ArgoCD configurations are set up for manual sync, which gives me the ability to compare changes before they are applied.

    Could I automatically sync? Well, sure, and 9 times out of 10, it would work just fine. But more than once I have run into updates that required some additional preparation or conversion, so I keep the ability to hold off on upgrades until I am ready.

    Helpful Rollbacks

    Even after synchronization, sometimes things do not go according to plan. Recently, as an example, an upgrade to Argo 2.12 broke my application sets because of a templating issue. Had I been manually managing my applications, that would have meant a manual downgrade or a hacky workaround. Now, well, I just rolled back to the previous version that I had deployed and will patiently await a fix.

    Disaster Recovery

    My impatience caused me to wreck my non-production cluster beyond repair. With my declarative GitOps setup, restoring that cluster was pretty simple:

    • Create a new cluster
    • Add the new cluster to ArgoCD
    • Modify the cluster secret in Argo with labels to install my cluster tools
    • Modify the applications to use the new cluster URL

    As it was my non-production instance, I did not have any volumes/data that needed to be transferred over, so I have not yet tested that particular bit. However, since my volumes are mounted with consistent name generation, I believe data transfers should work equally well.

    Conclusion

    Even in my home lab, a level of “as code” helps keep things running smoothly. You should try it!

  • A Quick WSL Swap

    I have been using WSL and Ubuntu 22.04 a lot more in recent weeks. From virtual environments for Python development to the ability to use Podman to run container images, the tooling supports some of the work I do much better than Windows does.

    But Ubuntu 22.04 is old! I love the predictable LTS releases, but two years is an eternity in software, and I was looking forward to the 24.04 release.

    Upgrade or Fresh Start?

    I looked at a few options for upgrading my existing Ubuntu 22.04 WSL instance, but I really did not like what I read. The guidance basically suggested it was a “try at your own risk” scenario.

    I took a quick inventory of what was actually on my WSL image. As it turns out, not too much. Aside from some of my standard profile settings, I only have a few files that were not available in some of my GitHub repositories. Additionally, since you can have multiple instances of WSL running, the easiest solution I could find was to stand up a new 24.04 image and copy my settings and files over.

    Is that it?

    Shockingly, yes. Installing 24.04 is as simple as opening it in the Microsoft Store and downloading it. Once that was done, I ran through the quick provisioning to set up the basics, and then copied my profile and files.

    I was able to utilize scp for most of the copying, although I also realized that I could copy files from Windows using the \\wsl.localhost paths. Either way, it didn’t take very long before I had Ubuntu 24.04 up and running.

    I still have 22.04 installed, and I haven’t deleted that image just yet. I figure I’ll keep it around for another month and, if I don’t have to turn it back on, I probably don’t need anything on it.

  • My Very Own Ship of Theseus

    A while back, I wrote a little about how the “Ship of Theseus” thought experiment has parallels to software design. What I did not realize is that I would end up running into a physical “Ship of Theseus” of my own.

    Just another day

    On a day where I woke up to stories of how a CrowdStrike update wreaked havoc with thousands of systems, I was overly content with my small home lab setup. No CrowdStrike installed, primarily Ubuntu nodes… Nothing to worry about, right?

    Confident that I was in the clear, I continued the process of cycling my Kubernetes nodes to use Ubuntu 24.04. I have been pretty methodical about this, just to make sure I am not going to run into anything odd. Having converted my non-production cluster last week, I started work on my internal cluster. I got the control plane nodes updated, but the first agent I tried was not spinning up correctly.

    Sometimes my server gets a little busy, and a quick reset helps clear some of the background work. So I reset… And it never booted again.

    What Happened?

    The server would boot to a certain point (right after the Thermal Calibration step), hang for about 10-15 minutes, and then report a drive array failure. Uh oh…

    I dug through some logs on the Integrated Lights-Out (iLO) system and did some Google sleuthing on the errors I was seeing. The conclusion I came to was that the on-board drive controller went kaput. At this point, I was dead in the water. And then I remembered I had another server…

    Complete Swap

    The other server was much lighter on spec: a single 8-core CPU, 64 GB of RAM, and nowhere near the disk space. Not to mention, with a failed drive controller, I wasn’t getting any data off of those RAID disks.

    But the servers themselves are both HP ProLiant DL380P Gen 8 servers. So I started thinking: could I just transfer everything except the system board to the backup server?

    The short answer: Yes.

    I pulled all the RAM modules and installed them in the backup. I pulled both CPUs from the old server and installed them in the backup. I pulled all of the hard drives out and installed them in the backup. I even transferred both power backplanes so that I would have dual plugs.

    The Moment of Truth

    After all that was done, I plugged it back in and logged in to the backup server’s iLO. It started up, but pointed me to the RAID utilities, because one of the arrays needed to be rebuilt. A few hours later, the drives were rebuilt, and I restarted. Much to my shock, it booted up as if it were the old server.

    Is it a new server? Or just a new system board in the old server? All I know is, it is running again.

    Now, however, I am short on replacement parts, so I am going to have to start thinking about either stocking up on spares or looking into a different lab setup.

  • Moving to Ubuntu 24.04

    I have a small home lab running a few Kubernetes clusters, and a good bit of automation to deal with provisioning servers for those clusters. All of my Linux VMs are based on Ubuntu 22.04. I prefer to stick with LTS for stability and compatibility.

    As April turns into July (missed some time there), I figured Ubuntu’s latest LTS (24.04) has matured to the point that I could start the process of updating my VMs to the new version.

    Easier than Expected

    In my previous move from 20.04 to 22.04, there were some changes to the automated installers for 22.04 that forced me down the path of testing my packer provisioning with the 22.04 ISOs. I expected similar changes with 24.04. I was pleasantly surprised when I realized that my existing scripts should work well with the 24.04 ISOs.

    I did spend a little time updating the Azure DevOps pipeline that builds a base image so that it supports building both a 22.04 and a 24.04 image. I want to make sure I have the option to use the 22.04 images, should I find a problem with 24.04.

    Migrating Cluster Nodes

    With a base image provisioned, I followed my normal process for upgrading cluster nodes on my non-production cluster. There were a few hiccups, mostly around some of my automated scripts needing the appropriate settings to set hostnames correctly.

    Again, other than some script debugging, the process worked with minimal changes to my automation scripts and my provisioning projects.

    Azure DevOps Build Agent?

    Perhaps in a few months. I use the GitHub runner images as a base for my self-hosted agents, but there are some changes that need manual review. I destroy my Azure DevOps build agent weekly and generate a new one, and that’s a process that I need to make sure continues to work through any changes.

    The issue is typically time: the build agents take a few hours to provision because of all the tools that are installed. Testing that takes time, so I have to plan ahead. Plus, well, it is summertime, and I’d much rather be in the pool than behind the desk.

  • Drop that zero…

    I ran into a very weird issue with NuGet packages and the old packages.config reference style.

    NuGet vs. Semantic Versioning

    NuGet grew up in Windows, where assembly version numbers support four numbers: major.minor.build.revision. Therefore, NuGetVersion supports all four version segments. Semantic versioning, on the other hand, supports three numbers plus additional labels.

    As part of NuGet’s version normalization, in an effort to better support semantic versioning, the fourth version segment is dropped if it is zero. So 1.2.3.0 becomes 1.2.3. In general, this does not present any problems, since the version numbers are retrieved from the feed by the package manager tools and the references are updated accordingly.

    Always use the tools provided

    When you ignore the tooling, well, stuff can get weird. This is particularly true in the old packages.config reference style.

    In that style, packages are listed in a packages.config file, and the .NET project file adds a reference to the DLL with a HintPath. That HintPath includes the folder where the package is installed, something like this:

    <ItemGroup>
        <Reference Include="MyCustomLibrary, Version=1.2.3.4, Culture=neutral, processorArchitecture=MSIL">
          <HintPath>..\packages\MyCustomLibrary.1.2.3.4\lib\net472\MyCustomLibrary.dll</HintPath>
        </Reference>
    </ItemGroup>

    But, for argument’s sake, let us assume we publish a new version of MyCustomLibrary, version 1.2.4. Even though the AssemblyVersion might be 1.2.4.0, the NuGet version will be normalized to 1.2.4. Now suppose that, instead of upgrading the package using one of the package manager tools, you update the project file manually, like this:

    <ItemGroup>
        <Reference Include="MyCustomLibrary, Version=1.2.4.0, Culture=neutral, processorArchitecture=MSIL">
          <HintPath>..\packages\MyCustomLibrary.1.2.4.0\lib\net472\MyCustomLibrary.dll</HintPath>
        </Reference>
    </ItemGroup>

    This can cause weird issues. The project will most likely build, with a warning about not being able to find the DLL. Depending on how the package is used or referenced, you may not even get a build error (I didn’t get one), but the build output will not include the required library.
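
    In this example, the folder that actually exists on disk is MyCustomLibrary.1.2.4 (the normalized NuGet version), while the assembly identity keeps all four segments. So a correct reference would look more like this:

    <ItemGroup>
        <Reference Include="MyCustomLibrary, Version=1.2.4.0, Culture=neutral, processorArchitecture=MSIL">
          <HintPath>..\packages\MyCustomLibrary.1.2.4\lib\net472\MyCustomLibrary.dll</HintPath>
        </Reference>
    </ItemGroup>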

    Moving on…

    The “fix” is easy: use the NuGet tools (either the CLI or the Visual Studio Package Manager) to update the packages. They will generate the appropriate HintPath for the package that is installed. An even better solution is to migrate to the PackageReference style, where the project file includes the NuGet references directly and packages.config is not used. That style surfaces errors immediately if an incorrect version is referenced.

  • Isolating your Azure Functions

    I spent a good bit of time over the last two weeks converting our Azure functions from the in-process to the isolated worker process model. Overall the transition was fairly simple, but there were a few bumps in the proverbial road worth noting.

    Migration Process

    Microsoft Learn has a very detailed how-to guide for this migration. The guide includes steps for updating the project file and references, as well as additional packages that are required based on various trigger types.

    Since I had a number of functions to process, I followed the guide for the first one, and that worked swimmingly. However, then I got lazy and started the “copy-paste” conversion. In that laziness, I missed a particular section of the project file:

    <ItemGroup>
      <Using Include="System.Threading.ExecutionContext" Alias="ExecutionContext"/>
    </ItemGroup>

    Unfortunately, if you forget this, you will not break your local development environment. However, when you publish the function to Azure, it will not execute correctly.

    Fixing Dependency Injection

    When using the in-process model, there are some “freebies” that get added to the dependency injection system, as if by magic. ILogger, in particular, could be injected automatically into the function (as a function parameter). In the isolated model, however, you must get ILogger from either the FunctionContext or through dependency injection into the class.

    As part of our conversion, we removed the function parameters for ILogger and replaced them with service instances retrieved through dependency injection at the class level.
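
    A trimmed-down example of the resulting shape (not our actual function) looks like this:

    using Microsoft.Azure.Functions.Worker;
    using Microsoft.Azure.Functions.Worker.Http;
    using Microsoft.Extensions.Logging;

    public class OrdersFunction
    {
        private readonly ILogger<OrdersFunction> _logger;

        // ILogger arrives through constructor injection instead of a function parameter.
        public OrdersFunction(ILogger<OrdersFunction> logger)
        {
            _logger = logger;
        }

        [Function("GetOrders")]
        public HttpResponseData Run(
            [HttpTrigger(AuthorizationLevel.Function, "get")] HttpRequestData request)
        {
            _logger.LogInformation("Handling GetOrders request.");

            return request.CreateResponse(System.Net.HttpStatusCode.OK);
        }
    }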

    What we did not realize until we got our functions into the test environments was that IHttpContextAccessor was not available in the isolated model. Apparently, that particular interface is registered automatically as part of the in-process model, but it is not added in the isolated model. So we had to register IHttpContextAccessor in our services collection in the Program.cs file.
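
    The registration itself is a one-liner in Program.cs. Here is a minimal sketch; depending on how you host the worker (especially with the ASP.NET Core integration), your Program.cs may look different:

    using Microsoft.Extensions.DependencyInjection;
    using Microsoft.Extensions.Hosting;

    var host = new HostBuilder()
        .ConfigureFunctionsWorkerDefaults()
        .ConfigureServices(services =>
        {
            // The isolated model does not register this for you.
            services.AddHttpContextAccessor();
        })
        .Build();

    host.Run();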

    It is never easy

    Upgrades or migrations are never just “change this and go.” As much as we try to make them easy, there always seems to be a little change here or there that ends up being a fly in the ointment. In our case, we simply assumed that IHttpContextAccessor was there because the in-process model put it there, and the code which needed it was a few layers deep in the dependency tree. The only way to find it was to make the change and see what breaks. And that is what keeps quality engineers up at night.

  • I appreciate feedback, but…

    I am really tired of deleting hundreds of spam comments every couple of days. While I have had a few posts generate some good feedback, generally, all I get is spam.

    It was not too bad until the last few months, when spam volume increased by an order of magnitude. I would rather not burn resources, even for a few days, on ridiculous incoming spam.

    So, while I really appreciate any feedback on my posts, you will have to find another channel through which to contact me. The effort of managing spam comments far outweighs anything I have gained from the comments I have received.

  • Hitting my underwater stride

    It’s not always about tech. A recent trip to Cozumel has only strengthened my resolve to continue my underwater adventures.

    Hitting the Road

    Neither my wife nor I have ever been to Cozumel. Sure, we have been to Mexico a few times, including taking my kids to Cancun the past few summers. But, and I cannot quite stress this enough, Cozumel isn’t quite Mexico. This quiet little island situated about 12 miles off of the Mexican shores of Quintana Roo is a tourist mecca.

    We were able to get 5 nights away this time. Rather than dive four mornings, we took the opportunity to rent a Jeep and drive around the island. You can pretty much divide Cozumel into 4 parts:

    1. Town: San Miguel de Cozumel is the port city where multiple cruise ships can dock and offload their thousands of passengers. Plenty of shops, restaurants, beach clubs, and activities are available.
    2. Leeward beaches: The leeward beaches on the west side of the island, south of town, are either resorts or beach clubs which charge an admission fee. Most of the coast is rocky, but little wave action and coarse white sand make for a great beach day.
    3. Windward beaches: The east side of the island has significantly more wave action, with some beaches that offer a little more fine sand (more waves = finer sand). Still rocky, but more opportunity for water activities like kite surfing and surfing.
    4. Nature preserve: The north end of Cozumel is mostly natural preserve. There are some beach clubs and islands north of town, but we did not venture in that direction.

    The island caters to cruise ships. Certain activities, including the Mayan ruins, are only open when cruise ships are in port. “No cruise ships, no money” was a phrase I heard more than a few times. As we rented our Jeep on a day with no cruise ships, we missed out on some of those activities. We also missed out on the mass of humanity coming from those ships, so I was not terribly mad.

    If you venture to Cozumel, bring cash! Many places on the east side of the island are remote, with little cellular signal of any kind. Many places do not take credit cards, or charge a service fee for using them. The west side is a little more tourist friendly, but it’s never a bad idea to have some cash. Most places seem to accept the US dollar, but pesos aren’t a terrible idea.

    Jump In!

    The Cozumel barrier reef is part of one of the largest reef systems on the planet. A quick glance in the luggage area at the airport will tell you it is a scuba diver’s destination. There are a ton of dive shops on the island, so many that I used two different ones for my three dive days.

    Nearly everyone does a two-tank dive, with prices ranging from $80 to $110 per two-tank trip. Gear rental is available; I had to rent a BCD and regulator, which set me back about $25 USD a day.

    In 6 dives, we dove 6 unique spots. Both dive shops did a “deep/shallow” pairing, with the deep dives being wall dives in the 75-85 foot range and the shallow dives in the 50-65 foot range. One thing that caught my eye was the lack of attention to certification levels.

    I got my PADI advanced open water certification last year so that I would be certified for depths up to 100 feet. PADI open water divers are only certified to 60 feet. By that standard, you need an advanced open water certification to dive the deeper walls. I’m fairly certain that some of the folks I dove with did not have that level of certification. Now, it is not my business: I will always dive within my limits. But taking someone to 80 feet when they have not had the additional training seems dangerous, not to mention a bit of a liability.

    Both dive shops, though, were accommodating during the dives. This trip marked dives 18 through 23 for me, and I can feel myself getting more comfortable. But as comfortable as it is, it is never truly comfortable. There is an element of risk in every dive, and situational awareness is critical to keeping yourself and your dive buddy safe. I find myself becoming more aware with each dive, and with that awareness comes a greater appreciation for the sights of the reef.

    What did I see? Well, a ton of aquatic life, but the highlights have to be a 6 ft blacktip shark, a sea turtle, a couple large rays, and a few large Caribbean lobsters.

    Next trip?

    These dives brought my grand total to 23. Diving in Cozumel, I’m sitting next to folks who are easily in the hundreds, but never once was I intimidated. I have been very fortunate: my dive groups have been nothing but helpful. I get helpful pointers on nearly every dive, and it has made me more comfortable in the water.

    The only question is, where to next?