Pack, pack, pack, they call him the Packer….

Through sheer happenstance I came across a posting for The Jaggerz playing near me and was taken back to my first time hearing “The Rapper.” I happened to go to school with one of the members’ kids, which made it all the more fun to reminisce.

But I digress. I spent time a while back getting Packer running at home to take care of some of my machine provisioning. At work, I have been looking for an automated mechanism to keep some of our build agents up to date, so I revisited this and came up with a plan involving Packer and Terraform.

The Problem

My current problem centers on the need to update our machine images weekly while still using Terraform to manage our infrastructure. In the case of Azure DevOps, we can provision VM Scale Sets and assign those Scale Sets to an Azure DevOps agent pool. But when I want to update that image, I can do it in two different ways:

  1. Using Azure CLI, I can update the Scale Set directly.
  2. I can modify the Terraform repository to update the image and then re-run Terraform.

Now, #1 sounds easy, right? Run a command and I’m done. But it defeats the purpose of Terraform, which is to maintain infrastructure as code. So, I started down path #2.

Packer Revisit

I previously used Packer to provision Hyper-V VMs, and the azure-arm builder is pretty similar. I was able to configure a simple Windows-based VM and get the only application I needed installed with a PowerShell script.

One app? On a build agent? Yes, this is a very particular agent, and I didn’t want to install it everywhere, so I created a single agent image with the necessary software.

Mind you, I have been using the runner-images Packer project to build my Ubuntu agent at home, and we use it to build both Windows and Ubuntu images at work, so, by comparison, my project is wee tiny. But it gives me a good platform to test. So I put together a small repository with a basic template and a PowerShell script to install my application, and it was time to build.
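To give a feel for how small that template is, here is a stripped-down sketch of it. The resource names, SKU, and script name are placeholders of my own, not the actual values, and the credential settings are omitted (they come in through variables):

```hcl
variable "vm_name" {
  type = string
}

source "azure-arm" "agent" {
  os_type         = "Windows"
  image_publisher = "MicrosoftWindowsServer"
  image_offer     = "WindowsServer"
  image_sku       = "2022-datacenter-g2"

  # The image name is passed in at build time
  managed_image_name                = var.vm_name
  managed_image_resource_group_name = "rg-images"   # placeholder

  location = "eastus"
  vm_size  = "Standard_D4s_v5"

  communicator   = "winrm"
  winrm_use_ssl  = true
  winrm_insecure = true
  winrm_username = "packer"
}

build {
  sources = ["source.azure-arm.agent"]

  # The one application this agent needs, installed by script
  provisioner "powershell" {
    script = "install-app.ps1"   # placeholder name
  }
}
```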

Creating the Build Pipeline

My build process should be, for all intents and purposes, one step that runs the packer build command, which will create the image in Azure. I found the PackerBuild@1 task and thought my job was done. It would seem the Azure DevOps task hasn’t kept up with the times; either that, or Packer’s CLI needs help.

I wanted to use the PackerBuild@1 task to take advantage of the service connection. I figured, if I could run the task with a service connection, I wouldn’t have to store service principal credentials in a variable library. As it turns out… well, I would have to do that anyway.

When I tried to run the task, I got an error that “packer fix only supports json.” My template is in HCL format, and everything I have seen suggests that Packer would rather move to HCL. Not to be beaten, I looked at the code for the task to see if I could skip the fix step.

Not only could I not skip that step, but when I dug into the task, I noticed that I wouldn’t be able to use the service connection parameter with a custom template. So with that, my dreams of using a fancy task went out the door.

Plan B? Use Packer’s ability to grab environment variables as default values and set the environment variables in a PowerShell script before I run the Packer build. It is not super pretty, but it works.

- pwsh: | 
    $env:ARM_CLIENT_ID = "$(azure-client-id)"
    $env:ARM_CLIENT_SECRET = "$(azure-client-secret)"
    $env:ARM_SUBSCRIPTION_ID = "$(azure-subscription-id)"
    $env:ARM_TENANT_ID = "$(azure-tenant-id)"
    packer build --var-file values.pkrvars.hcl -var vm_name=vm-image-$(Build.BuildNumber) windows2022.pkr.hcl
  displayName: Build Packer
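The “environment variables as default values” piece is Packer HCL’s env() function, which is only allowed inside a variable’s default. Something along these lines, with variable names of my own choosing:

```hcl
variable "client_id" {
  type      = string
  default   = env("ARM_CLIENT_ID")
  sensitive = true
}

variable "client_secret" {
  type      = string
  default   = env("ARM_CLIENT_SECRET")
  sensitive = true
}

variable "subscription_id" {
  type    = string
  default = env("ARM_SUBSCRIPTION_ID")
}

variable "tenant_id" {
  type    = string
  default = env("ARM_TENANT_ID")
}

# Then, inside the azure-arm source block:
#   client_id       = var.client_id
#   client_secret   = var.client_secret
#   subscription_id = var.subscription_id
#   tenant_id       = var.tenant_id
```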

On To Terraform!

The next step was terraforming the VM Scale Set. If you are familiar with Terraform, the VM Scale Set resource in the AzureRM provider is pretty easy to use. I used the Windows VM Scale Set, as my agents will be Windows-based. The only “trick” is finding the image you created, but, thankfully, that can be done by name using a data block.

data "azurerm_image" "image" {
  name                = var.image_name
  resource_group_name = data.azurerm_resource_group.vmss_group.name
}

From there, just set source_image_id to data.azurerm_image.image.id, and you’re good. Why look this up by name? It makes automation very easy.
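Put together, the relevant slice of the scale set resource might look something like this. The names, SKU, and networking values are illustrative only:

```hcl
resource "azurerm_windows_virtual_machine_scale_set" "agents" {
  name                = "vmss-build-agents"
  resource_group_name = data.azurerm_resource_group.vmss_group.name
  location            = data.azurerm_resource_group.vmss_group.location
  sku                 = "Standard_D4s_v5"
  instances           = 2
  admin_username      = "agentadmin"
  admin_password      = var.admin_password

  # The custom image built by Packer, looked up by name in the data block
  source_image_id = data.azurerm_image.image.id

  # Azure DevOps manages the instance count itself, so overprovisioning
  # is typically turned off for scale set agent pools
  overprovision = false

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "StandardSSD_LRS"
  }

  network_interface {
    name    = "nic-agents"
    primary = true

    ip_configuration {
      name      = "internal"
      primary   = true
      subnet_id = var.subnet_id
    }
  }
}
```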

Gluing the two together

So I have a pipeline that builds an image, and I have another pipeline that executes the Terraform plan/apply steps. The latter is triggered on a commit to main in the Terraform repository, so how can I trigger a new build?

All I really need to do is “reach in” to the Terraform repository, update the variable file with the new image name, and commit it. This can be automated, and I spent a lot of time doing just that as part of implementing our GitOps workflow. In fact, as I type this, I realize that I probably owe a post or two on how exactly we have done that. But, using some scripted git commands, it is pretty easy.

So, my Packer build pipeline will check out the Terraform repository, change the image name in the variable file, and commit. This is where the image name is important: Packer didn’t spit out the Azure image ID (at least, not that I saw), so having a known name makes it easy for me to just tell Terraform to use the new image name and let the data block look up the ID.
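As a rough sketch, that glue step could look like the following. The repository URL, variable file name, and replacement pattern are placeholders, not our actual implementation, and pushing with System.AccessToken assumes the build identity has Contribute rights on the Terraform repository:

```yaml
- pwsh: |
    git clone "https://build:$(System.AccessToken)@dev.azure.com/my-org/my-project/_git/terraform-infra"
    Set-Location terraform-infra
    # Point the image_name variable at the image Packer just built
    (Get-Content values.auto.tfvars) `
      -replace 'image_name\s*=.*', 'image_name = "vm-image-$(Build.BuildNumber)"' |
      Set-Content values.auto.tfvars
    git config user.email "build@example.com"
    git config user.name "Build Pipeline"
    git commit -am "Use image vm-image-$(Build.BuildNumber)"
    git push origin HEAD:main
  displayName: Update Terraform image name
```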

What’s next?

This was admittedly pretty easy, but only because I have been using Packer and Terraform for some time now. The learning curve is steep, but as I look across our portfolio, I can see areas where these practices can help us build fresh machine images on a regular cadence and stop treating our servers as pets. I hope to document some of this for our internal teams and start driving them down a path of better deployment.

