Tag: Terraform

  • Terraform Azure DevOps

    As a continuation of my efforts to use Terraform to manage my Azure Active Directory instance, I moved my Azure DevOps configuration into a Terraform project, and cleaned a lot up in the process.

    New Project, same pattern

    As I mentioned in my last post, I set up my repository to support multiple Terraform projects. So starting an Azure DevOps Terraform project was as simple as creating a new folder in the terraform folder and setting up the basics.

    As with my Azure AD project, I’m using the S3 backend. For providers, this project only needs the Azure DevOps and Hashicorp Vault providers.
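
    For reference, the provider setup is minimal. Here is a rough sketch of what that looks like; the version constraints are assumptions, and the Azure DevOps provider picks up its organization URL and personal access token from environment variables rather than from the configuration:

    terraform {
      required_providers {
        azuredevops = {
          source  = "microsoft/azuredevops"
          version = "~> 1.0" # version constraint is an assumption
        }
        vault = {
          source  = "hashicorp/vault"
          version = "~> 4.0" # version constraint is an assumption
        }
      }
    }

    provider "azuredevops" {
      # org_service_url and personal_access_token come from the
      # AZDO_ORG_SERVICE_URL and AZDO_PERSONAL_ACCESS_TOKEN environment variables
    }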

    The process was very similar to Azure AD: create resources in the project, and use terraform import to bring existing resources under the project's management. In this case, I tried to be as methodical as possible, following this pattern:

    1. Import a project.
    2. Import the project’s service connections.
    3. Import the project’s variable libraries.
    4. Import the project’s build pipelines.

    This order ensured that I brought objects into the project in a sequence where each could be referenced by the resources that depend on it.
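
    As an illustration of step 1, a minimal project definition plus its import looks roughly like this (the project name here is hypothetical; the azuredevops_project import accepts the project name or its GUID):

    resource "azuredevops_project" "home" {
      name = "Home" # hypothetical project name
    }

    # Then bring the existing project under management:
    # terraform import azuredevops_project.home "Home"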

    Handling Secrets

    When I got to service connections and libraries, it occurred to me that I needed to pull secrets out of my Hashicorp Vault instance to make this work smoothly. This is where the Vault provider came in handy: using a Terraform data source, I could pull secrets out of Vault and have them available to my project.
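
    A minimal sketch of what that looks like, assuming a KV v2 secrets engine and a hypothetical variable library (the mount, path, and names are made up, and the project reference comes from the earlier sketch):

    data "vault_kv_secret_v2" "sonar" {
      mount = "secret"                 # assumes a KV v2 engine mounted at "secret"
      name  = "azure-devops/sonarqube" # hypothetical secret path
    }

    resource "azuredevops_variable_group" "sonar" {
      project_id   = azuredevops_project.home.id
      name         = "sonarqube"
      allow_access = true

      variable {
        name         = "SONAR_TOKEN"
        secret_value = data.vault_kv_secret_v2.sonar.data["token"]
        is_secret    = true
      }
    }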

    Not only does this keep secrets out of the files (which is why I can share them all on GitHub), but it also means that cycling these secrets is as simple as changing the secret in Vault and re-running the Terraform apply. While I am not yet using this to its fullest extent, I have some ambitions to cycle these secrets automatically on a weekly basis.

    GitHub Authentication

    One thing I ran into was the authentication between Azure DevOps and GitHub. The ADO UI likes to use the built-in "GitHub App" authentication: when you click the Edit button in a pipeline, ADO defaults to asking GitHub for "app" permissions. The same thing happens if you manually create a new pipeline in the user interface. Either way, a service connection is automatically created in the project.

    You cannot create this service connection in a Terraform project, but you can let Terraform see it as a managed resource. To do that:

    1. Find the created service connection in your Azure DevOps project.
    2. Create a new azuredevops_serviceendpoint_github resource in your Terraform project with no authentication block (a sketch follows below). Here is mine for reference.
    3. Import the service connection to the newly created Terraform Resource.
    4. Make sure description is explicitly set to a blank string: ""

    That last step got me: if you don't explicitly set that value to blank, the provider tries to set the description to "Managed by Terraform". When it does, it attempts to validate the change, and since there is no authentication block, the validation fails.
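
    Put together, a minimal sketch of that resource looks something like this (the endpoint name should match whatever ADO generated; the project reference is from my earlier example):

    resource "azuredevops_serviceendpoint_github" "github_app" {
      project_id            = azuredevops_project.home.id
      service_endpoint_name = "My GitHub App connection" # must match the name ADO created
      description           = ""                         # explicitly blank, no authentication block
    }

    # Imported as <project-guid>/<service-endpoint-guid>:
    # terraform import azuredevops_serviceendpoint_github.github_app <project-id>/<service-connection-id>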

    What are those?!?

    An interesting side effect of this effort is seeing all the junk that exists in my Azure DevOps projects. I say "junk," but I mean unused variable libraries and service connections. This triggered my need for digital tidiness, so rather than importing them, I deleted them.

    I even went so far as to review some of the areas where service connections were passed into a pipeline, but never actually used. I ended up modifying a number of my Azure DevOps pipeline templates (and documenting them) to stop requiring connections that they ultimately were not using.

    It’s not done until it is automated!

    This is all great, but the point of Terraform is to keep my infrastructure in the state I intend it to be. This means automating the application of this project. I created a template pipeline in my repository that I could easily extend for new projects.

    I have a task on my to-do list to automate the execution of the Terraform plan on a daily basis and notify me if there are unexpected changes. This will serve as an alert that my infrastructure has changed, potentially unintentionally. For now, though, I will execute the Terraform plan/apply manually on a weekly basis.

  • Terraform Azure AD

    Over the last week or so, I realized that while I bang the drum of infrastructure as code very loudly, I have not been practicing it at home. I took some steps to rectify that over the weekend.

    The Goal

    I have a fairly meager home presence in Azure. Primarily, I use a free version of Azure Active Directory (now Entra ID) to allow for some single sign-on capabilities in external applications like Grafana, MinIO, and ArgoCD. The setup for this differs greatly among the applications, but common to all of these is the need to create applications in Azure AD.

    My goal is simple: automate provisioning of this Azure AD account so that I can manage these applications in code. My stretch goal was to get any secrets created as part of this process into my Hashicorp Vault instance.

    Getting Started

    The plan, in one word, is Terraform. Terraform has a number of providers, including both the azuread and vault providers. Additionally, since I have some experience in Terraform, I figured it would be a quick trip.

    I started by installing all the necessary tools (specifically, the Vault CLI, the Azure CLI, and the Terraform CLI) in my WSL instance of Ubuntu. Why there instead of PowerShell? Most of the tutorials lean towards bash syntax, so it was easier to roll through them without having to translate bash into PowerShell.

    I used my ops-automation repository as the source for this, and started by creating a new folder structure to hold my projects. As I anticipated more Terraform projects to come up, I created a base terraform directory, and then an azuread directory under that.

    Picking a Backend

    Terraform relies on state storage, and it uses the term backend to describe that storage. By default, Terraform uses the local file backend. This is great for development, but knowing that I wanted to get things running in Azure DevOps immediately, I decided to configure a backend that I can use from my machine as well as from my pipelines.

    As I have been using MinIO pretty heavily for storage, it made the most sense to configure MinIO as the backend, using the S3 backend to do this. It was “fairly” straightforward, as soon as I turned off all the nonsense:

    terraform {
      backend "s3" {
        skip_requesting_account_id  = true
        skip_credentials_validation = true
        skip_metadata_api_check     = true
        skip_region_validation      = true
        use_path_style              = true
        bucket                      = "terraform"
        key                         = "azuread/terraform.tfstate"
        region                      = "us-east-1"
      }
    }

    There are some obvious things missing: I set environment variables for the values I want to treat as secret, or at least not public.

    • MinIO Endpoint -> AWS_ENDPOINT_URL_S3 environment variable instead of endpoints.s3
    • Access Key -> AWS_ACCESS_KEY_ID environment variable instead of access_key
    • Secret Key -> AWS_SECRET_ACCESS_KEY environment variable instead of secret_key

    These settings allow me to use the same storage for both my local machine and the Azure Pipeline.

    Configuring Azure AD

    Likewise, I needed to configure the azuread provider. I followed the steps in the documentation, choosing the environment variable route again. I configured a service principal in Azure and gave it the necessary access to manage my directory.
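
    The provider block itself stays nearly empty; the service principal credentials come in through the standard environment variables the azuread provider reads:

    provider "azuread" {
      # Credentials are supplied via environment variables, locally and in the pipeline:
      #   ARM_TENANT_ID, ARM_CLIENT_ID, ARM_CLIENT_SECRET
    }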

    Using environment variables allows me to set these from variables in Azure DevOps, meaning my secrets are stored in ADO (or Vault, or both…. more on that in another post).

    Importing Existing Resources

    I have a few resources that already exist in my Azure AD instance, enough that I didn't want to re-create them and then re-configure everything that uses them. Luckily, most Terraform providers allow importing existing resources, and most of the resources I have support it.

    Importing is fairly simple: you create the simplest definition of a resource that you can, and then run a terraform import variant to import that resource into your project’s state. Importing an Azure AD Application, for example, looks like this:

    terraform import azuread_application.myapp /applications/<object-id>

    It is worth noting that the provider is looking for the object-id, not the client ID. The provider documentation has information as to which ID each resource uses for import.

    More importantly, Applications and Service Principals are different resources in Azure AD, even though they map pretty much one-to-one. To import a Service Principal, you run a similar command:

    terraform import azuread_service_principal.myprincipal <sp-id>

    But where is the service principal’s ID? I had to go to the Azure CLI to get that info:

    az ad sp list --display-name myappname

    That command returns JSON; I grabbed the id value from the output and used it for the import.

    From here, I ran a terraform plan to see what was going to be changed. I took a look at the differences, and even added some properties to the terraform files to maintain consistency between the app and the existing state. I ended up with a solid project full of Terraform files that reflected my current state.

    Automating with Azure DevOps

    There are a few extensions available to add Terraform tasks to Azure DevOps. Sadly, most rely on "standard" configurations for authentication against the backends. Since I'm using an S3-compatible backend that isn't actually S3, I had difficulty getting those extensions to function correctly.

    As the Terraform CLI is installed on my build agent, though, I only needed to run my commands from a script. I created an ADO template pipeline (planning for expansion) and extended it to create the pipeline.

    All of the environment variables in the template are reflected in the variable groups defined by the extending pipeline. If a variable is not defined, it is simply blank. That's why you will see the AZDO_ environment variables in the template, but not in the variable groups for the Azure AD provisioning.

    Stretch: Adding Hashicorp Vault

    Adding HC Vault support was somewhat trivial, but another exercise in authentication. I wanted to use AppRole authentication for this, so I followed the vault provider’s instructions and added additional configuration to my provider. Note that this setup requires additional variables that now need to be set whenever I do a plan or import.
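
    For reference, a sketch of the AppRole configuration, assuming the role ID and secret ID arrive as Terraform variables (which is where those additional values come from):

    provider "vault" {
      # address can also come from the VAULT_ADDR environment variable
      address = var.vault_address

      auth_login {
        path = "auth/approle/login"

        parameters = {
          role_id   = var.vault_role_id
          secret_id = var.vault_secret_id
        }
      }
    }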

    Once that was done, I had access to read and write values in Vault. I started by storing my application passwords at a new secrets path in Vault. This allows me to have application passwords that rotate weekly, which is a nice security feature. Unfortunately, the rest of my infrastructure isn't quite set up to handle such change. At least, not yet.
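
    As a rough sketch of the shape this takes (the application, mount, and path names are hypothetical, and the weekly rotation leans on the hashicorp/time provider's time_rotating resource):

    resource "time_rotating" "weekly" {
      rotation_days = 7
    }

    resource "azuread_application_password" "grafana" {
      # application_id in azuread provider v3; older versions use application_object_id
      application_id = azuread_application.grafana.id # hypothetical application

      rotate_when_changed = {
        rotation = time_rotating.weekly.id
      }
    }

    resource "vault_kv_secret_v2" "grafana_sso" {
      mount = "secret" # assumes a KV v2 engine mounted at "secret"
      name  = "grafana/azuread"

      data_json = jsonencode({
        client_secret = azuread_application_password.grafana.value
      })
    }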

  • You really should get that documented…

    One of the most important responsibilities of a software engineer/architect is documenting what they have done. While it is great that they solved the problem, that solution will become a problem for someone else if it has not been well documented. Truthfully, I have caused my own problems when revisiting old, poorly documented code.

    Documentation Generation

    Most languages today have tools that will extract code comments and turn them into formatted content. I used mkdocs to generate documentation for my simple Python monitoring tool.

    Why generate from the code? The simple reason is, if the documentation lives within the code, then it is more likely to be updated. Requiring a developer to jump into another repository or project takes time and can be a forgotten step in development. The more automatic your documentation is, the more likely it is to get updated.

    Infrastructure as Code (IaC) is no different. Lately I have been spending some time in Terraform, and sought out a solution for documenting Terraform. terraform-docs to the rescue!

    Documenting Terraform with terraform-docs

    The Getting Started guides for terraform-docs are pretty straightforward and let you generate content in a variety of formats. All I really wanted was a simple README.md file for my projects and any sub-modules, so I left the defaults pretty much as-is.

    I typically structure my repositories with a main terraform folder, and sub-modules under that if needed. Without any additional configuration, this command worked like a charm:

    terraform-docs markdown document --output-file README.md .\terraform\ --recursive

    It generated a README.md file for my main module and all sub-modules. I played a little with configuration, mostly to set up default values in the configuration YAML file so that others could run a simpler command:

    terraform-docs --output-file README.md .\terraform\ --recursive

    Next up, I will configure a pre-commit hook in my repositories to run this command so that the documents are always up to date.

  • Pack, pack, pack, they call him the Packer….

    Through sheer happenstance I came across a posting for The Jaggerz playing near me and was taken back to my first time hearing "The Rapper." I happened to go to school with one of the members' kids, which made it all the more fun to reminisce.

    But I digress. I spent time a while back getting Packer running at home to take care of some of my machine provisioning. At work, I have been looking for an automated mechanism to keep some of our build agents up to date, so I revisited this and came up with a plan involving Packer and Terraform.

    The Problem

    My current problem centers around the need to update our machine images weekly while still using Terraform to manage our infrastructure. In the case of Azure DevOps, we can provision VM Scale Sets and assign those Scale Sets to an Azure DevOps agent pool. But when I want to update that image, I can do it in two different ways:

    1. Using Azure CLI, I can update the Scale Set directly.
    2. I can modify the Terraform repository to update the image and then re-run Terraform.

    Now, #1 sounds easy, right? Run the command and I'm done. But it defeats the purpose of Terraform, which is to maintain infrastructure as code. So, I started down path #2.

    Packer Revisit

    I previously used Packer to provision Hyper-V VMs, but the azure-arm builder is pretty similar. I was able to configure a simple Windows-based VM and get the only application I needed installed with a PowerShell script.

    One app? On a build agent? Yes, this is a very particular agent, and I didn’t want to install it everywhere, so I created a single agent image with the necessary software.

    Mind you, I have been using the runner-images Packer projects to build my Ubuntu agent at home, and we use them to build both Windows and Ubuntu images at work, so, by comparison, my project is wee tiny. But it gives me a good platform to test. So I put a small repository together with a basic template and a PowerShell script to install my application, and it was time to build.
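
    For reference, a trimmed-down sketch of what such a template looks like (the image SKU, VM size, and script name are assumptions; the credential variables pick up defaults from the environment, as shown further down):

    packer {
      required_plugins {
        azure = {
          source  = "github.com/hashicorp/azure"
          version = "~> 2" # version constraint is an assumption
        }
      }
    }

    source "azure-arm" "windows2022" {
      client_id       = var.client_id
      client_secret   = var.client_secret
      subscription_id = var.subscription_id
      tenant_id       = var.tenant_id

      managed_image_name                = var.vm_name
      managed_image_resource_group_name = var.image_resource_group

      os_type         = "Windows"
      image_publisher = "MicrosoftWindowsServer"
      image_offer     = "WindowsServer"
      image_sku       = "2022-datacenter-azure-edition"
      location        = var.location
      vm_size         = "Standard_D4s_v5"

      communicator   = "winrm"
      winrm_use_ssl  = true
      winrm_insecure = true
      winrm_timeout  = "10m"
      winrm_username = "packer"
    }

    build {
      sources = ["source.azure-arm.windows2022"]

      provisioner "powershell" {
        script = "install-app.ps1" # hypothetical install script
      }
    }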

    Creating the Build Pipeline

    My build process should be, for all intents and purposes, one step that runs the packer build command, which creates the image in Azure. I found the PackerBuild@1 task and thought my job was done. It would seem that the Azure DevOps task hasn't kept up with the times; either that, or Packer's CLI needs help.

    I wanted to use the PackerBuild@1 task to take advantage of the service connection. I figured, if I could run the task with a service connection, I wouldn't have to store service principal credentials in a variable library. As it turns out… well, I would have to do that anyway.

    When I tried to run the task, I got an error that “packer fix only supports json.” My template is in HCL format, and everything I have seen suggests that Packer would rather move to HCL. Not to be beaten, I looked at the code for the task to see if I could skip the fix step.

    Not only could I not skip that step, but when I dug into the task, I noticed that I wouldn’t be able to use the service connection parameter with a custom template. So with that, my dreams of using a fancy task went out the door.

    Plan B? Use Packer's ability to grab environment variables as default values, and set the environment variables in a PowerShell script before I run the Packer build. It is not super pretty, but it works.

    - pwsh: |
        # Expose the service principal credentials as ARM_* environment variables;
        # the Packer template reads them as variable defaults via env()
        $env:ARM_CLIENT_ID = "$(azure-client-id)"
        $env:ARM_CLIENT_SECRET = "$(azure-client-secret)"
        $env:ARM_SUBSCRIPTION_ID = "$(azure-subscription-id)"
        $env:ARM_TENANT_ID = "$(azure-tenant-id)"
        Invoke-Expression "& packer build --var-file values.pkrvars.hcl -var vm_name=vm-image-$(Build.BuildNumber) windows2022.pkr.hcl"
      displayName: Build Packer
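
    On the Packer side, the variables simply default to those environment variables via Packer's env() function, something like:

    variable "client_id" {
      type    = string
      default = env("ARM_CLIENT_ID")
    }

    variable "client_secret" {
      type      = string
      sensitive = true
      default   = env("ARM_CLIENT_SECRET")
    }

    variable "subscription_id" {
      type    = string
      default = env("ARM_SUBSCRIPTION_ID")
    }

    variable "tenant_id" {
      type    = string
      default = env("ARM_TENANT_ID")
    }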

    On To Terraform!

    The next step was terraforming the VM Scale Set. If you are familiar with Terraform, the VM Scale Set resource in the AzureRM provider is pretty easy to use. I used the Windows VM Scale Set, as my agents will be Windows based. The only “trick” is finding the image you created, but, thankfully, that can be done by name using a data block.

    data "azurerm_image" "image" {
      name                = var.image_name
      resource_group_name = data.azurerm_resource_group.vmss_group.name
    }

    From there, just set source_image_id to data.azurerm_image.image.id, and you’re good. Why look this up by name? It makes automation very easy.
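
    A rough sketch of the scale set itself, with hypothetical names, sizing, and networking (the admin password and subnet are assumed to come in as variables):

    resource "azurerm_windows_virtual_machine_scale_set" "agents" {
      name                = "vmss-build-agents" # hypothetical name
      resource_group_name = data.azurerm_resource_group.vmss_group.name
      location            = data.azurerm_resource_group.vmss_group.location
      sku                 = "Standard_D4s_v5"
      instances           = 2

      admin_username = "agentadmin"
      admin_password = var.admin_password

      # Point the scale set at the Packer-built image found by the data block above
      source_image_id = data.azurerm_image.image.id

      os_disk {
        caching              = "ReadWrite"
        storage_account_type = "Standard_LRS"
      }

      network_interface {
        name    = "agents-nic"
        primary = true

        ip_configuration {
          name      = "internal"
          primary   = true
          subnet_id = var.subnet_id
        }
      }

      # The agent pool manages the instances, so keep the scale set simple (assumption)
      overprovision          = false
      single_placement_group = false
    }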

    Gluing the two together

    So I have a pipeline that builds an image, and I have another pipeline that executes the Terraform plan/apply steps. The latter is triggered on a commit to main in the Terraform repository, so how can I trigger a new build?

    All I really need to do is “reach in” to the Terraform repository, update the variable file with the new image name, and commit it. This can be automated, and I spent a lot of time doing just that as part of implementing our GitOps workflow. In fact, as I type this, I realize that I probably owe a post or two on how exactly we have done that. But, using some scripted git commands, it is pretty easy.

    So, my Packer build pipeline will check out the Terraform repository, change the image name in the variable file, and commit. This is where the image name is important: Packer doesn't spit out the Azure image ID (at least, not that I saw), so having a known name makes it easy to tell Terraform the new image name and let it look up the ID.

    What’s next?

    This was admittedly pretty easy, but only because I have been using Packer and Terraform for some time now. The learning curve is steep, but as I look across our portfolio, I can see areas where these types of practices can help us by allowing us to build fresh machine images on a regular cadence, and stop treating our servers as pets. I hope to document some of this for our internal teams and start driving them down a path of better deployment.