Category: Home Lab

  • Simple Site Monitoring with Raspberry Pi and Python

    My off-hours time this week has been consumed by writing some Python scripts to help monitor uptime for some of my sites.

    Build or Buy?

    At this point in my career, “build or buy” is a question I ask more often than not. As a software engineer, there is no shortage of open source and commercial solutions for almost any imaginable task, and website monitoring is no different. Tools such as StatusCake, Pingdom, and LogicMonitor offer hosted platforms, while tools like Nagios and PRTG offer on-premises installations. There is so much to choose from that it’s hard to decide.

    I had a few simple requirements:

    • Simple at first, but expandable as needed.
    • Runs on my own network so that I can monitor sites that are not available outside of my firewall.
    • Since most of my servers are virtual machines consolidated onto a single lab server, it does not make much sense to put monitoring on that same server. I needed something I could run easily on little to no power.

    Build it is!

    I own a few Raspberry Pis, but the Model 3B and 4B are currently in use. The lone unused Pi is an old Model B (i.e., a Model 1B), so installing something like Nagios on it would leave me with something barely usable when all was said and done. Given that Python is right at home on the Raspberry Pi, I thought I would dust off my “language learning skills” and figure out how to make something useful.

    As I started, though, I remembered my free version of Atlassian’s Statuspage. Although the free version limits the number of subscribers and does not include text subscriptions, for my usage it’s perfect. And, near and dear to my developer heart, it has a very well-defined set of APIs for managing statuses and incidents.

    So, with Python and some additional modules, I created a project that lets me make a quick request against a desired website. If the website is down, the Statuspage status for the associated component is changed and an incident is created. If/when the site comes back up, any open incidents associated with that component are resolved.

    Voilà!

    After a few evening hours of tinkering, I have the scripts doing some basic work. For now, a cron job executes the script every 5 minutes, and if a site goes down it is reported to my Statuspage site.
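
    For the curious, the heart of the script is little more than a requests call plus a couple of calls against the Statuspage v1 REST API. Here is a minimal sketch of the idea — the page ID, component ID, API key, site URL, and script name are all placeholders, and the real scripts on GitHub also take care of resolving any open incidents once a site recovers:

    import requests

    # Placeholders – substitute your own Statuspage page ID, component ID, and API key.
    STATUSPAGE_API = "https://api.statuspage.io/v1"
    PAGE_ID = "your-page-id"
    COMPONENT_ID = "your-component-id"
    HEADERS = {"Authorization": "OAuth your-api-key"}

    SITE_URL = "https://www.example.com"


    def site_is_up(url):
        """Return True if the site answers a quick request without a server error."""
        try:
            return requests.get(url, timeout=10).status_code < 500
        except requests.RequestException:
            return False


    def set_component_status(status):
        """Update the Statuspage component, e.g. 'operational' or 'major_outage'."""
        requests.patch(
            f"{STATUSPAGE_API}/pages/{PAGE_ID}/components/{COMPONENT_ID}",
            headers=HEADERS,
            json={"component": {"status": status}},
            timeout=10,
        )


    def open_incident(name):
        """Create an incident tied to the monitored component."""
        requests.post(
            f"{STATUSPAGE_API}/pages/{PAGE_ID}/incidents",
            headers=HEADERS,
            json={
                "incident": {
                    "name": name,
                    "status": "investigating",
                    "component_ids": [COMPONENT_ID],
                }
            },
            timeout=10,
        )


    if __name__ == "__main__":
        if site_is_up(SITE_URL):
            set_component_status("operational")
        else:
            set_component_status("major_outage")
            open_incident(f"{SITE_URL} appears to be down")

    From there, a crontab entry along the lines of */5 * * * * python3 check_site.py handles the every-five-minutes schedule.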

    Long term, I plan on adding support for more in-depth checks of my own projects, which utilize .NET’s HealthChecks namespace to report service health automatically. I may also look into setting up the scripts as a service running on the Pi.

    If you are interested, the code is shared on GitHub.

  • Hardening your Kubernetes Cluster: Don’t run as root!

    People sometimes ask me why I do not read for pleasure. As my career entails ingesting the NSA/CISA technical report on Kubernetes Hardening Guidance and translating it into actionable material, I ask that you let me enjoy hobbies that do not involve the written word.

    The NSA/CISA technical report on Kubernetes Hardening Guidance came out on Tuesday, August 3. Quite coincidentally, my colleagues and I had just started asking questions about our internal standards for securing Kubernetes clusters for production. Coupled with my current home lab experimentation, I figured it was a good idea to read through this document and see what I could do to secure my lab.

    Hopefully, I will get through this document and write up how I’ve applied everything to my home lab (or, at least, to the production cluster in my home lab). For now, though, I thought it prudent to look at the Pod Security section. And, as one might expect, the first recommendation is…

    Don’t run as root!

    For as long as I can remember working in Linux, not running as root has literally been “step one.” So it amazes me that, by default, containers are configured to run as root. All of my home-built containers are pretty much the same: simple Dockerfiles that copy the results of an external dotnet publish command into the container and then run the dotnet entry point.

    My original Dockerfiles looked like this:

    FROM mcr.microsoft.com/dotnet/aspnet:5.0-focal AS base
    WORKDIR /app
    COPY . /app
    
    EXPOSE 80
    ENTRYPOINT ["dotnet", "my.dotnet.dll"]

    With some StackOverflow assistance, they now look something like this:

    FROM mcr.microsoft.com/dotnet/aspnet:5.0-focal AS base
    WORKDIR /app
    COPY . /app
    # Create a group and user so the container (and application) is not running as root (user 0), which is a security issue.
    RUN addgroup --system --gid 1000 mygroup \
        && adduser --system --uid 1000 --ingroup mygroup --shell /bin/sh myuser

    # Serve on port 8080; a non-root user cannot bind to port 80.
    ENV ASPNETCORE_URLS=http://+:8080
    EXPOSE 8080

    # Tell Docker that all subsequent commands should run as the new user; the numeric UID must be used here.
    USER 1000
    
    ENTRYPOINT ["dotnet", "my.dotnet.dll"]

    What’s with the port change?

    The Dockerfile changes are pretty simple: add the commands to create a new group and a new user, and, using the USER command, tell Docker to execute everything that follows as that user. But why change ASPNETCORE_URLS and the exposed port? A non-root user cannot bind to privileged ports (anything below 1024), so I had to move the application to port 8080. This necessitated some changes to my Helm charts and their service deployment, but, overall, the process was straightforward.

    My next steps

    When I find some spare time in the next few months, I’m going to revisit Pod Security Policies and, more specifically, their upcoming replacement: the PSP Replacement Policy. I find it amusing that the NSA/CISA released guidelines that specify usage of what is now a deprecated feature. I also find it typical of our field that our current name for the new version is, quite simply, “Pod Security Policy Replacement Policy.” I really hope they come up with a better name for that…

  • Moving the home lab to Kubernetes

    Kubernetes has become something of a standard for container orchestration. While there are certainly other options, the Kubernetes platform remains the most prevalent. With that in mind, I decided to migrate my home lab from Docker servers to Kubernetes clusters.

    Before: Docker Servers

    Long story short: my home lab has transitioned from Windows servers running IIS, to a mix of Linux and Windows containers, to Linux-only containers. The apps are containerized, but the databases still run on some SQL Server machines.

    Build and deployment are automated: builds run through Azure DevOps Pipelines and self-hosted agents (TeamCity before that), and deployments go through Octopus Deploy. Container images for my projects live in a ProGet feed.

    The Plan

    “Consolidate” (and I’ll tell you later why that is in quotes) my servers into Kubernetes clusters. It seemed an easy plan:

    • Internal K8s cluster – runs Rancher and any internal tooling (including Elastic/Kibana) that I want available on my network but not externally
    • Non-production K8s cluster – runs the test and staging environments for my *.net and *.org sites
    • Production K8s cluster – runs my *.com sites (external), including any external tooling

    I spent some time learning Packer to provision Hyper-V VMs for my clusters. Each cluster ended up with a control plane node (4 vCPU, 8 GB RAM) and two worker nodes (2 vCPU, 6 GB RAM).

    The Results

    The Kubernetes Clusters

    There was a LOT of trial and error in getting Kubernetes going, particularly with Rancher. So much, in fact, that I probably provisioned the clusters 3 or 4 times each because I felt like I messed up and wanted to do it over again.

    Initially, I tried to provision the K8s cluster manually. Yes, it worked… but RKE is nicer. So, after my manually provisioned cluster went down, I re-provisioned the internal cluster with RKE. That makes updates easier, since I have the config file.

    I had provisioned the non-production and production clusters using the Rancher GUI. However, that Rancher instance lived on the “manually provisioned” cluster, so, when it went down, I lost those cluster configurations. I currently have two clusters that show up as “imported” clusters in Rancher, which makes them harder to manage through the Rancher GUI.

    Storage

    In order to utilize persistent volume claims, I configured NFS on my Synology and installed the nfs-subdir-external-provisioner in all of my clusters. It provides a storage class that can be referenced in persistent volume claims and automatically provisions directories on the NFS share.

    Ingress

    Right now, I’m using the Nginx Ingress controller from Rancher. I haven’t played with it much, other than the basics. Perhaps more on that when I dig in.

    Current Impressions

    Rancher

    It works… but mine is flaky. I think it may be due to some resource starvation. I may try to provision a new internal cluster with better VMs and see how that works.

    I do like the deployment of clusters using RKE; however, I can see how it would be difficult to manage when more than one person is involved.

    Kubernetes

    Once it was up and running, it has been great: creating new APIs or apps and getting them running in a scalable fashion is easy, and Helm charts make deployment and updating a snap.

    That said, I would not trust myself to run this in production without a LOT more training.


  • Ubuntu and Docker…. Oh Snap!

    A few months ago, I made the decision to start building my .NET Core side projects as Linux-based containers instead of Windows-based containers.  These projects are mostly CRUD APIs, meaning none of them require Windows-based containers.  And, quite frankly, Linux is cheaper….

    Now, I had previously built out a few Ubuntu servers with the express purpose of hosting my Linux containers, and changing the Dockerfile was easy enough.  But I ran into a HUGE roadblock in trying to get my Octopus deployments to work.

    I was able to install Octopus Tentacles just fine, but I could NOT get the tentacle to authenticate to my private docker repository.  There were a number of warnings and errors around the docker-credential-helper and pass, and, in general, I was supremely frustrated.  I got to a point where I uninstalled everything but docker on the Ubuntu box, and that still didn’t work.  So, since it was a development box, I figured there would be no harm in uninstalling Docker…  And that is where things got interesting.

    When I provision my Ubuntu VMs, I typically let the Ubuntu setup install Docker.  It does this through snaps, which, as I have seen, can have some weird consequences.  One of those consequences seemed to be a strange interaction between docker login and non-interactive users.  The long and short of it: after several hours of trying to figure out which combination of docker-credential-helper packages and configurations was required, removing EVERYTHING and installing Docker via apt (and docker-compose via a straight download from GitHub) made everything work quite magically.

    While I would love nothing more than to recreate the issue and figure out why it was occurring, frankly, I do not have the time.  It was easier for me to swap my snap-installed Docker packages for apt-installed ones and move forward with my project.

  • Polyglot-on for Punishment

    I apologize for the misplaced pun, but Polyglot has been something of a nemesis of mine over the last year or so. The journey started with my Chamberlain MyQ-compatible garage door and a desire to control it through my ISY. Long story short (see here and here for more details), running Polyglot in a docker container on my VM docker hosts is more of a challenge than I’m willing to tackle. Either it’s a networking issue between the container and the ISY or I’m doing something wrong, but in any case, I have resorted to running Polyglot on an extra Raspberry Pi that I have at home.

    Why the restart? Well, I recently renovated my master bathroom, and as part of that renovation I installed RGBW LED strips controlled by a MiLight controller. There is a node server for the MiLight WiFi box I purchased, so, in addition to the Insteon switches for the lights in that bathroom, I can now control the electronics in the bathroom through my ISY.

    While it would be incredibly nice to have all this running in a docker container, I’m not at all upset that I got it running on the Pi. My next course of action will be to restart the development of my HA Kiosk app…. Yes, I know there are options for apps to control the ISY, but I want a truly customized HA experience, and for that, I’d like to write my own app.

  • Supporting Teamcity Domain Authentication

    TL;DR: TeamCity on Linux (or in a Linux Docker container) only supports SMBv1. Make sure you enable the SMB 1.0/CIFS File Sharing Support feature on your domain controllers.

    A few weeks ago, I decided it was time to upgrade my domain controllers. With spare capacity on the hypervisor, it seemed a pretty easy task. My abbreviated steps were something like this:

    1. Remove the backup DC and shut the machine down.
    2. Create a new DC, add it to the domain, and replicate.
    3. Give the new DC the master roles.
    4. Remove the old primary DC from the domain and shut it down.
    5. Create a new backup DC and add it to the domain.

    Seems easy, right? Except that, during step 4, the old primary DC essentially gave up. I was forced to remove it from the domain manually.

    Also, while I was able to change the DHCP settings to reassign DNS servers for the clients that get their IPs via DHCP, the machines with static IP addresses required more work to reset the DNS settings. But, after a bit of a struggle, I got it working.

    Except that I couldn’t log in to TeamCity using my domain credentials any more. I did some research and found that, on Linux, TeamCity only supports SMBv1, not SMBv2. So I installed the SMB 1.0/CIFS File Sharing Support feature on both domain controllers, and that fixed my authentication issues.