Speed. I.. am.. Speed.

I have heard the opening to the Cars movie more times than I can count, and Owen Wilson’s little monologue always sticks in my head. Tangentially, well, I recently logged in to Xfinity to check my data usage, which sent me down a path towards tracking data usage inside of my network. I learned a lot about what to do, and what not to do.

We are using how much data?

Our internet went out on Sunday. This is not a normal occurrence, so I turned off the wifi on my phone and logged in to Xfinity’s website to report the problem. Out of sheer curiosity, after reporting the downtime, I clicked on the link to see our usage.

22 TB… That’s right, 22 terabytes of data in February, and approaching 30TB for March. And there were still 12 days left in March! Clearly, something was going on.

Asking the Unifi Controller

I logged in to the Unifi controller software in the hopes of identifying the source of this traffic. I jumped into the security insights, traffic identification, and looked at the data for the last month. Not one client showed more than 25 GB of traffic in the last month. That does not match what Xfinity is showing at all.

A quick Google search lead me to a few posts that suggest that the Unifi’s automated speed test can boost your data usage, but that it doesn’t show on the Unifi. Mind you, these posts were 4+ years old, but I figured it was worth a shot. So I disabled the speed test in the Unifi controller, but would have to wait a day to see if the Xfinity numbers changed.

Fast forward a day – No change. According to Xfinity I was using something like 500GB of data per day, which is nonsense. My previous months were never higher than 2TB, so using that much data in 4 days means something is wrong.

Am I being hacked?

Thanks to “security first” being beat into me by some of my previous security-focused peers, the first thought in my head was “Am I being hacked?” I looked through logs on the various machines and clusters, trying to find where this data was coming from and why. But nothing looked odd or out of the ordinary: no extra pods running calculations, no servers consuming huge amounts of memory or CPU. So where in the world was 50TB of data coming from?

Unifi Poller to the Rescue

At this point, I remembered that I have Unifi Poller running. The poller grabs data from my Unifi controller and puts it into Mimir. I started poking around the unpoller_ metrics until I found unpoller_device_bytes_total. Looking at that value for my Unifi Security Gateway, well, I saw this graph for the last 30 days:

unpoller_device_bytes_total – Last 30 Days

The scale on the right is bytes, so, 50,000,000,000,000 bytes, or roughly 50TB. Since I am not yet collecting the client DPI information with Unifi Poller, I just traced this data back to the start of this ramp up. It turned out to be February 14th at around 12:20 pm.

GitOps for the win

Since my cluster states are stored in Git repos, any changes to the state of things are logged as commits to the repository, making it very easy to track back. Combing through my commits for 2/14 around noon, I found the offending commit in the speedtest-exporter (now you see the reference to Lightning McQueen).

In an effort to move off of the k8s-at-home charts, which are no longer being maintained, I have switch over to creating charts using Bernd Schorgers’ library chart to manage some of the simple installs. The new chart configured the ServiceMonitor to scrape every minute, which meant, well, that I was running a speed test every minute. Of every day. For a month.

To test my theory, I shut down the speedtest-exporter pod. Before my change, I was seeing 5 and 6 GB of traffic every 30 seconds. With the speed test executing hourly, I am seeing 90-150 MB of traffic every 30 seconds. Additionally, the graph is much more sporadic, which makes more sense: I would expect traffic to increase when my kids are home and watching TV, and decrease at night. What I was seeing was a constant increase over time, which is what pointed me to the speed test. So I fixed the ServiceMonitor to only scrape once an hour, and I will check my data usage in Xfinity tomorrow to see how I did.

My apologies to my neighbors and Xfinity

Breaking this down, I have been using something around 1TB of bandwidth per day over the last month. So, I apologize to my neighbors for potentially slowing everything down, and to Xfinity for running speed tests once a minute. That is not to say that I will stop running speed tests, but I will go back to testing once an hour rather than once a minute.

Additionally, I’m using my newfound knowledge of the Unifi Poller metrics to write some alerts so that I can determine if too much data is coming in and out of the network.