Web Analytics Revisited | Amadeus Maximilian

A little over three years ago, I set up my very first self-hosted analytics service to collect a bit of data on how many people were actually using my apps. Back then, I went with Umami, a free and open source web analytics service that can be self-hosted.

Unfortunately, as time went on, I was struggling to build each new version on my hosting provider of choice, Uberspace, because the process was simply using too much RAM. So after my instance finally broke in the middle of February, I started looking for alternatives in earnest.

Why track in the first place?

When I turned to Mastodon to ask around for privacy-respecting analytics services, one of the most common answers was “just don’t use any analytics”—and I agree, that’s certainly the least invasive option.

It’s also the least useful, at least in some cases. You see, while I do agree that knowing whether this blog is read by a thousand people or just a single person really doesn’t matter, knowing how many people regularly use an app I’m planning on discontinuing does.

It’s also important for me to know which features within apps people rely on, so I can prioritise fixing bugs and adding new features. I have very limited time and a lot of ideas, so having an inkling of knowledge about what to focus on goes a long way.

And last, but certainly not least, I’ve been trying to get more and more people to build an online presence away from the big (social media) companies, so they can stay in control of their content. These people are used to the dopamine hits they get from the various counters in their favourite apps, be it likes or views.

For them, being able to open a nice-looking dashboard and see their popularity graphed before them is surprisingly essential. They don’t really care about screen size statistics and bounce rates; they might not even understand what the latter means. They do get a kick out of knowing that a hundred people looked at their newest post, though.

So being able to provide them that experience alongside their new blog or website matters and can improve their willingness to free themselves from their corporate overlords.

Looking at the Options

With that in mind, I hope it’s clear that I’m not looking for huge invasive tracking with heat maps and an abundance of detailed metrics. A good analytics service for my use case should meet the following requirements:

It should be self-hostable
It should collect minimal data
It should be open source
It should provide a nice UI for non-technical users
It should be easy to manage and maintain

These parameters already disqualify a lot of the commercial options and solutions like Matomo and its derivatives, as those are built to replace Google Analytics and collect tons of metrics beyond the basics.

That leaves a world of “simple” analytics which have sprung up in recent years. Names like “Plausible”, “Fathom”, “Simple Analytics” all offer similar UIs and features that are more in line with my requirements. Many of these commercial products also have self-hostable versions, which come with some caveats. So I kept looking.

GoatCounter

One of the services which was recommended to me on Mastodon was GoatCounter, a simple analytics service built in Go. There are pre-built binaries available, so self-hosting it was an absolute breeze and welcome respite after my struggles with Umami.

I also liked its approach to tracking, with a focus on minimal data collection and privacy at the core.

Unfortunately, opening up my freshly hosted instance was where my positive experience ended. Despite reading about the design philosophy beforehand, I just couldn’t get accustomed to the way the application looks and behaves.

Options are often cryptic or in unintuitive places, and the colours and placement of UI elements remind me of days long gone by. Granted, a lot of this is a matter of personal taste, but considering I wanted to hand things off to non-developers eventually, it was simply too hard to use and enjoy.

What I found worse, however, is how the application handles multiple different websites. You either have to track everything under a single domain, using prefixes on the paths to distinguish between different projects, or have to create and register a new subdomain pointed at the same instance of GoatCounter for each project.

Depending on how you’re self-hosting the app, that might be doable. In my case, however, that would’ve meant not only creating tons of subdomains but also registering them and adding them to my web server configuration so they would be routed to the right place.
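To make the single-domain variant a bit more concrete, here is a minimal sketch of what prefixing paths per project could look like. The helper name `prefixedPath` and the project names are made up for illustration; how the resulting path is handed to the tracking script depends on GoatCounter’s configuration and isn’t shown here.

```javascript
// Hypothetical helper: prepend a project name to a path so that several
// sites can be told apart while sharing one tracked domain.
function prefixedPath(project, path) {
  // Strip leading and trailing slashes from the project name.
  const cleanProject = project.replace(/^\/+|\/+$/g, '');
  // Make sure the page path starts with exactly one slash.
  const cleanPath = path.startsWith('/') ? path : `/${path}`;
  return `/${cleanProject}${cleanPath}`;
}

console.log(prefixedPath('my-app', '/settings')); // "/my-app/settings"
console.log(prefixedPath('/blog/', 'posts/1'));   // "/blog/posts/1"
```

It works, but every dashboard view then mixes all projects together unless you filter by prefix, which is part of why this approach didn’t appeal to me.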

Despite this, I briefly entertained the idea of building an alternative frontend for GoatCounter, powered by its API. That way, I could make it look and feel exactly the way I wanted. Sadly, that plan was nipped in the bud as well, because I had to find out that there was no way to properly enable CORS, making the API effectively useless from a browser.

Ackee, Counter.dev

There were two other services on my list which I considered. One of them was Ackee, a fresh take on analytics I had already played with a couple of years back. I still enjoy its approach and look-and-feel, but it doesn’t support multiple users and hasn’t been updated in years.

Counter.dev looks equally unmaintained, despite seeming like a very cool option. Its design is playful and friendly, appealing to the non-technical people I showed it to. It’s also built in Go and thus should be easily self-hostable, but in the end I decided against trying it because the last commit to the self-hosted version was two years ago.

Plausible & Co.

Many other services I came across during my research, including Plausible, my go-to suggestion for commercial clients, also have the option to be self-hosted. However, they either severely reduce the feature set of the open-source version, as is the case with Fathom, or only provide a Docker image and have quite steep hardware requirements, as seen with Plausible.

Unfortunately, Uberspace doesn’t support Docker, so I couldn’t even try these other options. In any case, I tend to think that requiring a beefy CPU and more than 2 GB of RAM just for a simple analytics service is a bit steep.

Full Circle

In the end, I returned to Umami. It is being very actively developed, gaining a lot of additional tracking features. This makes me worry that it’ll keep ballooning and eventually end up growing way too bloated and complex, but for now it’s the best option for my use case.

It’s simple, can be used to collect very minimal data, is fully open source, and has a nicely designed UI that can be understood by anyone. In theory, it’s also reasonably easy to self-host and maintain. In practice, I don’t understand why a simple web application can use more than 1 GB of RAM during build, but I guess that’s a symptom of using Next.js and React.

So instead of trying to find (or foolishly trying to build) another analytics service that was just like Umami, but used fewer resources during build, I decided to instead spend some time to optimise the build process for Umami on Uberspace.

Like I had initially presumed, the issue is less with Umami itself but rather with Next.js, so I was able to find some pointers in the right direction. Apparently, Next.js tries to spawn as many worker threads as possible so it can generate pages in parallel. I’m guessing this happens based on the number of (virtual) CPU cores present on a device, which on a shared server could be in the hundreds. Each of these threads needs to read data and that accumulates in RAM, causing it to exceed the account limit, at which point the parent process gets killed and the build aborts.

This spawning of worker threads is also the reason the officially suggested solution to limit RAM usage with NODE_OPTIONS=--max_old_space_size=<Limit> doesn’t seem to work, as each of the worker threads would get access to up to <Limit> amount of system memory. This is all theory, of course, built on my limited understanding of Next.js, but it makes sense in my head.
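To illustrate that theory with some rough arithmetic, here is a small Node.js sketch. It assumes, as guessed above, that one worker is spawned per reported CPU core and that each worker can grow up to the configured limit on its own. The helper `worstCaseHeapMb` and the 512 MB value are purely illustrative, and real memory usage also includes more than just the old-space heap.

```javascript
const os = require('os');

// Hypothetical helper: worst-case heap total if every worker is allowed
// to grow up to the same per-process limit (in MB).
function worstCaseHeapMb(workerCount, maxOldSpaceMb) {
  return workerCount * maxOldSpaceMb;
}

// Number of logical cores the OS reports; on a shared host this can be large.
const cores = Math.max(1, os.cpus().length);
const limitMb = 512; // example value for --max_old_space_size

console.log(
  `${cores} workers x ${limitMb} MB = up to ${worstCaseHeapMb(cores, limitMb)} MB`
);
```

Even a modest per-process limit adds up quickly once it is multiplied by dozens of cores, which would explain why the limit alone didn’t keep the build under the account’s RAM cap.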

Especially considering the solution I found to get Umami to build on Uberspace involved adding two poorly documented properties to next.config.js:

experimental.cpus: 1
experimental.workerThreads: false
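In next.config.js, these two properties sit under the experimental key. The following is only a minimal sketch of that fragment: Umami’s real config file contains many more settings, which would need to be kept and merged rather than replaced, and experimental options can change between Next.js versions.

```javascript
// Minimal sketch of the relevant fragment of next.config.js.
// Assumption: a CommonJS config; merge this into the existing config
// object instead of overwriting it.
const nextConfig = {
  experimental: {
    cpus: 1,              // limit the build to a single worker
    workerThreads: false, // don't use worker threads for the build
  },
};

module.exports = nextConfig;
```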

Once these options were in place, I was able to build the app without issues and didn’t even notice any substantial increase in build-time. Additionally, I also experimented with experimental.webpackMemoryOptimizations, but that didn’t seem to have an effect at all.

Admittedly, this solution has worked exactly once so far, but it also never failed. 😉 I will keep an eye on it during the next updates and if it proves to be reliable, make sure to contribute back my findings to both the Umami and Uberspace documentation.

For the time being, I’ll be enjoying a working and updated Umami instance trying to distract myself from an annoying, much more physical issue—but more on that in a different post. As always, thank you for reading! And if you have any suggestions or thoughts about analytics services, Umami, or my “fix” for Next.js builds, feel free to reach out to me on Mastodon.

I am not affiliated with Umami in any way, nor was I asked to write this article or given any form of compensation for writing it. As such, all my experiences and opinions about it are my own.