For the last three years, I’ve been running both client-side and server-side analytics on my site, and I’ve noticed some alarming discrepancies. I thought I was alone until Jim Nielsen posted his comparison, Comparing Data in Google and Netlify Analytics. My data story and Jim’s data story end up similar (we’re both popular in Germany?): a situation where the more data you have, the less true the data seems.

The data

Here’s my data from March 25, 2022 through April 25, 2022. The data collection methods differ like so:

  • Fathom is a client-side script served from a self-hosted, first-party subdomain (stats.daverupert.com); a minimal sketch of this pattern appears below.
  • Netlify processes raw server log data.

I think these are both great products that you should totally use over the alternatives.
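For context on why these two methods diverge, here’s a minimal sketch of the client-side pattern. This is illustrative only, not Fathom’s actual script, and the /api/event endpoint is hypothetical. Because this code runs in the browser, an ad blocker or strict privacy setting can stop it from ever firing, while the page request itself always lands in the server logs that Netlify counts.

```ts
// Illustrative sketch of a client-side pageview beacon (not Fathom's real script).
function trackPageview(): void {
  const payload = JSON.stringify({
    path: location.pathname,
    referrer: document.referrer,
  });
  // sendBeacon queues the request even if the visitor navigates away immediately.
  // The endpoint is hypothetical; a real script would point at its vendor's API.
  navigator.sendBeacon("https://stats.daverupert.com/api/event", payload);
}

trackPageview();
```

If that script is blocked, the visit simply never reaches Fathom. Server logs have no equivalent blind spot, which is one honest reason the numbers below disagree.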

Total visits

| Metric | Fathom (Client-side) | Netlify (Server-side) |
| --- | --- | --- |
| Unique visitors | 12,300 | 26,085 |
| Pageviews | 18,000 | 333,801 |

These couldn’t be more different: Netlify reports 2.1× the visitors and 18.5× the pageviews. Obviously the big numbers are the real numbers. 😉

Top content

| Rank | Fathom (Client-side) | Views | Netlify (Server-side) | Views |
| --- | --- | --- | --- | --- |
| 1 | /2022/04/7-web-component-tricks/ | 4,483 | /2014/01/4k-rwd/ | 101,421 |
| 2 | / | 3,560 | /2013/04/responsive-deliverables/ | 65,491 |
| 3 | /2022/04/what-if-everything-got-better/ | 1,305 | / | 62,486 |
| 4 | /about/ | 665 | /offline | 17,202 |
| 5 | /2022/04/what-if-everything-got-better/?ref=sidebar | 467 | /2014/01/4K-RWD/ | 16,686 |
| 6 | /archive/ | 422 | /2022/04/7-web-component-tricks/ | 8,433 |
| 7 | /2012/05/making-video-js-fluid-for-rwd/ | 331 | /2022/04/what-if-everything-got-better/ | 7,473 |
| 8 | /2022/04/my-weekly-engineering-report/ | 292 | /archive/ | 2,106 |
| 9 | /2022/04/vibe-check-15/ | 284 | /about/ | 1,605 |
| 10 | /2022/04/productivity-sniped-by-para/ | 265 | /2022/04/vibe-check-15/ | 1,412 |

As much as I’d love to believe 100,000+ people show up each month to read my 2014 banger about the upcoming advent of 4K displays and its impact on responsive design… I highly doubt that. Even stripping those outliers from the results in the table above, the top 10 posts from each service return wildly different traffic numbers.

Doing some napkin math comparisons…

| URL | Discrepancy (Netlify ÷ Fathom) |
| --- | --- |
| / | 17.5× |
| /about/ | 2.4× |
| /archive/ | 4.9× |
| /2022/04/7-web-component-tricks/ | 1.8× |
| /2022/04/what-if-everything-got-better/ | 4.2× |
| /2022/04/vibe-check-15/ | 4.9× |

The discrepancies are not consistent at all. If the ratios were uniform, we might be able to say something like “75% of people use ad blockers” and believe the large numbers, but it’s not so cut and dried.
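For the curious, here’s that napkin math spelled out as a quick sketch. All figures are copied from the tables above; the one wrinkle is that the 4.2× row only works out if you sum Fathom’s two entries for the what-if-everything-got-better post (the bare URL plus the ?ref=sidebar variant), and the ratios are truncated rather than rounded.

```ts
// Napkin math: server-side (Netlify) pageviews ÷ client-side (Fathom) pageviews.
const pageviews: Record<string, { fathom: number; netlify: number }> = {
  "/": { fathom: 3_560, netlify: 62_486 },
  "/about/": { fathom: 665, netlify: 1_605 },
  "/archive/": { fathom: 422, netlify: 2_106 },
  "/2022/04/7-web-component-tricks/": { fathom: 4_483, netlify: 8_433 },
  // Fathom splits this post across two URLs (plain + ?ref=sidebar), so sum them.
  "/2022/04/what-if-everything-got-better/": { fathom: 1_305 + 467, netlify: 7_473 },
  "/2022/04/vibe-check-15/": { fathom: 284, netlify: 1_412 },
};

for (const [url, { fathom, netlify }] of Object.entries(pageviews)) {
  // Truncate (not round) to one decimal place, matching the table above.
  const ratio = Math.trunc((netlify / fathom) * 10) / 10;
  console.log(`${url} → ${ratio}×`);
}
```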

Top locations

| Country | Pageviews (Netlify) |
| --- | --- |
| Germany | 75,359 |
| Finland | 70,605 |
| United States | 60,031 |
| Canada | 45,305 |
| China | 28,092 |
| France | 13,345 |
| UK | 5,960 |

My version of Fathom doesn’t give me country data, but I’m including my Netlify data here, which I’ll use to make a point later. You may have noticed something is a little fishy about this data.

Before jumping to any conclusions, I wanted to check this data with its sources.

Feedback from Fathom and Netlify

I tweeted about this phenomenon in 2019…

After I tweeted about it, both Netlify and Fathom reached out to me. It’s a bit of a he-said/she-said situation about whose data is most correct, but some high-level notes add value to the conversation:

  • Fathom’s point of view was that I was on the outdated community product and that the SaaS product is much better and more accurate now. I don’t doubt the paid product is better, but I do doubt it would close the 18× gap on the Netlify numbers.
  • Netlify’s take was pretty matter-of-fact: “Yup. This reflects the log data.” They’re not wrong; raw server logs don’t lie. Their investigation into the matter turned up two anomalies I can’t solve:
    • 80% of the hits to my Responsive Deliverables post contain a utm_campaign= query param, which tells me people love my footnote about SMACSS. Some bot (in Germany?) must be hammering that campaign URL.
    • There’s a Google Cloud Uptime Monitor health check pointed at my site. This is helpful, except that I never set one up! Another rogue bot (in Germany?) is pinging my old content thousands of times a day. A rough log-scanning sketch for spotting both anomalies follows this list.
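If you have raw logs on hand, both anomalies are checkable with a few lines of scripting. This is a sketch under assumptions: one request per line in Combined Log Format, a hypothetical local access.log file (Netlify doesn’t hand you this file directly), and Google’s uptime checker identified by the GoogleStackdriverMonitoring substring it is known to send in its User-Agent header.

```ts
// Sketch: scan a raw access log for the two anomalies Netlify flagged.
import { readFileSync } from "node:fs";

const lines = readFileSync("access.log", "utf8").split("\n").filter(Boolean);

// Anomaly 1: what share of hits to the Responsive Deliverables post
// carry a utm_campaign query param? (Netlify's answer: ~80%.)
const post = lines.filter((l) => l.includes("/2013/04/responsive-deliverables/"));
const campaign = post.filter((l) => l.includes("utm_campaign="));
console.log(`utm_campaign share: ${((campaign.length / post.length) * 100).toFixed(1)}%`);

// Anomaly 2: hits from Google Cloud's uptime checker, which announces
// itself in the User-Agent field.
const uptime = lines.filter((l) => l.includes("GoogleStackdriverMonitoring"));
console.log(`uptime-check hits: ${uptime.length}`);
```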

Fathom’s numbers seem more believable, but Netlify’s data eliminates any ad-blocking questions. My one small takeaway is that it’d be nice if I could filter out known bot traffic from Netlify’s reporting. That might shrink the delta to an acceptable degree.
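As a thought experiment, the filtering I’m wishing for could be as simple as dropping any log line that matches a known-bot list and recounting. The list below is a tiny illustrative sample, not a real bot registry, and matching against the whole log line is crude, but it shows the shape of the idea.

```ts
// Sketch: the kind of bot filtering I wish Netlify's reporting offered.
import { readFileSync } from "node:fs";

// A tiny illustrative sample; real bot lists run to thousands of entries.
const KNOWN_BOTS = ["bot", "crawler", "spider", "googlestackdrivermonitoring"];

// Crude but serviceable: the User-Agent is part of each line in
// Combined Log Format, so substring-match the whole line.
const isBot = (line: string): boolean =>
  KNOWN_BOTS.some((bot) => line.toLowerCase().includes(bot));

const lines = readFileSync("access.log", "utf8").split("\n").filter(Boolean);
const humans = lines.filter((line) => !isBot(line));

console.log(`raw hits: ${lines.length}, after bot filtering: ${humans.length}`);
```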

Big conclusions about data-driven decision making

The data tells me somewhere between 12k and 26k visitors come to my site each month and open anywhere between 18k and 333k pages. If I stare at the larger numbers, they’re impressive but not actionable. Comparing 2019 to 2022, Netlify shows a 25% drop in pageviews and a 41% drop in uniques, while Fathom shows almost the inverse.

This isn’t a judgment on Netlify or Fathom; they’re both incredible services with incredible analytics products. But it leaves me wondering: who should I believe?

My trust level in analytics products is low

My trust in analytics data is at an all-time low. Browser-level privacy improvements, ad blockers, and bot traffic have obliterated its integrity. Unless I’m a certified statistician, I’m unable to accurately assess the traffic at my own breakfast table, much less my website.

I used to believe people were honest, but after seeing all this bot traffic on my little site, I can see how someone might point bots at their own site to inflate their traffic stats and gain perceived popularity, which they then pass on to advertisers. My ad vendor’s FAQ page has a note that says “Bots account for as much as 80% of website traffic.” Yikes.

Data-driven decisions can be a blindspot

If I believed Netlify’s data as gospel, I would be a blogging sensation! I should quit my job and blog full-time. Four million eyeballs per year. Following Netlify’s data to its logical extreme, I should be structuring my new blogging business like so:

  • Writing a lot of content about responsive design and 4K monitors.
  • Investing a lot in German and Finnish localized content.
  • I should be in full panic mode about my 41% drop in uniques.

I love Germany (and I hear Finland is great), but I suspect they’re not my #1 and #2 markets. And those vintage 2014 posts are okay, but should not be the core of my content strategy.

If I, or some hypothetical manager, put too much stock into these metrics, I could see it causing a firestorm of reprioritization based on bot traffic¹. We’d be chasing the tail of a health check bot somewhere in Germany.

The point is, if you base your business solely on quantitative data, you can fall into the McNamara Fallacy: trusting only what can be measured and dismissing everything that can’t.

You need more than just metrics

Probably old advice, but you need a mixture of quantitative AND qualitative data to make good decisions. In Just Enough Research, Erika Hall drives home a point that I think applies here:

You want to know what is happening (qualitative), how much it’s happening (quantitative) and why it’s happening (qualitative).

If your goal is to grow your product, you need a mixture of research (academic and applied), user tests, A/B tests, analytics, performance audits², accessibility audits³, and a plethora of other considerations. Even if you have all those processes in place, I still think you need a human — a sage — who can read the tea leaves and interpret the data in a way that the business understands.

If your goal is not to grow your site… maybe ignorance is bliss.

  1. Maybe pointing a bot army at a competitor’s site to skew metrics is some next level Art of War shit.

  2. Every 100ms of latency costs Amazon 1% of sales (source) /via WPO Stats

  3. 1 in 4 adults in the United States have a disability (source) /via A11y Project