😳 How to lose your job in 1 dataviz mistake

👋 Hi!

Imagine this: you’re at work, and your boss drops a new dataset on your desk.

It’s straightforward—just a list of how long 400,000 people took to run a marathon.

The ask? Plot the distribution of those finishing times.

Too easy

In French, we’d say "finger in the nose" (though that doesn’t translate well into English).

Anyway, I’m sure you know how to make a histogram, and if not, I’ve written plenty of tutorials using R, Python, or JavaScript to guide you 🎉.

A few minutes later, you end up with a neat histogram that looks like an almost normal distribution:

Distribution of time spent running a marathon for 400,000 people

Job done!

Yes but!

You might think the job is done already.

But there’s a big problem here.

Can you spot it?

Take a moment to think before scrolling! Here is an image to make sure you do not see the answer right away.

I used to go surfing there when I lived in Brisbane, Australia. (It's Snapper Rocks, one of the most famous waves on the planet 🙂)

Did you miss the key story?

Shrink the bin size, and a whole new story emerges.

Suddenly, distinct spikes appear around 3:00, 3:30, and 4:00 hours.

Why?

Because these are popular time goals for marathon runners, and they push hard to finish at these times. So, it’s much more common for someone to finish around 4:00 than at, say, 4:05.

With a smaller bin size, three spikes are revealed in the histogram!
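
If you'd like to see the effect without the real dataset, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the finishing times are simulated (a bell curve centred on 4:25, in minutes), and I simply pretend that half of the runners who would finish up to 6 minutes after a 3:00, 3:30, or 4:00 goal sneak in just under it. The function names `simulate_finish_times` and `bin_counts` are made up for this example. It uses only the standard library and prints rough text histograms, so it is not how the charts in this email were built.

```python
import random
from collections import Counter

GOALS_MIN = (180, 210, 240)  # 3:00, 3:30 and 4:00, expressed in minutes


def simulate_finish_times(n, seed=0):
    """Return n made-up finishing times in minutes (NOT the real dataset)."""
    rng = random.Random(seed)
    times = []
    for _ in range(n):
        t = rng.gauss(265, 50)  # assumed bell-shaped baseline
        for goal in GOALS_MIN:
            # Assumption: half of the runners who would finish up to 6 minutes
            # after a goal push harder and finish just under it instead.
            if goal <= t < goal + 6 and rng.random() < 0.5:
                t = goal - rng.uniform(0, 3)
                break
        times.append(t)
    return times


def bin_counts(times, bin_width):
    """Count values per bin; keys are the lower edge of each bin, in minutes."""
    return Counter(int(t // bin_width) * bin_width for t in times)


times = simulate_finish_times(100_000)
coarse = bin_counts(times, 30)  # wide bins
fine = bin_counts(times, 5)     # narrow bins

# Print a rough text histogram for each bin width (one '#' per 400 runners).
for label, counts in (("30-minute bins", coarse), ("5-minute bins", fine)):
    print(label)
    for edge in sorted(counts):
        if 150 <= edge < 330:
            print(f"  {edge // 60}:{edge % 60:02d}  {'#' * (counts[edge] // 400)}")
```

With the wide bins, the pile-up just before each goal is averaged away inside a single bar. With the narrow bins, the bar that ends at 4:00 stands clearly above its neighbours. The same experiment works in any plotting library that lets you control the bin width or the number of bins.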

If you want to explore this yourself, I’ve created an interactive version of the chart where you can adjust the bin size with a slider.

Interactive version with slider

This discovery is so fascinating that even The New York Times wrote an article about it.

Avoid common pitfalls

A great way to improve your data visualization skills is by avoiding common pitfalls like this one.

I’ve put together a collection of these pitfalls and plan to grow it even further. Each one will have its own flashcard, summarizing the context.

Before I move forward, I’d love your feedback on the design and concept of the first card:

I'm about to create many dataviz flashcards like this one!

Would cards like this be useful? Do you like the design of this one? I would be so grateful if you told me what you think! 🙏🙏🙏

See you next Saturday!

Yan Holtz

Find me on X, LinkedIn, or check my Homepage

👋 By the way, here is how I can help!

Master R: Join my productive R workflow online course, already helping hundreds to excel in R, Quarto, and GitHub.
Team Training: Hire me to train your team on Data Visualization and Programming.
Engaging Talks: Book me for short, impactful talks on Data Visualization and Programming.

Check yan-holtz.com or hit reply any time! I love hearing from you.

