It's been a windy weekend here. I had to squeeze in some windsurfing plus some work on matplotlib-journey.com, so I'm pretty late for this issue 🙈.
But I couldn’t let the weekend slip by without delivering your weekly dose of dataviz tips.
This week, let's tackle a common culprit that often makes my hair stand on end:
Error bars
I'm sure you know those little lines you often see at the top of bars in a chart. They’re meant to convey uncertainty and add context to the data. Without them, the size of a bar can be misleading.
But error bars come with two major pitfalls that can make them worse than useless: actively misleading.
👻 Problem 1: Hiding the underlying distribution
Error bars only give a summary—they don’t show the full story behind the data.
Take a look at this figure. The same error bars could represent wildly different data patterns:
A low sample size.
A bimodal distribution.
An outlier that skews the mean.
Error bars hide the underlying distribution. Source.
Error bars hide these nuances.
If you know the underlying distribution of your data, visualize it!
Adding jittered points, a violin plot, or even a histogram alongside your bars can reveal the truth.
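To see how much a summary can hide, here is a minimal sketch in plain Python (standard library only). The two samples, the seed, and the helper names `rescale` and `share_near_mean` are all made up for illustration. Both samples are forced onto the same mean and standard deviation, so their error bars would look identical, yet one is bell-shaped and the other is split into two clusters.

```python
import random
import statistics


def rescale(values, target_mean, target_sd):
    """Shift and stretch values so they have exactly the target mean and SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [target_mean + (v - mean) * target_sd / sd for v in values]


def share_near_mean(values, half_width=1.0):
    """Fraction of points lying within half_width of the sample mean."""
    mean = statistics.fmean(values)
    return sum(abs(v - mean) <= half_width for v in values) / len(values)


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical samples: one bell-shaped, one made of two separate clusters.
unimodal = [rng.gauss(0, 1) for _ in range(200)]
bimodal = ([rng.gauss(-2, 0.3) for _ in range(100)]
           + [rng.gauss(2, 0.3) for _ in range(100)])

# Force both onto the same mean (10) and SD (2): a mean ± SD error bar
# would now be drawn identically for the two samples.
unimodal = rescale(unimodal, 10, 2)
bimodal = rescale(bimodal, 10, 2)

for name, sample in [("unimodal", unimodal), ("bimodal", bimodal)]:
    print(
        name,
        "mean:", round(statistics.fmean(sample), 2),
        "sd:", round(statistics.stdev(sample), 2),
        "share near mean:", round(share_near_mean(sample), 2),
    )
```

The mean and SD match, but the share of points close to the mean does not: in the two-cluster sample almost nothing sits where the bar says the "typical" value is. Plotting the raw points on top of the bar is what makes that visible.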
🤔 Problem 2: What do the error bars mean?
Error bars can represent:
Confidence intervals (e.g., 95% confidence).
Standard error (SE, how much the sample mean would vary from one sample to the next).
Standard deviation (SD, variability in the dataset).
Now look at this example. It’s the same dataset, but the error bars are wildly different depending on what they represent:
Error bars can have very different meanings. Source.
If you don’t specify what the error bars show, they’re meaningless. Worse, they’ll confuse your audience.
Note: math details and R code are available here.
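For a rough feel of how far apart the three options can be, here is a small Python sketch using only the standard library. The measurements are invented for illustration, and the 95% interval uses a normal approximation (z ≈ 1.96), which is an assumption on my side: with a sample this small, a t-distribution quantile would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical measurements, made up for illustration only.
values = [4.1, 5.3, 6.0, 4.8, 5.9, 7.2, 5.1, 4.4, 6.3, 5.5]

n = len(values)
mean = statistics.fmean(values)

# Standard deviation: spread of the individual values around the mean.
sd = statistics.stdev(values)

# Standard error: uncertainty of the mean itself, shrinks as n grows.
se = sd / math.sqrt(n)

# Half-width of a 95% confidence interval for the mean.
# Assumption: normal approximation; a t quantile would be wider for small n.
z = statistics.NormalDist().inv_cdf(0.975)
ci_half_width = z * se

print(f"mean = {mean:.2f}")
print(f"SD bar:     ± {sd:.2f}")
print(f"SE bar:     ± {se:.2f}")
print(f"95% CI bar: ± {ci_half_width:.2f}")
```

Same data, three different bar lengths. Whichever one you draw (for example through the `yerr` argument of matplotlib's `errorbar`), say so in the caption or the axis label.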
That’s it for this week! Hopefully, this tip will come in handy next time you or a colleague are adding error bars to a chart. 🙂
Good luck with the week ahead!
Yan
PS: I'm slowly leaving Twitter (X) in favour of LinkedIn and Bluesky. Please connect with me over there 🙂
PPS: I am very proud of what I built last week with Joseph Barbier on Matplotlib Journey. It's a circle packing chart that shows the architecture of matplotlib, the most famous lib for dataviz in Python. Take a look and click for the interactive version!
A circle packing chart showing the organization of Matplotlib. Link.