# The importance of uncertainty

I just came from the lab with an amazing new discovery, one that will change the landscape of fruit-based research: bananas get you jobs. That’s right, I’ve uncovered evidence that studying bananas in your graduate years significantly improves your post-graduation salary. Don’t believe me? Check out this great bar graph I put together:

That’s right – the proof is in the pudding. Bananas increase your wage. Case closed.

“But wait”, you might say, “not only is this argument completely stupid, but your graph is totally meaningless.”

And you would be right, but stuff like this happens in the media, on the web, even in academia all the time. Setting aside for the moment the numerous problems with the above argument, there’s one in particular that takes the cake in terms of its common (mis)use. I’m speaking, of course, about error bars.

# Uncertainty can answer questions too!

Many people shrug off the use of error bars as an academic quirk for those hell-bent on finding tiny flaws in other people’s arguments. I’m here today to tell you that those people are wrong. Error bars are important. Really important. For example, let’s collect some more data for our experiment above and build some confidence intervals:

Whoops! OK, so maybe you shouldn’t switch your research over to banana migration patterns just yet. Clearly there’s nothing interesting at all here (and in fact this data was generated by picking random numbers between 1 and 100,000). As you can see, once I tell you the uncertainty in my data, my argument becomes much less convincing. In fact, this is one of the primary reasons that error bars are great: they make it much harder to make bogus claims.
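To see how this plays out in practice, here’s a minimal sketch of the kind of simulation described above: two groups of “salaries” drawn uniformly at random between 1 and 100,000, with a rough 95% confidence interval around each mean. The group names and sample sizes are made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def mean_ci(samples, z=1.96):
    """Return (mean, half-width of an approximate 95% CI) via a normal approximation."""
    m = statistics.mean(samples)
    sem = statistics.stdev(samples) / len(samples) ** 0.5  # standard error of the mean
    return m, z * sem

# Two groups of "post-graduation salaries", both pure noise:
# uniform random numbers between 1 and 100,000
bananas = [random.uniform(1, 100_000) for _ in range(20)]
apples = [random.uniform(1, 100_000) for _ in range(20)]

m_b, ci_b = mean_ci(bananas)
m_a, ci_a = mean_ci(apples)
print(f"bananas: {m_b:,.0f} +/- {ci_b:,.0f}")
print(f"apples:  {m_a:,.0f} +/- {ci_a:,.0f}")
```

Run this and the intervals are enormous relative to any gap between the means, which is exactly the point: the error bars give away that the “difference” between groups is nothing but noise.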

As I see it, there are a few main reasons why someone might not use error bars when presenting data-driven information:

1. They don’t know what error bars are (in which case you should stop presenting data in your arguments and go find an online statistics class)

2. They are willingly hiding them because it makes their data look “messy” (in which case you should think really hard about why your error bars look so bad in the first place)

3. They think that error bars will confuse people with unnecessary information (in which case you should give people more credit)

4. They are willingly hiding them because they’re intentionally trying to mislead people (in which case you’re probably a politician. I kid, I kid)

# The balance between uncertainty and truth

So why am I being so overzealous about error bars in the first place? Because without any indication of your confidence in a number, that number is meaningless. In statistics, we call this a “point estimate”. It gives you a single value for something of interest, but gives you no context about that value’s possible range, its stability across multiple sets of data, or its difference with respect to other values you compare it to.

Let’s look at some more data:

Hmmm, well, there doesn’t seem to be much of interest here, only a difference of 0.003. Clearly, your brain shows no response to eating bananas. Or does it?

Once again, we can’t interpret this graph without error bars. Let’s run the experiment a few more times and see what we get:

Ah ha – in this case, by showing the uncertainty in our numbers, we’ve revealed that there actually does seem to be a difference between these conditions. Their absolute difference may be quite small, but if we can show that this tiny difference still exists even after many rounds of data collection, perhaps it’s worth exploring further. That’s where error bars come in.
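The logic above can be sketched numerically. Here I simulate a hypothetical version of the experiment: many repeated measurements of the change in some brain response, where the true effect is tiny (0.003) but consistent. The effect size, noise level, and number of runs are all made-up numbers for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical repeated measurements: each value is the measured change in
# some brain response. True effect is 0.003; measurement noise has sd 0.005.
n_runs = 1000
diffs = [random.gauss(0.003, 0.005) for _ in range(n_runs)]

mean_diff = statistics.mean(diffs)
sem = statistics.stdev(diffs) / n_runs ** 0.5
lo, hi = mean_diff - 1.96 * sem, mean_diff + 1.96 * sem

print(f"mean difference: {mean_diff:.4f}")
print(f"approx. 95% CI: ({lo:.4f}, {hi:.4f})")
```

With enough repetitions the interval shrinks until it no longer includes zero: the difference is tiny, but the error bars tell us it is reliably there, which is precisely what makes it worth exploring further.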

# Towards an uncertain future

We are often taught that uncertainty is a bad thing – that it reflects weakness of character, a lack of enough data, a “wishy-washy” attitude, or an ineffective research approach. In reality, uncertainty is the most important thing that we have. It tells us not only what we believe about the world, but also how reasonable those beliefs are in the first place.

When making arguments for the research world, explaining your uncertainty is crucial in order to be honest about your methods and your data. When making arguments to the general public, it’s essential for relating whether your information is worth believing in the first place.

As the world becomes increasingly “data-centric”, it will become even more important for people to push for best practices in data presentation. This means more error bars. So moving into the future, what should we do?

1. Include error bars any time you use data to make an argument. I don’t care whether that argument is for a group of tenured professors or a group of second graders. People need to expect error bars and uncertainty as a typical part of any argument. And this doesn’t just have to apply to numbers, either. With data, we happen to have a way to quantify just how uncertain we are, but without numbers the importance of uncertainty is just as strong. Tell your audience how confident you are in your assertions, make it clear that you’ve considered possible alternatives, and don’t only tell the audience what you know, but why they should believe you.
2. You, the general reader, need to demand to know the level of confidence or uncertainty in any kind of publication that you read. If they don’t show error bars, then don’t believe their statement. If they refuse to give you error bars, then they’re probably trying to hide something.
3. Learn to love uncertainty. Really. It’s what makes the world go ‘round, it’s what keeps things interesting. Sure, the unknown is often frightening, but with uncertainty comes an infinite wealth of possibilities that are just around the corner. Moreover, it’s something that everybody feels, regardless of their expertise. Don’t listen to the folks telling you that they’re 100% sure of anything. They’re not.

Ultimately, it’s our uncertainty about the world that drives humans to discover new ideas, invent new tools, and get to the bottom of things. Dealing with uncertainty is one of the most human endeavors, and knowing how to think about uncertainty makes you incredibly powerful in making decisions. So the next time that your co-worker tells you that they heard about a really interesting article that said we should all start studying bananas, make sure to ask them “are you sure about that?”

1. #### David Lane

Unfortunately, error bars around means rarely provide the key information required to understand the relevant uncertainty. In the vast majority of cases, it is the difference between means rather than the values of the individual means themselves that is of interest. Most readers don’t understand the relationship. One common misconception is that if the error bars do not overlap then the difference is significant. Another misconception is that if confidence intervals overlap then the difference is not significant.

Putting error bars around differences between means would be valuable, but it is rarely done.
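This point can be illustrated with a quick numerical sketch using made-up summary statistics: two groups whose individual 95% confidence intervals overlap, even though the interval around their difference excludes zero.

```python
import math

def ci(mean, sem, z=1.96):
    """Approximate 95% confidence interval from a mean and standard error."""
    return mean - z * sem, mean + z * sem

# Hypothetical summary statistics for two independent groups
mean_a, sem_a = 10.0, 0.4
mean_b, sem_b = 11.3, 0.4

# The individual 95% CIs overlap...
lo_a, hi_a = ci(mean_a, sem_a)
lo_b, hi_b = ci(mean_b, sem_b)
print(f"A: ({lo_a:.2f}, {hi_a:.2f})  B: ({lo_b:.2f}, {hi_b:.2f})")
print("intervals overlap:", lo_b < hi_a)

# ...yet the CI around the *difference* excludes zero,
# because the standard error of a difference combines the two SEMs
sem_diff = math.sqrt(sem_a**2 + sem_b**2)
lo_d, hi_d = ci(mean_b - mean_a, sem_diff)
print(f"difference: ({lo_d:.2f}, {hi_d:.2f})")
```

Here the two intervals overlap, yet the interval on the difference sits entirely above zero, which is why error bars on the difference itself are so much more informative.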

• #### Chris Holdgraf

Excellent point! For those interested in this topic, you should check out this short paper.

2. #### Anonymous

Chris, you are an excellent writer. You make reading about statistics fun. That is a feat. Please continue to write about stats concepts, as your posts are valued.

3. #### hank

Thanks for posting the really interesting and helpful articles