Tuesday, October 1, 2019

Don't just visualize uncertainty; explain it and don't let captions contradict it

MIT Election Lab's Alexander Agadjanian has a nice piece in The New York Times about how people react when reading that the Democratic party may be shifting left.

The graphs in the article include error bars that, given that the data come from a survey experiment, I guess correspond to 95% confidence intervals. I had to guess because it's not explained anywhere.
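For readers wondering where error bars like these usually come from, here is a minimal Python sketch of a 95% confidence interval for a difference between two proportions, using the normal (Wald) approximation. I don't know which method the article actually used. The function name and all the numbers below are hypothetical and are not the survey's data.

```python
import math

def diff_in_proportions_ci(p_treat, n_treat, p_ctrl, n_ctrl, z=1.96):
    """Return (difference, lower, upper) for p_treat - p_ctrl.

    Uses the normal (Wald) approximation; z=1.96 gives a ~95% interval.
    """
    diff = p_treat - p_ctrl
    # Standard error of the difference between two independent proportions
    se = math.sqrt(p_treat * (1 - p_treat) / n_treat
                   + p_ctrl * (1 - p_ctrl) / n_ctrl)
    margin = z * se
    return diff, diff - margin, diff + margin

# Hypothetical inputs, NOT taken from the article:
# 40% of 500 treated respondents vs. 46% of 500 control respondents.
diff, lower, upper = diff_in_proportions_ci(0.40, 500, 0.46, 500)
print(f"{diff * 100:+.1f} points, 95% CI {lower * 100:+.1f} to {upper * 100:+.1f}")
```

With these made-up inputs the point estimate is a six-point drop, but the interval around it is roughly twelve points wide, which is the kind of spread the whiskers are meant to show.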

I'm in favor of disclosing uncertainty in visualizations. I've written about it repeatedly in books and blog posts. However, I also think that uncertainty should never go unexplained, particularly if we present it to readers who may not understand what the whiskers on either side of the point estimate dots mean.

I also think that we journalists shouldn't let what we write contradict what we visualize. On the first graph, the error bars for Independents (the only data point that seems statistically significant) are very wide, but the caption reads “independents in a survey were six percentage points less likely to say they would vote for a Democrat in 2020, compared to a control group” (the control group is the 0 baseline).

That isn't wrong per se, but I conjecture that in the minds of many readers it makes the point estimate sound much more precise than it really is. If we display uncertainty, we should also convey it through our annotations, so instead of writing “six percentage points less likely,” I'd suggest “significantly” or “considerably less likely,” without assigning any specific value to the difference.
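As a rough illustration of that suggestion, here is a small Python sketch that picks caption wording from the interval rather than from the point estimate. The function name, the wording templates, and the interval bounds are all hypothetical; the bounds are not read off the article's chart.

```python
def caption_for(group, lower, upper):
    """Pick caption wording from a confidence interval (in percentage points).

    Hypothetical helper: it deliberately avoids quoting a single number.
    """
    if upper < 0:
        # The whole interval is below the 0 baseline
        return f"{group} were significantly less likely to say they would vote for a Democrat"
    if lower > 0:
        # The whole interval is above the 0 baseline
        return f"{group} were significantly more likely to say they would vote for a Democrat"
    # The interval crosses 0, so the direction of the effect is unclear
    return f"No clear difference for {group} compared to the control group"

# Made-up bounds for illustration only
print(caption_for("Independents", -11.0, -1.0))
print(caption_for("Democrats", -4.0, 5.0))
```

The point of the design is that the caption never states a value more precise than the chart itself supports.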

UPDATE: Kaiser Fung has written about this visualization. Don't miss it.