Sunday, April 14, 2013

Making your message visible: Trend lines in scatter plots

If you teach infographics and visualization, here's an example to use when explaining the differences and similarities between designing for your peers (or for analyzing your data) and designing to communicate with broader, nonspecialized audiences. An hour ago, while reading The New York Times*, I came across a lovely scatter plot, the first picture above this paragraph. I read its headline and deck: "Mirth and Taxes. A study of 54 nations (...) found that those with more progressive tax rates had happier citizens, on average."

Then I looked at the graph and was startled and puzzled, so I tweeted about it.

I couldn't see a solid relationship between the variables. In fact, a lot of countries are "more progressive but less happy" than the United States.

A minute later, Stuart Allen sent me the link to the research paper the NYT graph is based on. It showcases a similar scatter plot in which the line of best fit is kept. It also discusses several correlation coefficients the researchers calculated, and they don't look trivial, to say the least (see third screenshot above: r = .41). Isn't the line a critical element here, as it highlights the upward trend? Doesn't dividing the graph into four quadrants make its message murkier, since the positive slope becomes harder to see? Or is it just me?
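If you want to try this with students, here is a minimal sketch in Python that computes a correlation coefficient and a least-squares line of best fit, and then counts how many points land in each quadrant around a reference point. The data are made up (54 random points with a moderate positive relationship), not the figures from the study, and the function names `pearson_r`, `best_fit_line`, and `quadrant_counts` are my own illustrative choices.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quadrant_counts(xs, ys, x_ref, y_ref):
    """Count points in the four quadrants around the point (x_ref, y_ref)."""
    counts = {"upper_right": 0, "upper_left": 0, "lower_right": 0, "lower_left": 0}
    for x, y in zip(xs, ys):
        vertical = "upper" if y >= y_ref else "lower"
        horizontal = "right" if x >= x_ref else "left"
        counts[f"{vertical}_{horizontal}"] += 1
    return counts


# Synthetic data only: 54 points with a moderate upward trend plus noise.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(54)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]

r = pearson_r(xs, ys)
slope, intercept = best_fit_line(xs, ys)
# Use the first point as an arbitrary reference, like singling out one country.
counts = quadrant_counts(xs, ys, xs[0], ys[0])

print(f"r = {r:.2f}, line: y = {slope:.2f} * x + {intercept:.2f}")
print(counts)
```

With a moderate correlation, a fair number of points can sit in the "more x but less y" quadrant of any given reference point, which is why the quadrant view can hide a trend that the fitted line makes plain.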

