Monday, February 2, 2015

If something looks wrong in your data, it's probably because there is indeed something wrong in your data

Yesterday a revered* Spanish newspaper published a bar chart like the one below in a story about poverty in Latin America. Do you see something weird? Is it really possible that nearly the entire population of Bolivia was poor in 2005?

Of course it isn't. If you go to the data (table below), which comes from the UN's Economic Commission for Latin America and the Caribbean (Cepal, by its Spanish acronym), you'll see that for each year there is one column for poverty (“pobreza”) and another one for indigence (“indigencia”). The problem, obviously, is that you cannot add up those two variables. The variable “indigence” is very likely a portion of the broader category “poverty”, so stacking one on top of the other counts the indigent twice!
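To make the arithmetic concrete, here is a minimal Python sketch of the check a second pair of eyes might have run. It assumes, as argued above, that indigence is a subset of poverty. The function names and the numbers are made up for illustration; they are not Cepal's figures, and this is not how the newspaper built its chart.

```python
def naive_total(poverty_pct, indigence_pct):
    """What the chart appears to do: stack the two columns as if they were disjoint."""
    return poverty_pct + indigence_pct


def stack_segments(poverty_pct, indigence_pct):
    """Split the population into non-overlapping segments for a stacked bar.

    Assumption: indigence is a portion of poverty, so the poor-but-not-indigent
    segment is poverty minus indigence, never poverty plus indigence.
    """
    if not 0 <= indigence_pct <= poverty_pct <= 100:
        raise ValueError("expected 0 <= indigence <= poverty <= 100")
    return {
        "indigent": indigence_pct,
        "poor_not_indigent": poverty_pct - indigence_pct,
        "not_poor": 100 - poverty_pct,
    }


# Hypothetical values, chosen only to show the effect.
poverty, indigence = 60.0, 35.0

inflated = naive_total(poverty, indigence)
segments = stack_segments(poverty, indigence)

print(inflated)                # the naive bar: far taller than the real poverty rate
print(segments)                # three segments that do not overlap
print(sum(segments.values()))  # the segments cover the whole population
```

With these made-up inputs, the naive bar suggests that 95% of the population is poor, while the real poverty rate is 60%. A bar that nearly fills the whole axis is exactly the kind of result that should send you back to the table.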

Spain’s traditional newspapers often claim that citizens must pay for their product, and that they deserve special protections because what they offer is far better than what people get from online media, non-professional journalists, bloggers, etc. Blah, blah, blah.

(*Not for long.)


UPDATE: Josu Mezo, from the blog Malaprensa (“Bad Press”), has told me that he was misled by the chart himself the first time he saw it. He didn't notice the mistake. That's precisely why I try not to blame individual designers for this kind of blunder. We all make mistakes all the time, no matter how well we educate ourselves to be more numerate and to pay more attention. This is not an individual failure. It's an institutional one. Newspapers used to have proofreaders and copy editors, who took a second, a third, and a fourth look at your work. Many news organizations have laid off most of them, and these are the consequences, particularly when you're on a tight deadline.