Sunday, April 17, 2016

Visualization against statistical bullshit

I want to draw your attention to this excellent long article by Tim Harford (h/t Thomas Lumley). Here's a summary:
Statistical bullshit is a special case of bullshit in general, and it appears to be on the rise. This is partly because social media — a natural vector for statements made purely for effect — are also on the rise. On Instagram and Twitter we like to share attention-grabbing graphics, surprising headlines and figures that resonate with how we already see the world. Unfortunately, very few claims are eye-catching, surprising or emotionally resonant because they are true and fair. Statistical bullshit spreads easily these days; all it takes is a click.
Harford suggests that visualization can be a powerful weapon for making statistics attractive and understandable to the public. He mentions Florence Nightingale's famous charts:
(...) There is a middle ground between the statistical bullshitter, who pays no attention to the truth, and William Farr, for whom the truth must be presented without adornment. That middle ground is embodied by the recipient of William Farr’s letter advising dryness. She was the first woman to be elected to the Royal Statistical Society: Florence Nightingale. (...)
The Rose Diagram isn’t a dry presentation of statistical truth. It tells a story. Its structure divides the death toll into two periods — before the sanitary improvements, and after. In doing so, it highlights a sharp break that is less than clear in the raw data. And the Rose Diagram also gently obscures other possible interpretations of the numbers — that, for example, the death toll dropped not because of improved hygiene but because winter was over. The Rose Diagram is a marketing pitch for an idea. The idea was true and vital, and Nightingale’s campaign was successful. One of her biographers, Hugh Small, argues that the Rose Diagram ushered in health improvements that raised life expectancy in the UK by 20 years and saved millions of lives. 
What makes Nightingale’s story so striking is that she was able to see that statistics could be tools and weapons at the same time. She educated herself using the data, before giving it the makeover it required to convince others. Though the Rose Diagram is a long way from “the dryest of all reading”, it is also a long way from bullshit. Florence Nightingale realised that the truth about public health was so vital that it could not simply be recited in a monotone. It needed to sing.
“She educated herself using the data, before giving it the makeover it required to convince others.” This is akin to a message I tried to convey in The Truthful Art: Before we can think of doing good with data, we ought to make sure that our data is as good as possible.