Friday, January 11, 2013

Journalists, designers, statistics, and the scientific method

This coming Monday I'm participating in a roundtable at Bloomberg BusinessWeek's first design conference, in San Francisco. I'll speak at Facebook on Tuesday. In both talks, I plan to ask scientists and engineers to learn a bit about storytelling, design, and journalism (maybe by reading books like this one), and ask journalists and designers to learn about science and statistics. As a journalist and a designer, I know what I'm talking about: On average, we (my colleagues and I) are clueless, as this article in The Guardian points out.

I am not alone in identifying this challenge: As I've mentioned before, every once in a while I get an e-mail asking for books and other resources to undertake a self-teaching program. For those with just a crude understanding of science, I usually suggest David Deutsch's The Beginning of Infinity, which includes one of the best intros to the scientific method I've ever read, and Ben Goldacre's hilarious Bad Science. They are fun and well written.

What about statistics? There are plenty of good books out there (and MOOCs), but not all of them are equally readable. I'd love to tell you to purchase Wainer's Picturing the Uncertain World or Abelson's Statistics as Principled Argument immediately, as I love both (read this and this). However, I'd be doing you a disservice if you don't already know what a p-value or a z-score is. So how can you get started if you are afraid of mouthfuls like "central limit theorem", or if you still think that statistics is just about math? (It's not about math; not nearly. It's about disciplined reasoning.)

In the past few days I've been reading Naked Statistics, by Charles Wheelan, of Naked Economics fame. It's a concise and gentle introduction that uses plenty of examples to explain the core concepts in the field. There are other intro-level popular books out there, but I must admit I'm impressed with Wheelan's. If I had to teach a data-driven journalism course, this would be at the top of the bibliography.

But Naked Statistics is not enough. It's true that it'll help you avoid the most common mistakes we journalists and designers make when dealing with numbers and probability, but it doesn't go deeply into the specifics. Therefore, here's another suggestion for you: Right after Wheelan's, study Urdan's Statistics in Plain English and Field's Discovering Statistics Using R in parallel. Field's is a massive, 1,000-page tome, but the author's personality (he's a hardcore Iron Maiden fan with a quirky fondness for inappropriate jokes) makes it very enjoyable. Urdan's is a great companion, as it works as an annotated table of contents.

If you have other suggestions, please write about them in the comments below.