Friday, April 18, 2014

Infographics and visualization bookworm challenge

The European Journalism Centre has released interviews with all the instructors in the upcoming MOOC Doing Journalism With Data. It begins on May 19th, and I think that around 15,000 people have signed up already.

In the videos you'll see me sitting in front of the bookshelves in my office at the University of Miami. Here's a challenge: Take a look at the shelf in the upper-left corner of the frame (the one on my right.) There are around 30 visualization, infographics, and data books in it. How many of them can you identify? Write your guesses in the comments below. At the end of April I'll randomly choose one comment and send a signed copy of The Functional Art and a surprise gift to its author. Try to name as many as possible!

UPDATE: Answers are coming in already, so let's make the challenge more interesting. Let's try to identify books from all shelves visible in the frame.

(I'm shamelessly stealing this idea from Nathan. I don't have nearly as many readers as he does, so let's see what happens.)

Thursday, April 17, 2014

My new infographics motto: "It's more complicated than that"

Seriously. I'm planning to use that motto as a title for future presentations.

Ezra Klein's new online venture, Vox, has just published a series of graphs titled These 15 charts show our health care prices are totally insane. As someone who was born in Spain and enjoyed public health care for 30 years, I can assure you, dear U.S. citizens, that this is accurate. Vox got it right. However, Vox's challenge is that they don't need to persuade me. I've reviewed the evidence and found it compelling. I've been reading this in the past few days, for instance. Health care prices in this country are insane, indeed.

The people Vox needs to convince are those folks who still believe that the U.S. system is the best in the world. And you won't do it by oversimplifying matters. Even if I'm not an expert on the economics of health care, but just a concerned reader, this stack of "cards" struck me as dubious. I felt skeptical right away, so I went to Vox's source. See what the International Federation of Health Plans says about their own data (underlines are mine):



We're comparing averages from a huge U.S. dataset to one private plan from each country. One. In what way is this a fair comparison, exactly? Do we want to send random sampling or proper aggregation down the drain? And I think that I don't need to mention the crucial disclaimer in the last sentence, do I?

Next, the graphs. Notice that Vox's source shows not just averages, but the 95th and the 25th percentiles, at least in the U.S., which is OK. It reveals that there's a lot of variation (side note: I'd love to see some histograms.) For obvious reasons, they cannot do the same with other countries. After all, they're using just one plan to represent each of them!


Finally, and without getting down to the nitty-gritty, take a look at one or two of Vox's charts. Maybe I'm missing something, but what I see is simply a gross comparison of prices. What if we adjust those prices by GDP per capita and show them as percentages of it? Here you have the results, quickly calculated in OpenOffice. Compare my bar graphs to the ones designed by Vox and its source.




Yes, the U.S. health care system is very expensive, but differences don't look that striking now, do they? See New Zealand at the top, for instance.
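For readers who want to try the same adjustment, here is a minimal Python sketch of the idea: divide each price by the country's GDP per capita and express the result as a percentage. The country names, prices, and GDP figures below are made up for illustration. They are not the IFHP data or the values behind the bar graphs above, and the function name is hypothetical.

```python
# Hypothetical figures for illustration only; NOT the IFHP or Vox data.

def price_as_share_of_gdp_per_capita(price, gdp_per_capita):
    """Return a price expressed as a percentage of GDP per capita."""
    if gdp_per_capita <= 0:
        raise ValueError("GDP per capita must be positive")
    return 100.0 * price / gdp_per_capita


# country -> (price of one procedure in USD, GDP per capita in USD), both made up
examples = {
    "Country A": (3000.0, 50000.0),
    "Country B": (1500.0, 20000.0),
}

shares = {
    country: price_as_share_of_gdp_per_capita(price, gdp)
    for country, (price, gdp) in examples.items()
}

# Rank countries by the adjusted figure, highest first.
ranking = sorted(shares.items(), key=lambda item: item[1], reverse=True)
for country, share in ranking:
    print(f"{country}: {share:.1f}% of GDP per capita")
```

In this toy example, Country A has the higher raw price, but Country B's price is the larger share of its GDP per capita. That is the kind of reordering the adjusted bar graphs can reveal.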

So, some recommendations if you are a designer, a journalist, or an average Jane or Joe, for that matter: First, never take sources' figures at face value. Second, never begin with an idea for your headline and then look just for data and graphics to support it. Third, remember that stories are always much more complicated than what you have in mind at first. You need to respect nuances, details, complexity, and show them to your readers if you want to persuade them.

Wednesday, April 16, 2014

Annotation, narrative, and storytelling in infographics and visualization

The latest Datastori.es podcast has just been published. As you can see in the photo, Moritz Stefaner, Enrico Bertini, Robert Kosara, and I had a lot of fun for an hour and a half.

We were planning to engage in a heated debate about the glories and shortcomings of storytelling in visualization, but we ended up agreeing with each other a lot. To understand where this all came from, here are some articles and talks that preceded our conversation, and informed it:

• Cole Nussbaumer's series on storytelling techniques 
• Moritz Stefaner's Worlds, not stories
• Periscopic's A Framework for Talking About Data Narration
• Robert Kosara's Stories Are Gateways Into Worlds
• Robert Kosara's Story: A Definition
• Lynn Cherny's Implied Stories (and Data Vis)
• The presentations at Tapestry 2014 and NICAR.

Long story short (no pun intended): Some visualization designers are in favor of storytelling techniques, and others are against them or, at least, against overusing them. And everyone has a personal definition of the term, when applied to information graphics.

In the podcast I explained that I see differences between annotation, narration, and storytelling. After we finished the Skype call, I sent Enrico and Moritz an e-mail to explain myself a bit better. It's copied below (I've made some minor edits.) You may want to read it only after listening to the conversation.
Hi everyone, 
When you publish the podcast, you may want to add an explanation of what I had in mind during the discussion. You can copy and paste this entire e-mail, including this paragraph. I would like to say in advance that all these ideas need a lot of development. I'd love to hear/read some comments on them. 
1. Annotation consists of highlighting certain data points or interesting phenomena in a visualization, and perhaps describing them or putting them in context. For instance, this chart by The New York Times. Notice how the designer provides an explanation for the relevant data points.
Or this interactive visualization on breast cancer rates. See how the information is sequenced in it. 
And why isn't this second graphic a "narration", even if it's organized as a sequence? Keep reading.
2. Narration consists of arranging your charts, maps, and diagrams as a meaningful sequence intended to display cause-and-effect relationships, no matter how fuzzy they are. This cause-and-effect is the crucial point here. See an example.
3. Storytelling: As I mentioned during the podcast, I am growing fond of the definition of story provided by the book The Unpersuadables, by Will Storr: "A story is a description of something happening that contains some form of sensation or drama (...), an explanation of cause and effect that is soaked in emotion."  So, a story in visualization is a narration in which the designer tries to instill an emotional component, rather than relying just on the intrinsic interest of the information presented. 
As an example, I mentioned an infographic about population trends in Brazil that I also described in The Functional Art. Take a look at it.
Here, the information is organized in a way that resembles the traditional structure of stories: An opening ("Brazilian population grew between 2000 and 2010,") a surprising fact that becomes a conflict ("but Brazil's fertility rate is way below expected,") the consequences of that conflict ("Brazil's population will start shrinking in 20 years, and it'll become older,") and a conclusion or resolution ("what measures Brazil can adopt to face this future scenario.") All this is based on solid evidence, of course. Storytelling doesn't mean making stuff up. 
We wanted to challenge our readers with this graphic. The emotions we wanted to foster were surprise and, consequently, concern, and curiosity. Most people in Brazil know that its population has been growing healthily in the past, but many don't know that women nowadays have just 1.8 children, which is below the replacement rate, currently at 2.1 children per woman.
Again, I'm aware that this is a very preliminary set of terms, and that the boundaries between them are very blurry. I may change everything completely for my next book. 
Best, 
Alberto

Tuesday, April 15, 2014

In visualization, baselines and negative space matter



The two charts above (source and source) have been discussed on Twitter since yesterday. Andy Kirk has published a good summary of the conversation.

Andy says that if we think that the first graphic works, there's no reason to believe that the second doesn't. I disagree. The first graphic is OK, but the second is misleading, at least at a pre-attentive level.* Why? Because of baselines, gridlines, negative space, and background-foreground relationships.

On the first graphic, the baseline on top is very visible, suggesting that all bars are hanging or falling from it. Thanks to this, the metaphor is clear. Andy himself acknowledges it. On the second graphic, eyes get directed to the bottom baseline, as it is so noticeable, being emphasized by the surrounding white. That baseline looks like the ground on which everything else is sitting. Also, the well-defined black data curve makes the white area stand out over the background. It looks like a snowy mountain range over a blood red sky.

And we could even mention the headline of the story, which includes the verb "rise." There's not a "rise" in the graphic. It shows stuff falling. This makes matters even worse.

Now, about this paragraph:
Everyone’s own reaction is entirely legitimate. Regardless of whether someone is telling you this is the best or the worst visualisation ever made, how you respond to it and how well you draw interpretations from it are entirely for you to resolve. I’m not defending me here, by the way, just saying we all have different responses based on all sorts of factors such as our experience, knowledge of a subject, interest in a subject, taste and graphical literacy.
Well, yes and no. The point here is that, regardless of how knowledgeable you are about information graphics, my guess (let's stress the word "guess") is that the second graphic makes most readers work hard to get the story. This has less to do with graphical literacy than with how well we take advantage of the little we know about visual perception and brains. As Jorge Camões wrote: "Cognitive tasks in datavis should complement, not contradict, perceptual ones."

*UPDATE: Steve Harod has sent me a tweet about this: "Great post, but I'd avoid pre-attentive explanation. Attention in figure-ground is complicated. Best to avoid guessing cause."

UPDATE 2: Rob Simmon has shared this video by Andy Cotgreave. It's worth watching.

UPDATE 3: PZ Myers doesn't like Reuters' chart either.

Monday, April 14, 2014

Facebook and Google+ groups about visualization and infographics

As some of you probably know, I'm currently doing a new Intro to Visualization and Infographics MOOC, this time in Portuguese (yes, I'll do it in English again; in the meantime, take a look at this.) What you may not know is that we created social media communities for these courses. They are open to anybody.

In case you're interested, here's the Facebook group, which has 3,260 English-speaking members already, and the Google+ group, with nearly 600. This one is for Portuguese-speaking people only.

Saturday, April 12, 2014

All Tapestry 2014 presentations

All Tapestry 2014 talks are available online already. I encourage you to watch them. You can see my keynote below. It deals with data journalism, storytelling and the future of visualization. To learn more about it, and review the slides, visit this post.

Telemundo moves Nicaragua to the Atacama desert

My UM student Nancy Cermeño, who is from Nicaragua, has sent me the photo below. It seems that Telemundo's journalists and artists could use some Geography classes. Or a good editor.


Tuesday, April 8, 2014

The Jonah Lehrer Effect

Photo just taken in the office of a PhD student at UM. Shall we call this "the Jonah Lehrer effect"?


Utilitarian ethical reasoning in visualization and infographics

Four broad themes have dominated my reading list* in the past year, besides the expected ones (visualization, graphics, etc.): Morality, statistics, epistemology, and the core principles of journalism. Taking a look at that list will give you a clue of what I'm thinking and writing about nowadays. I had plans to post something this week about David J. Hand, whose books you should get if you're interested in data and visualization (here's a good interview with him,) but then I opened Moral Tribes, by Joshua Greene.

Last night I went to bed way after midnight because of it. I just couldn't stop reading. The book starts slow. Its first chapters will sound familiar if you've taken a look at Thinking, Fast and Slow and other works on heuristics, cognitive biases, and evolutionary psychology**. But suddenly, around page 100, the book mutates into something different. It becomes a vigorous defense of utilitarianism, which Greene calls "deep pragmatism." I'm quoting from that interview, which doesn't delve into the details of the book, unfortunately:
There is a philosophy that accords with this, and that philosophy has a terrible name; it's known as utilitarianism. The idea behind utilitarianism is that what really matters is the quality of people's lives—people's suffering, people's happiness—how their experience ultimately goes. The other idea is that we should be impartial; it essentially incorporates the Golden Rule. It says that one person's well-being is not ultimately any more important than anybody else's. You put those two ideas together, and what you basically get is a solution—a philosophical solution to the problems of the modern world, which is: Our global philosophy should be that we should try to make the world as happy as possible. But this has a lot of counterintuitive implications, and philosophers have spent the last century going through all of the ways in which this seems to get things wrong.
More info: 1, 2, 3.

Greene's thoughts relate to what Sam Harris said in The Moral Landscape, a book that caused controversy among moral philosophers and scientists alike. Despite the many refutations I've encountered, though, I still find Harris' arguments convincing —not surprisingly, I guess; I've always enjoyed reading Peter Singer and other utilitarians and consequentialists. I can say the same about Greene's.

How is this all connected to visualization and infographics? Well, it has nothing to do with their technical aspects. But it certainly is related to how we think about the ethics of visualization: Is it better to be guided by some core a priori principles ("don't lie," for instance; this is what deontological ethics are about) or by a fuzzy notion of "virtue"? Or should we base our decisions on what we believe —or know— about the consequences of what we do ("don't lie because that will lead people astray")? Remember Mike Monteiro? Watch that talk again, if you have the chance.

* Notice the warning I've included in that page. It applies to most links in this website: "IMPORTANT: All links below are "affiliate" links to Amazon.com. That means that I'm paid a small amount of money for the books you buy after clicking on them. I don't get any cash directly from Amazon, though, but gift cards that I use to buy more books. The average monthly payment I got last year was around $75."

**There are tons of them, besides Thinking, Fast and Slow, and some are really good: The Invisible Gorilla, Incognito, Kludge, Brain Bugs, Why Everyone (Else) Is a Hypocrite, The Unpersuadables (which I finished a couple of days ago,) etc. If you're a journalist or a designer, you should read at least a couple of them. You won't be the same person after you do so.

Monday, April 7, 2014

Weekly resources: Against Big Data, Analytically Speaking Webcast, visualization, and data journalism

Last week I began a series of posts to collect resources related to visualization, infographics, and data. Here you have some more:

'BIG DATA'

• The Parable of Google Flu:  Traps in Big Data Analysis, an article in Science magazine that has unleashed a lively debate. Quoting:
“Big data hubris” is the often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis. (...) There are enormous possibilities in big data. However, quantity of data does not mean that one can ignore foundational issues of measurement and construct validity and reliability and dependencies among data. The core challenge is that most big data that have received popular attention are not the output of instruments designed to produce valid and reliable data amenable for scientific analysis.
Comments here, here, here, and here.

Sunday, April 6, 2014

Why I love teaching infographics

I'll be binge grading and writing in the next few weeks, so don't expect a lot of updates. I'm uploading this post just to share my joy. I love giving students feedback on their work —although I loathe transforming that feedback into a grade,— particularly when it involves praising what they've done. My life gets easier when they do a nice job.

The students in my intro to graphic design (100-level) and intro to infographics and visualization (300-level) courses at the University of Miami have just turned in their latest projects, and I'm delighted with the results. On the left you can see an infographic about newsroom diversity submitted a few minutes ago by infographics student Monica Herndon. Below, a book jacket designed by sophomore Janelle Rodríguez, who's in my intro to graphic design class. Fortunately, these are not outliers. There's still a lot that can be improved and edited (I'm not fond of wrapping text around photos, for instance,) but the average quality of the exercises I've seen so far is pretty good.

Footnote: If you're a professor, you may want to follow the #partyLikeAProfessor hashtag on Twitter.


Friday, April 4, 2014

Adventures in the world of visualization

In the past few days I've been able to squeeze a few hours out of my schedule to play with various visualization tools. I've tried Plot.ly, and I loved it. I hope that the developers behind this tool will keep improving it in the future. Those of us who would want to code just to create silly video games will be very thankful. Next on my list: Lyra and Graf.ly.

I've also come back to R and ggplot2 after a hiatus devoted to D3.js, with the help of Scott Murray's Interactive Data Visualization for the Web and Nick Qi Zhu's Data Visualization With D3.js Cookbook. I'm reading Winston Chang's R Graphics Cookbook and wondering why anyone would choose any other language to analyze data and design graphics (forgive me, Mike Bostock.) R, when paired with ggplot2, is just so powerful and intuitive! Moreover, a library called rCharts seems to let you translate your R graphics to JavaScript. I should give that a try, too.

UPDATE: Plot.ly's Matt Sundquist has just told me that they are developing a ggplot2 wrapper. Here's a walkthrough.

(In case you're curious about the screenshots below, they are related to a little project that involves exploring the quality and attainment levels of public schools in Miami-Dade county. Let's see what I can come up with.)


Thursday, April 3, 2014

Analytically Speaking webcast

If you have nothing better to do tomorrow at 1 pm EST, you may want to sign up for this webcast hosted by JMP. Kaiser Fung of JunkCharts and NumberSense fame and I will be talking about data, visualization, and effective communication.

Tuesday, April 1, 2014

South China Morning Post's rainfall in Hong Kong visualization

Here Comes the Rain: Hong Kong's rainfall patterns since 1997

Today I fell in love with this interactive graphic about rain patterns in Hong Kong designed by Jane Pong. Visit her Tumblr and her portfolio, which combines data visualization with traditional news infographics and illustrations. Jane's graphics are simple, clear, and elegant. It's not the first time they show up here.

Jane works as a visualization designer for the South China Morning Post, the same paper that years ago hired Adolfo Arranz, a big winner in the Malofiej 2014 awards, announced last Friday. Their work is a good source of inspiration for students and professionals alike. Here's their blog.

h/t Nathan Griffiths (Twitter)

Interview about infographics for advertising

Advertology asked me a while ago about the role that infographics can play in advertising and PR. The interview has just been published. It's in Russian, but Google Chrome and Google Translate do a good job of translating it into English. The main takeaway, I guess, is similar to what I said at Tapestry: I don't mind if an infographic or visualization is designed for a news publication, to sell a product, or to promote a company. It needs to be truthful, accurate, and precise, regardless.

(Side notes: I love the Cyrillic alphabet. It's beautiful. My name is written "Альберто Каиро" in it. Also, when the photograph was taken, more than two years ago I think, I was wearing a UNC-Chapel Hill hoodie. Sorry about that, UM colleagues and students.*)

*Go Tar Heels!

Monday, March 31, 2014

Infographics students listen sometimes

Every semester I tell my University of Miami infographics students that we don't think just with our brains. We think with our brains, our hands, and the tools that our brains and hands manipulate. Therefore, I encourage them to sketch their ideas out before they even switch the computer on, particularly if their project involves any sort of visual explanation or narrative.*

Of course, some of them don't listen to me. They are students, after all! What does the old fart in front of the projector know? They rush to the software and they end up running into structural and conceptual problems that could have been solved in the early planning stages. When you think, plan, and draw stuff beforehand, ideas can be easily discarded; you don't feel committed to them, as you would if you had spent hours writing code or tracing lines in Illustrator.

But sometimes students do listen. Qin Chen (Twitter) is a former student of mine at UM who was hired by the San Jose Mercury News last year, right after graduation. In her Tumblr, she says that she's been playing with After Effects, a tool that I didn't teach in 2013, but that I plan to include in a new 3D design/infographics class this coming Fall. Qin doesn't illustrate the post with the experimental animation she made, which you can see below, but with the nice hand-drawn sketch that opens this post. I feel proud!

* If the project is a data-driven interactive graphic, I'm fine with sketches done with tools like Google Charts, Datawrapper, or Tableau, though.

Sunday, March 30, 2014

Weekly reading list: Journalism, design, and the dream of a data-savvy citizenry

The first section of my next book will deal with data and science. I'm not trying to do another intro to stats textbook. I'm not qualified for that, and there are plenty of very good ones already. Rather, it's something written in a conversational style that may be useful not only to journalists or infographics designers, but to anyone interested in rational thinking.

I've been reading quite a lot about this in the year and a half since The Functional Art was published. I've also been collecting articles and blog posts to quote from. Last week, during the Malofiej conference, several interesting ones were published, and I decided that, instead of just saving them to Evernote, I'd share them here. This may become a weekly or bi-weekly feature. We'll see.

Thursday, March 27, 2014

First look at the new Malofiej infographics book

As I wrote a couple of days ago, I'm attending the Malofiej infographics summit this week. Every year, the Malofiej organization publishes a book (scroll down here) showcasing the winners of the previous edition's awards. The newest one, the 21st, has just been released. I got my copy a few minutes ago, and it looks fabulous. I'm live tweeting the event, so I don't have time to write a lot about it now. I've just shot some quick photos. The graphic on the cover was designed by Giorgia Lupi, who also wrote one of the chapters.




Tuesday, March 25, 2014

The danger of not adjusting for inflation in infographics

I'm teaching at the Malofiej news graphics workshop and summit this week. It's been fun so far. We have enjoyed a talk by John Grimwade already (see photos at the end of this post,) and I discussed common pitfalls in data graphics. The latest case I included in my slide deck came from Marca, a Spanish sports daily. It was published in August 2013.

The infographic compares the most expensive soccer players in history, up to that date. The headline means "The Hundred Million Euro man." It refers to Gareth Bale, now playing for Real Madrid. The second —from the bottom— is Cristiano Ronaldo, who signed a 96 million euro contract with the team in 2009. Impressive ranking, right? It is. But it's even more impressive that Marca got everything wrong, as these numbers are not adjusted for inflation.

You read that right. How's that even possible in 21st century journalism? Haven't we learned Numbers 101 yet? Not adjusting for inflation is like comparing the number of homicides in New York City (population: 8 million) with the homicides in Tucson, Arizona (population: 524,000.) It wouldn't be fair to do so. It'd be better to use homicide rates (cases per 100,000 people.) Absolute numbers are tricky. In most cases, it's advisable to normalize them somehow before jumping to conclusions.
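Both kinds of normalization fit in a few lines of Python. This is only a sketch under stated assumptions: the two populations come from the paragraph above, but the homicide counts, the transfer fees, and the price-index values are invented for illustration, and the function names are hypothetical.

```python
def rate_per_100k(cases, population):
    """Convert an absolute count into a rate per 100,000 people."""
    return cases * 100_000 / population


def adjust_for_inflation(nominal_amount, index_then, index_now):
    """Express a past amount in today's money using a price index (e.g. a CPI)."""
    return nominal_amount * index_now / index_then


# Populations are from the text; the homicide counts are MADE UP.
cities = {
    "New York City": {"homicides": 400, "population": 8_000_000},
    "Tucson": {"homicides": 50, "population": 524_000},
}
rates = {
    name: rate_per_100k(c["homicides"], c["population"])
    for name, c in cities.items()
}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} homicides per 100,000 people")

# MADE-UP transfer fees (millions of euros) and price-index values.
old_fee_nominal = 60.0   # paid when the index stood at 80
new_fee_nominal = 70.0   # paid when the index stood at 100
old_fee_real = adjust_for_inflation(old_fee_nominal, index_then=80.0, index_now=100.0)
print(f"Old fee in today's money: {old_fee_real:.1f} million")
print(f"New fee: {new_fee_nominal:.1f} million")
```

With these invented numbers, the city with far fewer homicides ends up with the higher rate, and the older, nominally smaller fee turns out to be the larger one in real terms. Absolute numbers alone would have suggested the opposite in both cases.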

The Marca folks cannot say they weren't warned against this blunder. Idoia Portilla, a professor at the University of Navarra, told me that a journalism student of hers, who was an intern at Marca at the time, tried to convince some people in the newsroom that their story was not really there. He calculated the real values and came up with the bar graph on the left. The reaction of the seasoned professionals he talked to was appalling: They argued that Marca's readers don't understand "complex" terms such as "adjusted for inflation." I believe that this is a bad excuse, at least in part.* The reason why they didn't listen to the intern is that they wouldn't have had a story if they had told the truth.

(*I wrote "in part" because many journalists despise their audiences, thinking that readers are less intelligent and educated than they are. The opposite is usually true, of course.)

According to Malaprensa, an excellent Spanish media criticism blog, the case I described in my talk was not the last time Marca presented data without normalizing them. The graphic below is even worse, as the proportions are all wrong. Just compare 57 with 111,7.


Saturday, March 22, 2014

There's no courage in holding strong convictions

I thought about writing this as an update to my previous post, but then I realized that it deserved its own space. Remember that I finished that piece mentioning strongly held convictions.

Well, I'm sitting in my father's apartment kitchen right now, reading Rebecca Newberger Goldstein's delightful Plato at the Googleplex: Why Philosophy Won't Go Away, and I have just come across a particularly gratifying passage. In one of the many dialogues in the book, Plato chats with a Bill O'Reilly-like fellow called Roy McCoy. When Plato acknowledges that he has modified some of the thoughts expressed in his classic works after evaluating the evidence against them, McCoy snaps:

"You sure you're a philosopher? You seem to be a little too ready to change your mind. Or maybe you just don't have the courage of your convictions."

To which Plato replies:

"I would prefer the courage of my questions."

Enough said.