Friday, May 27, 2016

Visualizing chess

Ootro Estudio is a firm based in Alicante, Spain, that offers graphic and furniture design and 3D services. Their latest project is an alluring piece of data art titled Arbor Ludi, which portrays the game trees of eight top chess players from the last century. Here's a description:
The selected players are José Raúl Capablanca, Mikhail Tal, Tigran Petrosian, Bobby Fischer, Anatoly Karpov, Garry Kasparov, Viswanathan Anand and the current champion, Magnus Carlsen. We chose players with different styles, so their game trees would display the contrast between them.
To generate the game tree of each champion we used a database of more than 10,000 games. The number of games changes significantly for each player: Capablanca was the least active (596 games) and Karpov was the most active (3,374 games). This difference in the number of games affects the final result of the representations.
In order to transform the data into a tree-shaped object we used an algorithm programmed with parametric design tools. This algorithm deciphers the topological diagram of all games of each player while it builds a three-dimensional tree whose growth reproduces that diagram.
Following this topological diagram, the tree starts with a trunk that represents the total number of games of each player. From it, the principal branches of the tree emerge, each corresponding to a first move across all the games, with the thickness of each branch proportional to the number of times that move has been played. Each node is labeled with the move corresponding to the previous branch. The criterion recurs with the next moves, so that new branches emerge, the result of the different paths that the players have taken in all their games. Each step adds a level of complexity to the topological diagram.
A limitation of the project is that branches that represent identical moves are not positioned identically on the different trees. Therefore, comparisons between players are hard, if not impossible, even if you zoom in to read the labels. Regardless, this is quite an impressive effort, and it'd certainly look beautiful if framed and hung on your office walls!
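For readers curious about the data structure behind a piece like this, here is a minimal sketch in Python of the tree described in the quote: a trunk that counts all games, and branches whose counts drive their thickness. It is not Ootro Estudio's algorithm (they used parametric design tools); the function names and the three sample games are hypothetical.

```python
def build_move_tree(games):
    """Build a nested tree of moves; each node counts the games passing through it."""
    root = {"count": 0, "children": {}}
    for moves in games:
        root["count"] += 1  # the trunk: total number of games
        node = root
        for move in moves:
            child = node["children"].setdefault(move, {"count": 0, "children": {}})
            child["count"] += 1  # how many games followed this branch
            node = child
    return root


def branch_widths(node, parent_width=1.0):
    """Return {move: width} for the branches leaving a node, proportional to counts."""
    total = node["count"]
    if total == 0:
        return {}
    return {
        move: parent_width * child["count"] / total
        for move, child in node["children"].items()
    }


# Hypothetical sample: three short games, each a list of moves
games = [["e4", "e5", "Nf3"], ["e4", "c5"], ["d4", "d5"]]
tree = build_move_tree(games)
widths = branch_widths(tree)
```

In this sample, the "e4" branch would be drawn twice as thick as the "d4" branch, since two of the three games start with it.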

Thursday, May 26, 2016

That time when I made readers click 50 times in an infographic

Robert Kosara has written a nice rant (don't miss the comments) against scrollytelling in visualization. He's an advocate for steppers, those graphics that divide the information into sequential screens which can be navigated through numbered or labeled buttons.

During a conversation on Twitter, Knight Foundation's Shazna Nessa said that she likes scrolling when using a mobile device and step-by-step graphics when on a computer. I agree with that, but we're moving toward a mobile-first world (if we're not there already), aren't we? And scrollable visualizations can be done really well; here's some advice from Mike Bostock.

Anyway, this debate reminded me of this monster of an infographic (Flash warning! This won't work on an Apple tablet or phone!) that I designed in 2003. I'm still fond of the 3D models and the vector animations, but I certainly don't think that it's a good idea to make people click more than 50 times to get to the end of the presentation!

We all have a past, I guess. Here you have some screenshots of that awful thing:

Wednesday, May 25, 2016

Simulating the lives of hunter-gatherers with animation and visualization (and good humor)

Simulpast is a large interdisciplinary project organized by the Barcelona Supercomputing Center (BSC) intended to model past human behavior. Its latest effort is Simulados, a simulation of the lives of prehistoric hunter-gatherers in the region of modern Gujarat, India. From the technical description:
This region of India has a strong seasonality and one of the most unpredictable climates in the world. The main goal is to build an agent based model (ABM), through which we can study the management of resources and the decision making process of hunter-gatherer groups that inhabited the region between 10000 BC and 2000 BC. We are interested in analyzing their capacity for resilience to the extreme variability of the environment, as well as their interaction with agro-pastoral groups.
The project is based on a tool called Pandora:
Pandora is an agent based modeling tool designed to run complex simulation models in a high performance environment. The agents represent individuals or groups of people (a family in the case of Simulados) with a complex artificial intelligence algorithm that gives them power to make their own choices and act upon them, thereby interacting between them and with the environment in a totally autonomous way. Pandora is able to simulate millions of agents in large and detailed terrains.
If this sounds too geeky, watch this well-paced and fun animation combining 3D characters, charts, and maps that the BSC put together. It explains the science behind the project quite nicely. This video may inspire those struggling to explain complex ideas to the general public.

H/T Fernando Cucchietti

Tuesday, May 24, 2016

Stacked bar graphs and small multiples

Stacked bar graphs are tricky, particularly when you design more than one and you arrange them in a sequence: Only the bottom and top segments are comparable across bars, as they sit on common baselines. However, there are cases when this graphic form is appropriate. See this elegant small multiple array just published by The New York Times.

What matters in this graphic is not to compare all parties, but to emphasize the hard-right ones, and then to compare them to all other parties as a whole. Therefore, I think that the decision of coloring all center-right and center-left parties identically makes sense: It's red versus white and gray.
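The baseline problem is easy to see with numbers. Below is a small Python sketch, using made-up percentages for two bars, that computes where each stacked segment starts and ends. The function name and the data are hypothetical, and the top segments only share an edge because both bars add up to 100.

```python
def segment_spans(values):
    """Return (start, end) for each stacked segment, from bottom to top."""
    spans = []
    start = 0
    for value in values:
        spans.append((start, start + value))
        start += value
    return spans


# Hypothetical shares (percent) for two bars, same segment order bottom to top
bar_a = [20, 50, 30]
bar_b = [35, 40, 25]
spans_a = segment_spans(bar_a)
spans_b = segment_spans(bar_b)
```

The bottom segments both start at 0 and the top segments both end at 100, so the eye can compare their lengths directly. The middle segments start at 20 and 35, which is why they are so hard to judge against each other.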

Sunday, May 22, 2016

Hiram Henriquez's "The Importance of Explanatory Infographics in Journalism"

This year I'm busy with my PhD dissertation (see its title here,) so expect some posts about sources that I'm planning to quote. The first one is Hiram Henríquez's “The Importance of Explanatory Infographics in Journalism” (PDF.) Hiram is a colleague of mine at the School of Communication of the University of Miami, where he landed after a long career in the news.

The aforementioned document is the thesis Hiram wrote for his MFA at Savannah College of Art and Design, and it's worth your time. It describes the demise of the traditional news graphics department, and the rise of web visualization and news animation. Its tone is somber, as Hiram believes that the disappearance of the large print infographics that newspapers embraced in the 80s and the 90s is a negative phenomenon.

As you'll see when I make my own dissertation public, likely by March 2017, my view of the changes news infographics have experienced in the past decade is more optimistic, no matter how much I love big graphics myself!

Friday, May 20, 2016

Playful visualization

A visualization that shows data and reveals some patterns or trends can be good; a visualization that lets you interact with the data may be better; but a visualization that transforms the interaction into a playful, game-like experience usually has a greater potential of engaging readers.

Just consider this project by FiveThirtyEight. Beginning with the title, which changes every few seconds, the whole package is delightful. It begins with a game in which you can manipulate the turnout and the vote of five groups, and it continues with a series of scatter plots and a large table which describe how those groups behaved in previous elections.

A project like this may appeal mainly to wonks, but I have a hunch that this is exactly the audience its designers had in mind. Oh, and it works well on mobile.

Wednesday, May 18, 2016

The Financial Times launches The Chart Doctor, a column about visualization and infographics

The Financial Times, known for its elegant charts and infographics, has just launched a new section called “The Chart Doctor”. Its first article discusses the widespread and wrongheaded idea that any visualization should be understood in just five seconds —if possible, yes, sure; but it's rarely possible.

A few days ago I wrote against that idea: Simplicity is a virtue in visualization, but complexity isn't a vice when a complex graphic is necessary to tell a complex story with adequate depth. As the old saying goes, everything should be made as simple as possible, but not simpler.

As I wrote in The Truthful Art, quoting Nigel Holmes, the goal of information graphics shouldn't be simplification, but clarification, which is similar to what John Maeda's classic book calls simplicity. Clarification often involves increasing the amount of information shown, not reducing it mindlessly. And when an unusual graphic form may be much more enlightening than a traditional one, we ought to give it a try, not just take refuge in the self-defeating and lazy “our reader won't understand this.”

h/t Alan Smith

Sunday, May 15, 2016

Magnificent print visualizations

Yeah, I know that mobile-first is the way to go nowadays (I'm planning to make my classes mobile-first starting next semester, after all,) but I believe that, by avoiding print publications, too many people are unaware of how impressive large static graphics can be.

Take the one below. I just saw it in The New York Times during breakfast this morning. It's splendid, isn't it? Its online counterpart accomplishes the same goals: showing change in time, how far records stand in comparison to the other 50 fastest times each year, how long each world record remained unsurpassed, etc. But the digital version, if seen on a screen smaller than 21'', pales in comparison to the print one in one crucial aspect: Magnificence.

(This post is just a platitude from an old man yelling at clouds, of course. See: 1, 2)

Friday, May 13, 2016

We have a hymn: Graphin' the lines, and scatter the plots, pivot tables hallelujah!

Some fun for the weekend: this amazing parodic song, which should soon become the hymn of data journalism, data visualization, and news infographics. It's catchy as hell:

h/t Sisi Wei and Lynn Cherny

"Our reader" won't understand something as complicated as that!

The other day I received a gift from Luís Melgar, one of our infographics/data visualization master's students at the University of Miami (here's our other program). Luís has just graduated, and you should expect a post about his final project soon, as it's an amazing data-driven story. He gave me a printout of this 1870 treemap.

I came back home in the evening and my 10-year-old boy got curious about the treemap. He tried to decipher it on his own for a minute, and he couldn't. Then I explained to him that the segments of each square are proportional to percentages, and pointed out a couple of states to illustrate this principle. This took me around thirty seconds.

After that, we spent nearly half an hour together finding the most interesting cases, the outliers, and figuring out which states didn't exist at the time. My kid said: “Numbers are boring, and designing graphics like this must be boring too, but reading it is a lot of fun.” I disagree with him on the first two statements, but agree on the third.

Now, my point: How many times have we all faced the common complaint “our reader won't understand something as complicated as that”? This is a fallacy I spent quite a lot of pages debunking in The Truthful Art*, and the conversation with my kid should be a good argument for the future: If you believe that a substantial portion of your readers (your “average reader” is a mythical creature) won't understand an unusual graphic form, don't dismiss it outright. That's a self-defeating strategy. It'll lead you to stick just to bar charts, time-series charts, and univariate data maps, when other kinds of graphics may be more illuminating —histograms, strip plots, or slope charts, anyone?

Visualization has a grammar and a vocabulary, which can be taught and learned. Scatter plots were unusual in the media just a decade ago, but today it seems that an increasing number of non-specialists understand them. Why? Perhaps thanks to some pioneers in the news, who decided to start using them years ago, adding captions explaining how to read them. These designers and journalists assumed that most readers might indeed not be able to read those charts at first, but that they are not stupid, and can learn. That's the key: Respect people's intelligence.

Finally, remind your boss that even the all-too-common time-series line chart was “unusual” and “difficult to understand” in the past, but that didn't stop the journalists this post by Scott Klein talks about. They just added a long caption to the graphic. That's the equivalent of my thirty seconds of explanation.

(*There's another more damaging fallacy that I also addressed in the book: “A reader should be able to understand a graphic in five seconds.” Well, no. It depends. Some graphics need some effort, the same way that it takes time to extract meaningful information from a written story, beyond its headline.)

Thursday, May 12, 2016

Visualization's expanding vocabulary

This morning I received a long e-mail from Nick Cox, a lecturer at the University of Durham. He sent me many detailed and thoughtful comments on The Truthful Art. My reply was that if we had met before, I'd have asked him to be a technical reviewer, along with the ones I mention in the Acknowledgments section (Stephen Few, Heather Krause, Diego Kuonen, Jerzy Wieczorek, etc.) Nick has also kindly written a positive (but critical) review in

Anyway, the exchange we had over e-mail led me to check Nick's articles in the Stata Journal. He has written extensively about visualization in statistics, and plenty of his papers can be accessed for free*. Here's an example; it describes mosaic plots and spineplots, graphic forms that I don't cover in my books. It reminded me of how large the vocabulary of visualization already is, and how fast it can expand in the future if designers of all backgrounds keep imagining new ways of displaying information.

* Those that were published three years ago or more. See the complete list.

Sunday, May 8, 2016

Beeswarm plots to show frequency distributions

The Truthful Art has a couple of chapters discussing the virtues and shortcomings of different frequency charts, such as the histogram, the box plot, the violin plot, and even the strip plot. Gerardo Furtado has just reminded me of another one, the beeswarm plot*, which is similar to the strip plot, but avoids dot overlap. See this pretty example, and play with it. Aren't d3.js's animated transitions amazing?

(*Yes, there's an R package to do beeswarm plots, too.)

UPDATE: Andres Snitcofsky has asked if circles of the same color should be clustered together. The answer is yes, if it's possible.
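If you wonder how a beeswarm gets rid of the overlap that plagues strip plots, here is a minimal Python sketch of one simple approach: place the dots one by one along the axis and push each one up or down just enough to clear its neighbors. This is not how the d3.js example or the R package does it (I'm not describing their internals); the function name, the radius, and the sample values are hypothetical.

```python
import math


def beeswarm(xs, radius=1.0):
    """Place equal circles at the given x values, offsetting y to avoid overlap.

    Returns (x, y) centers in ascending order of x, not in input order.
    """
    placed = []
    diameter = 2 * radius
    for x in sorted(xs):
        # Only circles closer than one diameter along x can collide with this one
        neighbors = [(px, py) for px, py in placed if abs(px - x) < diameter]
        # Candidate offsets: the axis itself, or just touching each neighbor
        candidates = [0.0]
        for px, py in neighbors:
            dy = math.sqrt(diameter ** 2 - (px - x) ** 2)
            candidates += [py + dy, py - dy]
        candidates.sort(key=abs)  # prefer positions closest to the axis
        for y in candidates:
            if all(
                math.hypot(px - x, py - y) >= diameter - 1e-9
                for px, py in neighbors
            ):
                placed.append((x, y))
                break
    return placed


# Hypothetical sample: three identical values and one far away
points = beeswarm([0, 0, 0, 5], radius=1.0)
```

In a strip plot the three dots at 0 would sit on top of each other; here they stack above and below the axis, so the frequency at that value becomes visible.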

The Guardian puts flow charts on a map

Alluvial diagrams, Sankey diagrams, and similar graphic forms can be nice alternatives to mere stacked area charts, which represent just parts of a whole, but not ranking. The Guardian has just published an interesting experiment, in which a small multiple array of flow charts is shaped like the map of Scotland.

I feel like the charts on the main map could have been bigger, but I like the results anyway, as the graphic reveals the downward trend of the Labour Party, which is the focus of the story. I also appreciate the smaller graphics highlighting portions of the data.

(An aside: I keep learning R, and I have found instructions on how to do alluvial diagrams with the MASS package. How neat is that?)

Via Cath Levett.

Saturday, April 30, 2016

OneZoom: The tree of life in a massive interactive visualization

OneZoom is a recently launched large-scale node visualization of the tree of life based on these previous efforts. I saw it announced on Jerry Coyne's blog about evolution and I immediately began playing with it.

Even if the project is still incomplete, it already shows a vast amount of information. Just open it up and start zooming and zooming, or go back to the beginning and click on one of the species. Or follow this link to zoom into the tree and reach the modern lion. You can also search for any species you want.

The following passage comes from the official description of the project:

“The OneZoom software allows you to explore the tree of life in a completely new way: it's like a map, everything is on one page, all you have to do is zoom in and out. We hope you have fun exploring the OneZoom tree of life - we've certainly had fun developing it. Even after thousands of hours working on it, we are still frequently astonished at what it reveals about the world around us. The OneZoom software uses fractals to condense the entire tree of life into a single, zoomable page. OneZoom is so named because all the information is on a single page: all you have to do is zoom to reveal details.”

Thursday, April 28, 2016

The Washington Post does it again

Perhaps encouraged by their Pulitzer Prize, the visuals folks at The Washington Post keep producing some of the finest news visualizations out there. The latest is this series about the housing market, which includes an assortment of charts and interactive maps, combines them with text quite effectively, and looks good on mobile. I've just added it to the list of projects I show in classes every semester.

I obviously searched for my Zip code and witnessed the story change:

Stories about certain markets will be launched in the next seven or eight days. The first one, already online, is about the San Francisco Area.

Sunday, April 24, 2016

From scatter plot to slope chart

Gerardo Furtado, a Brazilian teacher who studies biology and evolutionary science, and writes about them, has designed a fun interactive graphic based on the cover of The Functional Art. Maybe this will help me overcome my notorious frustration with d3.js —but only after I finish learning proper data manipulation with dplyr; I am using the amazing DataCamp for that.

Here's an animated GIF of Gerardo's interactive graphic:

Thursday, April 21, 2016

Visualizing Shakespeare's sonnets

Gramener, a data visualization firm (follow them on Twitter), has recently launched a very simple but quite effective tool to visualize Shakespeare's sonnets. I'm a fan of visualizations that are not overdone; I'm wary of the kind that makes you feel that designers were trying too hard to impress their readers with their technical acumen. This is not one of those. It's straightforward, readable, and works decently on mobile. Here's how many times “love” appears in the poems:

And this is what happens if you explore the graphic a little bit further, by clicking on any of the sonnets:

Wednesday, April 20, 2016

Narration and exploration in visualization

What should we emphasize when designing a visualization? Should we explain the data, perhaps through a narration, or should we let readers explore the data at will? Those are questions you probably get regularly if you work in this field.

The answer is quite obvious: if your graphic is a digital and interactive one, why shouldn't you combine narration with exploration? To see some good examples of hybrid visualizations, I'd encourage you to take a look at Kiln, a company that designed graphics for the recent Panama Papers investigation. I like their Ship Map and their Digital Divide projects, which bring to mind Hans Rosling's style.

Tuesday, April 19, 2016

It's time for a Pulitzer Prize for infographics and data visualization

Winning piece by The Washington Post
The 2016 Pulitzer Prizes have been announced, and there is good news for those who care about visualization. The Tampa Bay Times won one for this series of stories and graphics I praised a while ago; The Washington Post got an award mainly for this interactive graphic. Other organizations that use data journalism on a regular basis, like The Marshall Project and ProPublica, were also recognized by the jury.

All this makes me very happy. The Post piece, for instance, won the National Reporting prize, which is great. However, I wonder: Isn't it time for the Pulitzer board to create a category for infographics, data visualization, news applications, etc.? Awards already exist for other journalistic forms like explanatory reporting, feature writing, commentary, photography, etc. Isn't it clear already that visualization is increasingly popular, effective, and that graphics can be standalone journalistic “stories”? Doesn't visualization deserve to be recognized as a distinct way of delivering news? I believe that it does. Moreover, as Scott Klein says, news graphics have more than two hundred years of history, so this prize category is long overdue.

Perhaps we should open a petition.

UPDATE: Here's IndyStar's Stephen Beard on Facebook:

Sunday, April 17, 2016

Visualization against statistical bullshit

I want to bring your attention to this excellent long article by Tim Harford (h/t Thomas Lumley.) Here's a summary:
Statistical bullshit is a special case of bullshit in general, and it appears to be on the rise. This is partly because social media — a natural vector for statements made purely for effect — are also on the rise. On Instagram and Twitter we like to share attention-grabbing graphics, surprising headlines and figures that resonate with how we already see the world. Unfortunately, very few claims are eye-catching, surprising or emotionally resonant because they are true and fair. Statistical bullshit spreads easily these days; all it takes is a click.
Harford suggests that visualization may be a mighty weapon to make statistics attractive and understandable to the public. He mentions Florence Nightingale's famous charts:
(...) There is a middle ground between the statistical bullshitter, who pays no attention to the truth, and William Farr, for whom the truth must be presented without adornment. That middle ground is embodied by the recipient of William Farr’s letter advising dryness. She was the first woman to be elected to the Royal Statistical Society: Florence Nightingale. (...)
The Rose Diagram isn’t a dry presentation of statistical truth. It tells a story. Its structure divides the death toll into two periods — before the sanitary improvements, and after. In doing so, it highlights a sharp break that is less than clear in the raw data. And the Rose Diagram also gently obscures other possible interpretations of the numbers — that, for example, the death toll dropped not because of improved hygiene but because winter was over. The Rose Diagram is a marketing pitch for an idea. The idea was true and vital, and Nightingale’s campaign was successful. One of her biographers, Hugh Small, argues that the Rose Diagram ushered in health improvements that raised life expectancy in the UK by 20 years and saved millions of lives. 
What makes Nightingale’s story so striking is that she was able to see that statistics could be tools and weapons at the same time. She educated herself using the data, before giving it the makeover it required to convince others. Though the Rose Diagram is a long way from “the dryest of all reading”, it is also a long way from bullshit. Florence Nightingale realised that the truth about public health was so vital that it could not simply be recited in a monotone. It needed to sing.
“She educated herself using the data, before giving it the makeover it required to convince others.” This is akin to a message I tried to convey in The Truthful Art: Before we can think of doing good with data, we ought to make sure that our data is as good as possible.