Saturday, March 28, 2020

Rates of change are tricky

Let me begin by saying that (a) we should all appreciate the effort that so many journalists are making to keep the public informed about the coronavirus pandemic; shifts of 10, 12, 14 hours and more are common (subscribe to your favorite news publications, people, be responsible!), and (b) commenting on graphics is easier than making those graphics. As a designer myself, I know how hard it is to navigate the many challenges and trade-offs visualization poses.

This said, I often ponder how we can make visualizations more approachable and understandable. Take the following graph from this New York Times story:

The vertical position of the points on the line represents the percentage change of confirmed cases over the previous 7 days. There are other ways to show change (think of bars with arrowheads pointing up), but they are clunkier. This graph, if you know how to read it, works fine: the goal is to bring those lines down to the +0% baseline, or close. This point is explained in the body of the story.

However, imagine the following realistic scenario: someone takes a screenshot of this graph and publishes it on social media, adding some personal comments or wild inferences (1, 2). I wonder whether graphs like this, when isolated from what surrounded them originally, might make some readers reach dubious conclusions or feel too optimistic and confident (“most lines are going down! You're all overreacting! Time to stop worrying and go back to work!”).

Those readers would be missing a crucial point: a 33% increase (line is low) is, in general, better than an 80% one (line is high), but we need to know more. Prior conditions matter. At the beginning of the chart, the curves are pretty high, probably because those are the early stages of each outbreak, when few cases had been detected. If a city begins with 10 confirmed cases, and later detects 8 more, for a total of 18, it has an 80% increase.

But if we already have many confirmed cases, for instance 1,000, and later we end up with 1,330, we've experienced an increase of +33%. It is better to have +33% than +80%, as it might mean* we're stretching the time it takes cumulative confirmed cases to double or triple (we're flattening the curve, as we say these days), but readers shouldn't ignore other facts. Even a “tiny” 10% increase, if experienced when you're already dealing with tens of thousands of infections, may be catastrophic: hospitals could be even more overwhelmed, leading to more deaths. Think of the situation in Lombardy.
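For readers who like to check the arithmetic, here is a minimal Python sketch of the point above: the same kind of percentage can hide very different absolute numbers. The function names and the 50,000-case baseline are my own illustrative assumptions, not figures from the Times story, and the doubling-time estimate assumes the same 7-day percentage keeps repeating, which real outbreaks rarely do.

```python
import math

def new_cases(baseline, weekly_percent):
    """Absolute number of additional cases implied by a percentage increase."""
    return baseline * weekly_percent / 100

def doubling_time_days(weekly_percent, period_days=7):
    """Days needed to double IF the same percentage growth repeated every period.

    This is a simplifying assumption (constant exponential growth), used
    only to translate a percentage into something more tangible.
    """
    growth_factor = 1 + weekly_percent / 100
    return period_days * math.log(2) / math.log(growth_factor)

# (label, confirmed cases at the start of the week, percent change over 7 days)
# The last row is a hypothetical "tens of thousands" scenario.
scenarios = [
    ("early outbreak", 10, 80),
    ("larger outbreak", 1_000, 33),
    ("very large outbreak (hypothetical)", 50_000, 10),
]

for label, baseline, pct in scenarios:
    added = new_cases(baseline, pct)
    days = doubling_time_days(pct)
    print(f"{label}: +{pct}% means {added:,.0f} new cases; "
          f"doubling in roughly {days:.0f} days at that pace")
```

The line with the lowest percentage is also the one that adds the most people in a single week, which is exactly what a chart of rates alone cannot show.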

The NYT story contains another graphic comparing rates of change with confirmed cases per thousand people but, as the Times journalists themselves acknowledge, it's “hard to read”:

What to do? It's tricky. Maybe show more and explain more, as I've suggested before? The New York Times is doing a good job. The body of the story thoroughly explains the pros and cons of these graphics, what they show and what they don't show.

What I fear, though, is that it's too easy to read charts like these while ignoring their footnotes, or to detach the charts from their context. I wonder whether we should produce animated explanations or have presenters explain our visualizations more often, so readers won't be able to separate visuals from their context and annotations. Mediators play an important role.

(* I wrote “might” because confirmed cases aren't total cases. In the U.S. at least, these charts might be showing, at least in part, the increasing availability of testing. Also, in this pet example I'm not considering other factors, such as the number of recoveries.)