Thursday, May 7, 2020

The problem with inconsistent and unlabeled scales

One of the strategies to come up with novel ways to display data is to combine existing graphic forms. This morning The New York Times published a story titled 'Most States That Are Reopening Fail to Meet White House Guidelines' that contains a series of square equal area cartograms that are, at the same time, trellis charts. The piece is really nice.

There's something about it that worries me a bit, though: the charts don't have scales. Removing scales from graphics seems to be getting more popular lately among data journalists, and it works in some cases here: some of these charts have horizontal reference lines (see animation on the right) that help you get a sense of proportion and variation.

But the following set of line charts lacks any reference and, moreover, it seems that each one is based on a different scale: New York has more daily confirmed cases than Florida—thousands versus hundreds—but the last point on Florida's line is higher than New York's. New Hampshire has a 7-day average of around 100 cases; Maine has a bit more than 20. I understand that the goal of these graphics is to reveal upward and downward trends, not the case count itself, but I fear that this design choice may mislead some readers:

Here are those charts with scales:

What could be an alternative here, I wonder? It's tricky. There might not be an ideal solution, as often happens in visualization; adding detailed labels would clutter these tiny charts. Perhaps the charts could show not daily new confirmed cases but some sort of index (percentage change based on a common starting point for all states), or the variation in comparison to the previous day or week?
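To make those two ideas concrete, here is a minimal Python sketch of how such an index and a week-over-week change could be computed. This is only an illustration, not what the Times did: the function names `index_to_baseline` and `week_over_week_change` are hypothetical, and the numbers are made up for the example rather than taken from any real state.

```python
def index_to_baseline(series, base_position=0, base_value=100.0):
    """Rescale a series so that the value at base_position equals base_value."""
    base = series[base_position]
    if base == 0:
        raise ValueError("baseline value is zero; pick a later starting point")
    return [base_value * value / base for value in series]


def week_over_week_change(series, lag=7):
    """Percentage change of each value versus the value `lag` days earlier."""
    changes = []
    for today, earlier in zip(series[lag:], series):
        # A zero denominator has no meaningful percentage change.
        changes.append(None if earlier == 0 else 100.0 * (today - earlier) / earlier)
    return changes


# Made-up 7-day averages for two hypothetical states (NOT real data).
made_up = {
    "Large state": [4000, 3800, 3600, 3400, 3200, 3000, 2800, 2600],
    "Small state": [20, 21, 22, 23, 24, 25, 26, 27],
}

for name, values in made_up.items():
    indexed = index_to_baseline(values)
    print(name, [round(v) for v in indexed], week_over_week_change(values))
```

With either transformation, all the small charts could share one vertical scale: a line ending above 100 (or above 0% for the weekly change) would mean growth in every state, no matter whether the raw counts are in the thousands or in the tens. The trade-off is that very small counts produce jumpy percentages, so this might work better on the 7-day averages than on raw daily figures.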