Friday, June 7, 2019

The ambiguity of dot density maps

Steve MacLaughlin sent me the following dot density map of lightning fatalities per state; it appears in the Wikipedia page about lightning strikes.

I've always had mixed feelings about dot density maps because I find them ambiguous, and my guess is that they confuse many readers. Dots are often used in graphs, charts, and maps to accurately locate individual observations and phenomena, but that's not the case here. If you read a dot density map that way, it'll look like there were fatalities everywhere in Florida, and that lightning strikes become much less deadly as soon as you cross the border with Georgia or Alabama.

In a dot density map, though, each dot represents one observation, but the dots aren't placed where those observations were made. Instead, they are distributed within each region to maximize coverage. If the placement algorithm is well designed and manually tweaked, it will avoid absurd placements, such as dots over lakes, rivers, or unpopulated regions.
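To make that concrete, here is a minimal sketch of how dots might end up inside a state's outline. It is not the method used for the Wikipedia map. It assumes plain uniform random placement with rejection sampling, which is simpler than a coverage-maximizing algorithm, and the function names and the toy polygon are hypothetical.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) falls inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def place_dots(polygon, count, seed=0):
    """Scatter `count` dots uniformly at random inside `polygon`.

    Assumption: random placement, not a coverage-maximizing one. The dots
    say how many observations a region has, not where they happened.
    """
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    dots = []
    while len(dots) < count:
        # Draw a candidate from the bounding box, keep it only if it is inside.
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            dots.append((x, y))
    return dots

# Hypothetical L-shaped "state" with 12 observations.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
dots = place_dots(l_shape, 12)
```

Nothing in this code knows where any observation took place. It only receives a count and an outline, which is exactly why the positions of the dots shouldn't be read as locations.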

What are the alternatives? I wouldn't recommend a choropleth map, as it's appropriate only when data is standardized (rates per 100,000 people, for instance), not when we visualize raw counts. Maybe a proportional symbol map would be the right solution. Or, if revealing geographic patterns isn't the goal of the visualization, a simple bar graph sorted from highest to lowest values would do.
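The difference between raw counts and standardized rates is easy to sketch. The snippet below is only an illustration: the two states and all their numbers are made up, and the helper names are hypothetical. It computes a rate per 100,000 people and sorts values from highest to lowest, the order a bar graph would use.

```python
def rate_per_100k(count, population):
    """Standardize a raw count as a rate per 100,000 people."""
    return count / population * 100_000

def rank_high_to_low(values):
    """Return (name, value) pairs sorted from highest to lowest value."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)

# Made-up figures for illustration only; not real lightning data.
counts = {"State A": 50, "State B": 10}
populations = {"State A": 20_000_000, "State B": 1_000_000}

rates = {
    state: rate_per_100k(counts[state], populations[state])
    for state in counts
}

by_count = rank_high_to_low(counts)  # order for a bar graph of raw counts
by_rate = rank_high_to_low(rates)    # order once population is accounted for
```

With these invented numbers, State A leads on raw counts but State B leads on rate, so the two encodings can tell different stories about the same data.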

(Another question about the map above is why states like Nevada, Idaho, or Nebraska are empty. That may be due to errors and inconsistencies in data gathering and access, or simply to the fact that lightning deaths are rare, which makes a decade a relatively short time period. This older map of deaths between 1959 and 2014 shows just 18 fatalities in Nevada.)

Update: here are two other maps, one of them adjusted by state population.