Wednesday, July 2, 2014

The challenges of classification in choropleth maps

Building classes for choropleth maps is always a tricky business. By grouping values into intervals, you always run the risk of hiding important nuances in the data. There are reliable guidelines you can follow, but the process always requires a good dose of common sense. This excellent article by John Nelson (h/t Rob Simmon and Jorge Camões) explains the challenge really well.

The map below, published today by The New York Times (see it online), is a good example. Notice that the last class corresponds to values above 30%. The problem is that this class includes values as high as 89%, or even higher; I didn't check! Perhaps it makes sense to create a fifth class for the counties in which Evangelicals and Mormons are a majority of the population (more than 50%)? Besides, I'm not sure that using equal intervals is the best choice here. But it may be just me. I haven't seen their dataset, after all.
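To make the point concrete, here is a minimal Python sketch of how the choice of breaks changes what a reader sees. The twelve percentages are made up for illustration (they are not the Times data, which I haven't seen), and the helper names `equal_interval_breaks`, `quantile_breaks`, and `class_counts` are hypothetical. It counts how many values land in each class under fixed breaks with an open-ended top class, under the same breaks plus a "majority" class, under equal-width intervals, and under quantiles.

```python
from bisect import bisect_right
from statistics import quantiles


def equal_interval_breaks(values, k):
    """Inner break points that split [min, max] into k equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]


def quantile_breaks(values, k):
    """Inner break points that put roughly the same number of values in each class."""
    return quantiles(values, n=k, method="inclusive")


def class_counts(values, breaks):
    """Count the values in each class; a value equal to a break goes to the upper class."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[bisect_right(breaks, v)] += 1
    return counts


# Made-up percentages for 12 imaginary counties (an assumption, not real data).
shares = [2, 3, 4, 5, 6, 8, 9, 12, 15, 31, 52, 89]

# Fixed breaks with an open-ended top class ("above 30%"): 31, 52 and 89 look identical.
print(class_counts(shares, [10, 20, 30]))                    # [7, 2, 0, 3]

# Same breaks plus a fifth class for majorities (50% and up).
print(class_counts(shares, [10, 20, 30, 50]))                # [7, 2, 0, 1, 2]

# Four equal-width classes over the full range: most counties pile into the first one.
print(class_counts(shares, equal_interval_breaks(shares, 4)))  # [9, 1, 1, 1]

# Four quantile classes: balanced counts, but the top class still spans 31 to 89.
print(class_counts(shares, quantile_breaks(shares, 4)))        # [3, 3, 3, 3]
```

None of these schemes is "right" on its own. With skewed data like this, each one hides a different part of the distribution, which is why looking at the values before picking the breaks matters.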


1 comment:

  1. Michael Neutze has tried to send a comment, with no success, so I am posting it myself:

    Alberto,

    always an interesting topic how we colour thematic maps. I would have
    liked to comment on your site directly at

    http://www.thefunctionalart.com/2014/07/classes-in-choropleth-maps.html#comment-form

    but no matter what credentials I use, the comment gets eaten, probably
    because of too many links.


    Here's what I wanted to add to the conversation:

    A longtime solution to classification in choropleth maps was a
    histogram of the value distribution next to a choropleth, together
    with user-selectable classification, see e.g.

    http://vis.uell.net/gsvg/electionAtlasGermany.html

    However that sometimes distracted lay people (What does the diagram
    mean, which classification is right?).

    Mike Bostock showcased a map key that borrows from the boxplot (you
    can judge the extent of the open ended lowest and highest class) and
    has a very high data-ink ratio (works with minimal labeling):

    http://bl.ocks.org/mbostock/5144735

    As for the classification, Jenks Natural Breaks (minimise variance
    within groups, maximise between them) has proven very compelling. Here
    is a nice JavaScript implementation

    http://www.macwright.org/2013/02/18/literate-jenks.html

    I used a combination of the two to map census data for Germany, see

    https://www.destatis.de/zensuskarte/index.html

    That choropleth can also be morphed into a bar chart ("Top 10" and Bar
    Chart Button in the upper right) and should tackle most challenges of
    classification in choropleth maps.

    Best,

    Michael Neutze
