An Introduction to Slopegraphs - Part 1

Tim Brock / Wednesday, June 3, 2015

When discussing the origin of the slopegraph, the go-to reference seems to be Edward Tufte's "The Visual Display of Quantitative Information". On page 159 (of the second edition) he introduces a chart or "table-graphic" that "when read vertically, ranks 15 countries by government tax collections in 1970 and again in 1979, with names spaced in proportion to the percentages". In addition, each pair of names is connected by a single straight line. This is a slopegraph.

Rather than just redraw this graphic (you can find it here), the chart below uses the same principles but a different dataset: populations (in millions) of thirteen of the nineteen G-20 countries in 1960 and 2013, obtained from the World Bank website. I've also added a horizontal line to indicate the baseline of the scale used.

The elegance of the slopegraph comes from its simplicity - (usually) two columns of structured data and clear links between them, indicating changes and emphasizing discrepancies. In Tufte's original example, Britain stands out as the only listed nation whose receipts decreased. In the chart above, the lines show that the populations of all thirteen countries increased between 1960 and 2013, but that some (e.g. Mexico) increased by a much greater amount than others (e.g. Germany).
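The layout rule described so far (labels spaced in proportion to their values, measured from a common baseline, with one straight line per entity) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to draw the charts in this article: the function names, the 400-unit chart height and the entities "A" and "B" with their values are all hypothetical.

```python
def slopegraph_positions(values, height=400.0, baseline=0.0, top=None):
    """Map each value to a vertical offset between 0 (the baseline) and height."""
    if top is None:
        top = max(values.values())
    if top <= baseline:
        raise ValueError("top of scale must exceed the baseline")
    scale = height / (top - baseline)
    return {name: (value - baseline) * scale for name, value in values.items()}


def slopegraph_segments(left, right, height=400.0, baseline=0.0):
    """Return {name: (y_left, y_right)} using one shared scale for both columns."""
    # Both columns must share a scale, otherwise the slopes are not comparable.
    top = max(max(left.values()), max(right.values()))
    y_left = slopegraph_positions(left, height, baseline, top)
    y_right = slopegraph_positions(right, height, baseline, top)
    return {name: (y_left[name], y_right[name]) for name in left if name in right}


# Hypothetical values, not taken from the World Bank data.
left = {"A": 20.0, "B": 50.0}
right = {"A": 80.0, "B": 60.0}
segments = slopegraph_segments(left, right, height=400.0)

for name, (y0, y1) in segments.items():
    print(f"{name}: left at {y0:.0f}, right at {y1:.0f}")
```

Each pair of offsets is then drawn as one straight line between the two columns, with the entity's name placed at each end.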

I frequently use slopegraphs but they do have their limitations. Perhaps the most notable of these is that you will frequently find labels overlapping. In fact, that's why the chart above doesn't display the other six G-20 nations. Five of the six omitted countries (Brazil, China, India, Indonesia and the USA) had a population in 2013 of 200 million or more. Including all of these would have meant squashing all the other countries into the bottom ~10% of the figure. But this is a problem that most chart types will struggle to deal with well, not just slopegraphs. The sixth omitted country, South Africa, had a population of "just" 53 million in 2013. In 1960 it was only 17 million, and it is this that creates a problem: its label overlaps with that of Canada. I'll look at dealing with overlapping labels in Part 2, and concentrate on the visual (rather than verbal) display of data in the rest of this article.
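To see why two close values collide, it helps to compare the vertical gap between neighbouring labels with the height of the label text. The sketch below does that for a single column. It is a minimal illustration with assumed numbers: South Africa's 17 million comes from the text above, but the figure of 18 million for Canada is an approximation that is not given in this article, "Elsewhere" is a made-up entity, and the chart height (400), scale maximum (150 million) and label height (12) are arbitrary choices.

```python
def overlapping_labels(values, height=400.0, top=None, label_height=12.0):
    """Return pairs of neighbouring labels that sit closer than one label height."""
    top = max(values.values()) if top is None else top
    scale = height / top  # chart units per unit of data, baseline at zero
    ordered = sorted(values.items(), key=lambda item: item[1])
    clashes = []
    # Only adjacent labels are compared; that is enough to flag a crowded column.
    for (name_a, value_a), (name_b, value_b) in zip(ordered, ordered[1:]):
        if (value_b - value_a) * scale < label_height:
            clashes.append((name_a, name_b))
    return clashes


# Populations in millions for 1960. Only the South Africa figure is from the
# article; Canada is an assumed approximation and "Elsewhere" is hypothetical.
populations_1960 = {"South Africa": 17.0, "Canada": 18.0, "Elsewhere": 60.0}
clashes = overlapping_labels(populations_1960, height=400.0, top=150.0)
print(clashes)
```

With these assumed settings the two labels are less than 3 units apart on a 400-unit axis, far less than the height of the text.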

Alongside the slopegraph, Tufte is associated with a push for more data-rich graphics, exemplified by his invention of the sparkline. A slopegraph with just two data points for each country or other entity seems somewhat at odds with this philosophy. So what happens if we plot all the World Bank data on population size for the 13 countries?

When we plot a point for every year from 1960 to 2013 we get a completely different picture of the changes in population that have taken place. While the line for Mexico might not look massively different, for Japan and Russia things are far from linear. Japan's population growth rate started out high but shrank and is now slightly negative. And, roughly speaking, Russia's population grew, reached a maximum, started to decline, declined at a greater rate, levelled off and - in recent years - has started to rise once more! The simple slopegraph joining the two endpoints doesn't pick up any of that rich variability.
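The information lost by joining only the endpoints can be made concrete with a small sketch. Below, two series have the same net change, so a two-point slopegraph would draw them with identical slopes, yet one of them changes direction twice. Both series are hypothetical values chosen to mimic the shapes described above; they are not World Bank figures, and the function names are illustrative.

```python
def net_change(series):
    """The only thing a two-point slopegraph shows: last value minus first."""
    return series[-1] - series[0]


def turning_points(series):
    """Count how many times the series switches between rising and falling."""
    directions = []
    for previous, current in zip(series, series[1:]):
        if current > previous:
            directions.append(1)
        elif current < previous:
            directions.append(-1)
        # Flat steps are skipped so a plateau does not count as a turn.
    return sum(1 for a, b in zip(directions, directions[1:]) if a != b)


# Hypothetical series: steady growth versus rise, decline, plateau, recovery.
steady = [100, 105, 110, 115, 120]
wavy = [100, 120, 130, 125, 115, 115, 120]

print(net_change(steady), turning_points(steady))
print(net_change(wavy), turning_points(wavy))
```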

As is, the last chart has a problem of its own that wasn't present in the original. Specifically, crossings can make following some curves from one side to the other rather difficult. For example, if I start off at the UK on the left and try to follow the curve, I keep finding myself at Italy on the right-hand side. If I follow Italy from the left-hand side, though, I still end up at Italy. This is (I assume) a result of the Gestalt principle of continuity: "we are more likely to construct visual entities out of visual elements that are smooth and continuous, rather than ones that contain abrupt changes in direction" (Colin Ware, Information Visualization (Third Edition), page 183). In other words, in both cases, after the lines meet my brain chooses to follow the flatter line of Italy across rather than the UK's line, which diverges upwards. We can overcome this tendency by making the lines different colors, as below. While multiple lines share the same color, there is no specific attribute they have in common - this is purely a matter of perceptual convenience.
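One way such a color assignment could be automated is to detect which pairs of lines cross and then give each line the first color not already used by a line it crosses. The sketch below is just one possible approach and not necessarily how the chart here was colored; the series "P", "Q" and "R", their values and the two-color palette are hypothetical, and the crossing test assumes every series is sampled at the same x positions.

```python
def lines_cross(a, b):
    """True if series a and b swap vertical order somewhere along the x axis."""
    signs = [(x > y) - (x < y) for x, y in zip(a, b)]
    nonzero = [s for s in signs if s != 0]  # ignore points where the lines touch
    return any(s != t for s, t in zip(nonzero, nonzero[1:]))


def assign_colors(series, palette):
    """Greedily give each line the first color unused by any line it crosses."""
    colors = {}
    for name, values in series.items():
        taken = {colors[other] for other in colors
                 if lines_cross(values, series[other])}
        free = [color for color in palette if color not in taken]
        if not free:
            raise ValueError("palette too small for this set of crossings")
        colors[name] = free[0]
    return colors


# Hypothetical series: P rises through the nearly flat Q; R stays well below both.
series = {
    "P": [50, 55, 60, 65],
    "Q": [58, 58, 59, 59],
    "R": [10, 12, 14, 16],
}
colors = assign_colors(series, ["blue", "orange"])
print(colors)
```

Lines that never cross are free to reuse a color, which is why the shared colors carry no meaning of their own.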

I don't think this last chart is as elegant as the one at the top, but in this case the much finer detail it conveys more than makes up for that.