Pie charts from Statistical Breviary by William Playfair, published in 1801.
My automatic reaction to the words “data visualization” tends to be an eye roll. Infographics and data maps, now the darlings of designers and journalists alike, have enjoyed a surge in popularity over the past several years that, in my opinion, has been hugely disproportionate to the function they have served.
Don’t get me wrong—I enjoy and drool over the aesthetic and intellectual experience of a stunningly produced infographic as much as any other geek. But I have seen very few examples of data visualization that give any evidence of its power to change the world, at least any more than any other type of visual or textual narrative communication or journalism. My thinking has been that, if functionality is to be measured by actual usefulness to produce tangible change or discourse in society, data visualization leaves a lot to be desired.
However, a recent conversation with Dietmar Offenhuber, the MIT Senseable City Lab Ph.D. student who has been working with Lab Team member Carlo Ratti on his programs here at the Lab, convinced me that, while I’m not wrong in being thoroughly unimpressed by the current, trendy uses of data visualization, it may, indeed, hold the potential to be a more useful tool than I have so far believed.
First, a little background: data visualization itself is nothing new. I highly encourage any designer who thinks they’re doing something cutting-edge with graphic representation to take a look at William Playfair’s Statistical Breviary, which contains the first known pie chart and was published in 1801 (above), or John Snow’s famous 1854 map connecting London’s pattern of cholera outbreak to proximity to public water pumps (below). Though their hand-rendered, analog aesthetics may be less sexy than what we see from today’s infographic fetishists, functionally they serve more or less the same purpose: to communicate information and connections through visuals rather than numbers or words.
Map of cholera cases in London, created by John Snow in 1854.
Of course, today this is unarguably much easier to achieve than it ever has been, and thus its persuasive power is admittedly more democratized. Not only do we have an unprecedented deluge of computer-crunchable data available to us as a result of the open data revolution currently underway, but we also have powerful and widely available visualization tools through which to run that data. Most computer-literate people today could plop data into a visualization program and spit out a relatively informative and aesthetically appealing infographic or data map.
The problem, however, is that most of it stops there. The visualization itself is the end goal. It illustrates a point, and that’s where its function ends. “When we talk about whether information visualization can really change things, I think there are two very different, important aspects of it,” explains Offenhuber. “On the one hand it is about communication—convincing people of things. The explosion of infographics happening right now is all about communication, to the point that it’s a little bit too much for my personal taste. It’s just a literal translation of a textual narrative. But, on the other hand, it is about using visualization for exploration and analyzing data. The possibility to explore the data is something very different.”
And in order to do that, there is one critical element that most visualizations are lacking: the actual data that went into producing them—or, as Offenhuber puts it, the separation of information and layout.
Consider, for example, the earlier days of the Internet, when any type of content on a web page was deeply embedded in its HTML code, and thus more or less inaccessible to the average person. Compare that to today, where one can simply copy and paste the embed code of, say, a YouTube video, and immediately and seamlessly shift that content to display it in a completely different context.
“The same quality is the most powerful aspect of data visualization and data journalism or whatever you want to call it,” Offenhuber says. “You have a toolbox of different representational methods, and you have data sets that can be processed with these tools.” In other words, the data set that produces a visualization is essentially the embed code of information analysis. It turns a visualization into an actionable tool.
Take Safecast, for instance, a project that started after the March 2011 meltdowns at the Fukushima Daiichi nuclear plant in Japan. Safecast collects and maps data on radiation levels, but also publishes that data openly for anyone to use.
If all Safecast did was produce a static map from the data they collected, it would still be extremely informative and useful. People could look at the map and determine where they should or should not live in the country, for instance.
But because the data is published along with the visualization of it, it allows anyone to take the information they garner from the map one step further. If someone sees the varying levels of radiation and wonders how they overlap with, say, cancer rates, that person can extract the data and overlay it with health statistics. Or, he or she could pull out the actual numbers to pressure the government into, say, concentrating cleanup or relief efforts more strongly in the areas with higher levels. Or that data could be used for any number of other purposes. The possibilities are as endless as our own curiosity and will to use them, but only if the data is there for us to explore that curiosity with.
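To make that overlay idea concrete, here is a minimal sketch in Python of what a curious reader might do with a downloaded data set. It is not Safecast’s actual format or API: the column names, the region labels, and every number are invented placeholders, and the helper functions `mean_by_region` and `overlay` are hypothetical names chosen for illustration.

```python
import csv
import io

# Hypothetical stand-ins for two downloaded data sets. Every value below is
# an invented placeholder, not a real measurement or health statistic.
RADIATION_CSV = """region,microsieverts_per_hour
A,0.10
A,0.14
B,0.90
B,1.10
C,0.30
"""

HEALTH_CSV = """region,cases_per_100k
A,40
B,55
D,47
"""


def mean_by_region(csv_text, value_column):
    """Average a numeric column for each region in a CSV string."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        total, count = totals.get(row["region"], (0.0, 0))
        totals[row["region"]] = (total + float(row[value_column]), count + 1)
    return {region: total / count for region, (total, count) in totals.items()}


def overlay(radiation, health):
    """Pair both measures for the regions that appear in both data sets."""
    shared = sorted(radiation.keys() & health.keys())
    return {region: (radiation[region], health[region]) for region in shared}


radiation = mean_by_region(RADIATION_CSV, "microsieverts_per_hour")
health = mean_by_region(HEALTH_CSV, "cases_per_100k")
combined = overlay(radiation, health)

for region, (dose, cases) in combined.items():
    print(f"{region}: {dose:.2f} uSv/h, {cases:.0f} cases per 100k")
```

Regions that appear in only one data set are dropped from the result, which is itself a choice a reader could question. Lining two measures up like this only opens the conversation; it does not show that one causes the other. Neither the choice nor the comparison is possible at all unless the underlying numbers are published.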
This is one of the big differences between the Guardian’s Datablog and the infographics sections of, say, The New York Times, the Texas Tribune, or any number of other publications that have recently developed a focus on data visualization.
The Times may do a stunning and powerful analysis of New York’s shifting geographic distribution of ethnicities and prove the point that, say, black populations are getting pushed to the city outskirts and being replaced by whites. But if somebody wants to overlay that map with another type of data to explore a different aspect or consequence of that shift, or even perhaps dispute it by suggesting the mapmaker has ignored an essential outlier, they cannot do that, because the data is not provided along with the graphic. It is a one-way conversation.
On the Datablog, on the other hand, every graphic is accompanied by the downloadable data set that went into creating it, thus offering the opportunity for a more collective discussion. “Suddenly, it’s a completely different offer that you’re making to your readers, and I think that’s a consequential step,” Offenhuber says. “This means that the data visualization kind of becomes the medium for having this discourse. Suddenly, visualization becomes a facilitating element for talking about the data itself.”
If that is really the case and data visualization can really become a more pragmatic and convincing tool for solutions-oriented dialogue, well, that is a trend I could get behind. If infographics could effectively ask the question, “What’s next?” that, in my mind, would be a very worthy end product. Until then, though, I’m mostly stuck with “so what?”