In his new book, “How Charts Lie,” Alberto Cairo minces no words when laying out the dangers of poorly designed data visualizations. He identifies five broad categories of chart designs that aren’t what they seem at first glance, from those that contain insufficient data to those that deliberately conceal information or mislead the viewer.
Take, for example, visualizing the risk of dying from homicide in the United States. “If you create a chart that only includes the national murder rate, it looks like the United States is becoming more dangerous — but that’s not true,” said Cairo, who holds the Knight Chair in Visual Journalism at the University of Miami.
“Most places in the United States have very low murder rates, but some places have murder rates that are so high that they are distorting the national rate,” he continued. “If you really want to inform the public, you need to include local and regional data.”
Causes of poor design
Poor chart design is often the result of oversimplification when trying to make complex information easy to understand, according to Cairo. However, at the more sinister end of the spectrum there are charts that deliberately conceal complexity or suggest misleading patterns in the underlying data. Such charts may attempt to discredit scientific evidence, promote false claims, signify causation when it does not exist, or support specific biases.
As an example of charts that support specific biases, Cairo openly explores in his book the potential influence that misleading charts circulated by racist groups in the U.S. might have had on Dylann Roof, who was convicted of the murder of nine African-American worshipers at a South Carolina church in 2015.
“I am careful with the wording of that section because [the charts are] not the reason that this person committed this dreadful act,” Cairo said. “But what would have happened if this kind of person wasn’t exposed to this misinformation?”
The role of journalists
Journalists can protect themselves from being “lied to” by charts, Cairo said, by first accepting that they are as vulnerable as the wider public when it comes to the persuasive quality of a simple visual.
“We need to stop believing — either consciously or unconsciously — that charts are illustrations. They are not. They are arguments made visually, and they need to be assessed and verified with the same care as anything else we put into writing a story,” he said.
Verification should include finding out what data a chart is based on, whether it comes from a reputable source, whether it is based on a representative sample, and whether any information has been left out or glossed over that could render a visualization wrong or misleading.
Charts based on survey data should also make clear to the audience any uncertainty or margin of error, which Cairo said should be routine practice when covering elections.
“How many times have I seen a headline saying Candidate A is ahead of Candidate B, but then I read the numbers and it’s 45% versus 43%? If the margin of error is four points, you can’t claim that one is ahead,” he said.
“You can either say that the candidates seem to be tied, or you go out and get more polls and aggregate those, because that will give you a much clearer picture than a single poll.”
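Cairo's polling example can be checked with a few lines of arithmetic. The Python sketch below is a rough illustration, not a method taken from the book: the function names are hypothetical, the sample sizes of 600 and 2,400 are assumptions chosen for illustration, and the formula is the standard 95% approximation for a simple random sample.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points,
    for one candidate's share in a simple random sample."""
    return 100 * z * math.sqrt(proportion * (1 - proportion) / sample_size)


def lead_is_clear(share_a, share_b, moe_points):
    """Rough screen: call a lead clear only if the gap between the two
    shares is larger than the reported margin of error."""
    return abs(share_a - share_b) > moe_points


# Figures from Cairo's example: 45% versus 43%, margin of error of four points.
print(lead_is_clear(45, 43, 4))  # False: the two-point gap is inside the margin

# Assumed sample sizes (not from the article): one poll of 600 respondents
# versus four such polls pooled together.
single_poll = margin_of_error(600)
pooled_polls = margin_of_error(2400)
print(f"single poll: about {single_poll:.1f} points")
print(f"pooled polls: about {pooled_polls:.1f} points")
```

Under these assumptions, pooling four polls roughly halves the margin of error, which is why aggregating gives a clearer picture than a single poll. The screen is lenient if anything, because the uncertainty on the gap between two candidates is typically larger than the margin reported for either share alone.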
Carlotta Dotto, a data journalist at First Draft News, said the organization is increasingly concerned about malicious or misleading information presented in charts. She said Cairo’s book will help journalists build confidence in the charts they create themselves, and fact-check charts supplied by their sources.
“We are increasingly looking at this and developing training for newsrooms, academics and students around the world on how to use data journalism both to collect and visualize data, and to spot a bad visualization,” she said.
One of the charts Cairo reviews is this map purporting to show 2016 U.S. election results:
“Spotted: A map to be hung somewhere in the West Wing” pic.twitter.com/TpPPDyNFtE (Trey Yingst, @TreyYingst, May 11, 2017)
Cairo’s book is crammed with practical tips, making it a great resource for creating better charts, and for interpreting the charts that you receive from a source.
Here are five key takeaways:
- Get as much data as you can, and make sure it’s reliable.
- Learn the difference between the mean, which is the average of a list of numbers, and the median, which is the number in the middle of the list.
- Choose the right design for your chart. Should it be a line graph, heat map, pie chart or something else? A 2018 blog post from HubSpot has information on chart types and how to select one to best visualize your data.
- Think carefully about what you name your chart. A well-made chart can become misleading if your title or headline makes claims that the underlying data doesn’t support.
- Be willing to have your assumptions challenged. The data you gather may not support an argument you were planning to make, and may come with nuances that need to be explained.
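The difference between the mean and the median matters most when a few extreme values dominate, as in Cairo's murder-rate example. The Python sketch below uses invented local rates (hypothetical figures, not real crime data) to show how a single outlier pulls the mean well above what is typical.

```python
from statistics import mean, median

# Hypothetical murder rates per 100,000 residents for seven made-up places.
# These numbers are invented for illustration; they are not real crime data.
local_rates = [2, 3, 3, 4, 4, 5, 40]

average_rate = mean(local_rates)    # pulled upward by the single outlier
typical_rate = median(local_rates)  # middle value of the sorted list

print(f"mean:   {average_rate:.1f}")
print(f"median: {typical_rate}")
```

With these made-up numbers the mean is about 8.7 while the median is 4, and six of the seven places sit below the mean. This is an unweighted average of local rates; a real national rate is weighted by population, so the sketch only illustrates the direction of the distortion.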
On a positive note, Cairo said the growing number of free and low-cost data visualization tools, like Datawrapper and Flourish, has made doing high-quality, data-based stories accessible to even the smallest of newsrooms.
“We think of The New York Times as the gold standard of data visualization, but in Florida, the Tampa Bay Times has just two or three people producing this kind of work, and they’re doing Pulitzer-winning pieces,” he said. “These tools are making charts more like writing, in the sense that it’s something everybody can learn and benefit from, and I think that’s wonderful.”