There are plenty of resources available to journalists who want to learn the tools of the data journalism trade. But there aren’t as many resources online that help us understand the basic concepts and principles that lie behind data and data-driven journalism.
Jonathan Stray, a fellow at the Tow Center for Digital Journalism, recently published “The Curious Journalist’s Guide to Data,” a free, easy-to-read online book that offers an overview of such principles.
“In journalism, a story is a narrative that is not only true but interesting and relevant to the intended audience,” Stray writes. “Data journalism is different from pure statistical analysis — if there is such a thing — because we need culture, law and politics to tell us what data matters and how.”
The book, which Stray calls “a collection of big ideas,” is split into three parts: quantification, analysis and communication.
Quantification, or the process that creates data, is the first step to revealing trends and patterns in society that can then become a story. Yet quantification is a fragile process, Stray argues, because so much of data creation is subject to human error and other mishaps.
“The whole process has to work just right, and our understanding of exactly how it all works has to be correct, or the data won’t be meaningful,” he writes.
In this section of the book, Stray uses two real-world examples to illustrate how we can think about quantification, and gives an overview of what actually makes something “quantitative.”
Once you have a data set in front of you, analysis — or finding a story hiding within the data — comes into play. Data analysis can pose unique challenges because it requires more than math; it demands an intuitive sense of what makes a good story, as one data set could contain many stories.
To show how this works in practice, the book takes data on assaults in a downtown neighborhood and asks whether an earlier bar closing time really did reduce them. It outlines some key statistical principles, then moves on to more advanced methods such as Bayesian inference and causal graphs.
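For readers who want a feel for this kind of question, here is a minimal sketch in Python. It is not taken from the book: the assault counts are invented purely for illustration, and the permutation test is just one simple way to ask whether a drop this large could plausibly be down to chance. The names `before`, `after` and `permutation_test` are hypothetical.

```python
import random

# Hypothetical quarterly assault counts, invented for illustration only.
# They are NOT the figures discussed in the book.
before = [104, 98, 110, 101, 95, 107]  # quarters before the earlier closing time
after = [92, 88, 97, 85, 90, 94]       # quarters after it


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def permutation_test(before, after, n_iter=10_000, seed=0):
    """Return the observed drop in the mean and the share of random
    relabelings of the pooled counts that show a drop at least as large."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed_drop = mean(before) - mean(after)
    pooled = list(before) + list(after)
    n_before = len(before)
    at_least_as_large = 0
    for _ in range(n_iter):
        # Shuffle the labels: pretend "before" and "after" are arbitrary.
        rng.shuffle(pooled)
        drop = mean(pooled[:n_before]) - mean(pooled[n_before:])
        if drop >= observed_drop:
            at_least_as_large += 1
    return observed_drop, at_least_as_large / n_iter


observed_drop, p_value = permutation_test(before, after)
print(f"Observed drop in mean assaults per quarter: {observed_drop:.1f}")
print(f"Share of random relabelings with a drop at least that large: {p_value:.4f}")
```

A small share suggests the drop is unlikely to be a fluke of which quarters happened to land on which side of the change. It does not show that the closing time caused the drop, since something else may have changed over the same period.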
Once you have your data and you’ve found a story within it, it’s time to communicate this story to your audience. For data-driven stories, this often means creating a compelling data visualization that illustrates the evidence behind your story.
Yet data visualization can reinforce harmful stereotypes or leave readers with skewed perceptions if it isn’t executed properly. As a result, journalists aren’t just responsible for telling a story; they’re also responsible for what their audience believes after reading it. Stray offers examples of how journalists can ensure their data-driven storytelling best represents reality by relating the numbers to people.
Main image CC-licensed by Flickr via Gene Han.