The term “data journalism” is the new buzzword — at least in developing countries like Pakistan.
There is also a wide range of free, easy-to-use online tools for everyone to unlock data sets and tell stories in different ways. This has opened up new opportunities for journalists, even in places where newsrooms don’t have the resources or the will to create large data journalism teams — all we need to do is to explore.
What is data journalism?
One of my favorite definitions of data journalism comes from Simon Rogers, a data editor at Google, who has written: “Data journalism is about using numbers to tell the best story possible. It is not about maths, or drawing charts or even writing code. It is about telling stories first and foremost – the maths and the charts and the code are all in service to that.”
This removes a lot of the hesitation and fear that some journalists might feel about telling stories through data. It’s true that some reporting projects involving big data sets will probably involve a team of several people: journalists, coders and programmers. But a journalist working on his or her own can do a lot with simple data analysis, in order to tell stories differently and effectively. Even though there are some very sophisticated data journalism stories out there, those new to the field shouldn’t feel intimidated. In the end, it’s about making data meaningful and telling the story behind the numbers.
Finding and extracting data
Despite RTI laws and the movement toward more open governments, access to information remains a challenge for journalists. Oftentimes, online data is locked in formats like PDF files, which are tough for journalists to work with. But thanks to advances in technology, converting such files into spreadsheets is becoming easier.
One of the easiest places to start looking for data sets is the statistical office of the United Nations Educational, Scientific and Cultural Organization (UNESCO). You can download data sets by country or by four major themes — education, science and innovation, culture and communication.
To extract data from PDF files into a CSV or Excel file, Tabula is one good free tool. Another free tool, Online OCR, will also extract text from PDFs and convert the data into an Excel or Word file. The tool reportedly recognizes up to 46 languages (see the full list of languages here).
Don’t underestimate the power of Excel
Despite the plethora of other tools out there, Microsoft Excel remains a powerful way to analyze and visualize data. There are helpful free online tutorials like this one (or this one) that can teach aspiring data journalists how to sort and filter their data sets, and how to create pivot tables. For more visual learners, there are also free video tutorials. Just be patient with yourself and have fun exploring and learning. Another golden rule: start small.
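For readers curious about what sorting, filtering and pivot tables actually do under the hood, here is a minimal sketch in Python using only the standard library. It is not a replacement for Excel, just an illustration of the same three operations. The data set, the column names (district, year, schools) and the helper functions are all hypothetical, invented purely for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data, made up for illustration only.
SAMPLE_CSV = """district,year,schools
Lahore,2014,120
Lahore,2015,135
Karachi,2014,200
Karachi,2015,190
Quetta,2014,40
Quetta,2015,55
"""


def load_rows(text):
    """Read CSV text into a list of dicts, converting numbers to int."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "district": row["district"],
            "year": int(row["year"]),
            "schools": int(row["schools"]),
        }
        for row in reader
    ]


def sort_rows(rows, key, descending=False):
    """Like Excel's Sort: order rows by one column."""
    return sorted(rows, key=lambda row: row[key], reverse=descending)


def filter_rows(rows, **conditions):
    """Like Excel's Filter: keep rows where every column equals the given value."""
    return [
        row
        for row in rows
        if all(row[column] == value for column, value in conditions.items())
    ]


def pivot(rows, index, columns, values):
    """Like a simple pivot table: sum `values` for each (index, columns) pair."""
    table = defaultdict(dict)
    for row in rows:
        cell = table[row[index]].get(row[columns], 0)
        table[row[index]][row[columns]] = cell + row[values]
    return dict(table)


rows = load_rows(SAMPLE_CSV)

# Which district had the most schools in 2015?
top_2015 = sort_rows(filter_rows(rows, year=2015), "schools", descending=True)

# One row per district, one column per year.
table = pivot(rows, index="district", columns="year", values="schools")

for district, by_year in table.items():
    print(district, by_year)
```

The same questions can be answered in Excel with a filter on the year column, a descending sort, and a pivot table with districts as rows and years as columns. Seeing the logic spelled out can make those menu options feel less mysterious.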
Data visualization and storytelling
In addition to sorting and filtering data sets, journalists can use visualizations to discover previously unseen trends and patterns locked inside the data. You can create basic visualizations with Excel, but there’s no shortage of other free tools that can help you tell your story.
Canva is an amazingly simple graphic design tool, and you don’t need to be a graphic designer to use it. Google Fusion Tables is an awesome and easy-to-use data visualization application and quite good for beginners, as is Infogr.am. Pixel Map is yet another application that can be used for web or print.
The ultimate goal when using such software is to present engaging and interesting stories. Always keep in mind that the fundamentals of journalism are the same with or without data.
Another way to brush up on data journalism skills is through free Massive Open Online Courses (MOOCs) offered by different organizations. Elsewhere, the Global Investigative Journalism Network has a thorough list of data journalism resources, including some in other languages like Spanish and Arabic.
And yes, this Data Journalism Handbook is worth reading.
Khalid Khattak is a journalist based in Lahore, Pakistan. He is a staff reporter at The News International. You can find more about him on his website Data Stories, which he founded last year with an aim to promote data literacy and data storytelling.
He launched Data Stories after attending a data journalism training at the Alumni Summit held at the Center for Excellence in Journalism (CEJ) at IBA, Karachi, in July 2015, which was co-hosted by the International Center for Journalists and featured a session from ICFJ Knight Fellow Shaheryar Popalzai. He is also an alum of the U.S.-Pakistan Professional Partnership in Journalism from 2011.
Main image CC-licensed by Flickr via Chris Khamken.