2018 has been a great year in terms of innovation and large-scale collaborations for data journalists worldwide.
Machine learning, sensors, automation and new data sources are becoming more popular. We’ve gauged the state of the field, thanks to our experts:
Simon Rogers (Google, US),
Reginald Chua (Reuters, US),
Yudivian Almeida Cruz (Postdata.club, Cuba),
Kuek Ser Kuang Keng (Data Journalism Awards competition officer, Malaysia),
Cheryl Phillips (Stanford University, US),
Giannina Segnini (Columbia University, US).
All agreed that data teams around the world have outdone themselves this past year. “There’s been loads of interesting work, from the Paradise Papers to Cambridge Analytica to some nice satellite imagery work on Myanmar, and so on,” says Chua.
Data journalism is a growing field
Simon Rogers, data editor at Google and director of the Data Journalism Awards competition (deadline: April 7, 2019), argues we’ve definitely come a long way:
“We’re not playing around anymore. There was a phase where there were a lot of ‘let’s answer this unimportant question but WITH DATA.’ I think that’s over now — there’s just too much going on in the world. I feel like it’s finally become mainstream and widespread across the globe.”
What was this year’s top innovation? The use of machine learning techniques, which has become more frequent in many countries. Other innovative storytelling techniques have also emerged this year. Automation, or the art of using robots to facilitate large-scale projects, is one of them. Segnini sees automation and new data sources becoming more popular.
A great example by Bayerischer Rundfunk and SPIEGEL in Germany:
To prove the existence of discrimination in the German rental housing market, data journalists sent more than 20,000 applications to approximately 7,000 apartment advertisements in an automated process and evaluated the received responses. The result makes an impressive piece of data-driven journalism. You can read about the project in this article.
Collaborative data projects are on the rise
The second thing our experts observed was that there are more and more large-scale collaborations. This doesn’t just apply to Western countries; it also involves teams in Asia, South America and Africa.
There was the Implant Files, by ICIJ, in partnership with 250 journalists in 36 countries, which investigated the harm caused by medical devices that have been tested inadequately or not at all.
There also was the West Africa Leaks, investigating how Africa’s elite hide billions offshore.
The Organized Crime and Corruption Reporting Project (OCCRP) continues doing incredible work across Europe, Africa, Asia, the Middle East and Latin America, and in the U.S., ProPublica regularly publishes collaborative work through its network of publishing partners.
Also in the U.S., the Big Local News project — part of the Stanford Journalism and Democracy Initiative — aims to collect, process and share governmental data that’s difficult to obtain and analyze. The initiative is a great example of collaboration within the data journalism industry, and it will partner with local and national newsrooms to use this data to examine a wide range of issues including criminal justice, housing, health and education for accountability journalism.
Data journalism is a field that is still growing worldwide
For those who still think data journalism is a thing of the West, here is something to prove you wrong.
Kuek Ser Kuang Keng shared the examples of journalists in Taiwan, where data was used almost systematically during their recent midterm elections. Almost all online news websites have used election data and maps to enhance their reporting and analysis.
The article “How do the Taiwanese media play the 2018 local elections?” by Hacks/Hackers Taipei, compiles data-driven reporting from different outlets in Taiwan.
The use of data by news teams is also growing in Cuba. “Data liberation is on the way,” said Yudivian Almeida Cruz. “It’s interesting [because] we are working on a new constitution and many different media used a data-driven approach to cover this process. We are more or less [experiencing] data liberation, people are more interested in data, and have better access to the internet. People from the government are having more presence in social networks.”
Challenges for data journalists in 2019
We’ve asked our experts to name three challenges they think data journalists worldwide will have to face in 2019.
Turning unstructured information into structured data is still a problem
With the growing amount of data readily available these days (though not always in the right format), and the great effort from journalists to collect large sets of data, comes the notion of structured and unstructured information.
If you find it hard to differentiate the two, here is a great explainer by Brandon Wolfe:
“Structured data is easily searchable by basic algorithms. Examples include spreadsheets and data from machine sensors. Unstructured data is more like human language. It doesn’t fit nicely into relational databases like SQL, and searching it based on the old algorithms ranges from difficult to completely impossible.”
“Collecting and turning unstructured information into structured data is a big challenge,” said Cheryl Phillips. “That’s because the tools are not readily available in the newsroom yet.”
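To make the distinction concrete, here is a minimal sketch of what turning unstructured text into structured data can look like on a very small scale. The sample sentences, the city names and the `extract_records` helper are all invented for illustration and do not come from any project mentioned in this article. Real documents are far messier, which is exactly why this remains a challenge.

```python
import re
from datetime import datetime

# Hypothetical free-text snippets, invented purely for illustration.
RAW_NOTES = [
    "On 12 March 2018, the city of Springfield reported 42 evictions.",
    "On 3 April 2018, the city of Riverton reported 7 evictions.",
    "No figures were released for the month of May.",
]

# One rigid sentence pattern; real-world text rarely follows a single template.
PATTERN = re.compile(
    r"On (?P<date>\d{1,2} \w+ \d{4}), the city of (?P<city>[A-Z]\w+) "
    r"reported (?P<count>\d+) evictions"
)


def extract_records(lines):
    """Split free-text lines into structured rows and lines that did not match."""
    records, unmatched = [], []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            # Keep what the pattern cannot read so a person can review it.
            unmatched.append(line)
            continue
        parsed_date = datetime.strptime(match["date"], "%d %B %Y").date()
        records.append({
            "date": parsed_date.isoformat(),
            "city": match["city"],
            "evictions": int(match["count"]),
        })
    return records, unmatched


records, unmatched = extract_records(RAW_NOTES)
```

Here `records` holds spreadsheet-ready rows, while `unmatched` collects the sentences the pattern could not read. Keeping that second list matters: silently dropping text that does not fit the template is an easy way to introduce errors into a dataset.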
Hopefully, 2019 will be the year this problem gets fixed. In the meantime, Phillips encourages newsrooms to continue their hard work collecting and normalizing disparate data, with help from collaborative initiatives such as the Big Local News project and others.
Cruz argued that deep learning will have an important role to play in this challenge. It could be used to help data analysis and get insights more easily out of unstructured information.
If you’re into machine learning and looking for deep learning frameworks to play with, check out this article by James Le, “The 5 Deep Learning Frameworks Every Serious Machine Learner Should Be Familiar With.”
Access to government data in some countries is still limited
The second challenge our experts identified is a struggle in many countries, “even those with supposedly strong open records laws,” said Phillips.
Access to government data was a problem in 2018, and will still be one in 2019.
“In Southeast Asia, journalists in some countries are having a hard time, especially the Philippines and Myanmar,” said Kuek Ser Kuang Keng. “Access to government information and assessing the integrity of the information (fake or misleading data and information) are still a challenge there. But we also see huge progress in Malaysia where the new government is drafting its FOI law and reviewing its open data policy (for the better), so data journalism has a huge opportunity to grow there.”
In China, instead of getting data from the government, journalists turn to tech companies. “Some of the big tech companies are willing to share,” Kuek Ser Kuang Keng said. “For example, instead of getting traffic data from the government, [journalists] got similar data from Didi, the equivalent of Uber in China (check out the Gaiga Initiative). Waze (a popular traffic navigation mobile app owned by Google) is also sharing traffic data with media in some countries here.”
Local data journalism doesn’t develop equally in different parts of the world
We’ve seen the growth of The Bureau Local initiative in the U.K., a collaborative investigative network launched by the Bureau of Investigative Journalism. It has 833 members and has produced 293 stories so far.
Local data journalism in China is also a thing, Kuek Ser Kuang Keng explained: “Local data in China is sometimes easier to obtain compared to national data. In some big highly urbanised cities like Shanghai, the local authority has higher willingness to work with journalists on data sharing.”
In other places like Cuba, it’s still a challenge. Yudivian Almeida Cruz said journalists find it hard “to cover the local news based on data, because it’s most common to have [national] or states [data]. It’s difficult to get data for local places.”
Technologies to use in 2019
Finally, we asked our experts what new ways of telling stories with data they were keen on playing with this year.
While Google recently announced its Fusion Tables will soon be gone (read Simon Rogers’ Twitter thread about this), new tools will come to newsrooms in 2019.
Here are the top three technologies for this year, from Simon Rogers:
- Generative art
This article is based on a Slack discussion. Read the original chat.
This article was originally published by the Global Editors' Network (GEN). It was edited and republished on IJNet with permission.
Learn more about the 2019 Data Journalism Awards.