The world of data journalism can be daunting for many. However, anyone can pitch and produce a data journalism story, if they are able to grasp the right concepts.
For those just starting out in data journalism, and even for experienced journalists looking to start reporting with data, these six tips can help you better execute your stories and elevate your pitches to editors:
Coding is a necessary evil
Having a coding language in your back pocket is crucial. Even the most basic coding knowledge can improve your work by letting you move beyond simple spreadsheets, which, while commonly used in data journalism, are not preferred for in-depth analysis because they cannot handle as much information.
Coding will allow you to manipulate your data more easily. A dataset is rarely completely clean or exactly what you need. Being able to quickly clean and comb through even the smallest of files can resolve some of the biggest headaches.
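As a rough illustration of the kind of cleanup described above, here is a minimal Python sketch that uses only the standard library. The sample data, the column names and the `clean_rows` helper are all invented for this example. The sketch assumes a small CSV with stray spaces, inconsistent capitalization, a duplicate row and a missing value.

```python
import csv
import io

# Hypothetical raw data, standing in for a messy file you might receive.
RAW_CSV = """County, Population ,Year
 Adams ,"12,500",2020
adams,"12,500",2020
Baker,,2020
 Clark ,"8,300",2020
"""


def clean_rows(raw_text):
    """Return tidy rows: trimmed text, numeric values, no blanks or duplicates."""
    reader = csv.DictReader(io.StringIO(raw_text), skipinitialspace=True)
    seen = set()
    cleaned = []
    for row in reader:
        # Strip stray whitespace from both the column names and the values.
        row = {key.strip(): (value or "").strip() for key, value in row.items()}
        county = row["County"].title()  # "adams" and "Adams" become the same
        population = row["Population"].replace(",", "")  # "12,500" -> "12500"
        if not population:
            continue  # skip rows with no population figure
        identity = (county, row["Year"])
        if identity in seen:
            continue  # skip duplicate county/year pairs
        seen.add(identity)
        cleaned.append(
            {"County": county, "Population": int(population), "Year": int(row["Year"])}
        )
    return cleaned


for record in clean_rows(RAW_CSV):
    print(record)
```

Doing the same steps by hand in a spreadsheet is manageable for four rows, but a script like this can be rerun unchanged on a file with thousands of rows.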
Not all coding languages are created equal. Python is one of the easiest and most flexible languages to learn. It is free, and many beginners pick it up faster than R or C++. Another benefit of Python is that answers to error messages are easier to find online, with plenty of free help available through easily searchable forums.
Knowing how to visualize your stories is important, even if you do not plan on using the visualizations further
Editors need to see what your data looks like, so even if you do not plan on having visualizations in your story, being able to portray the information visually will only strengthen a pitch. It also shows the editor that you understand the story and the data behind it.
One of the most commonly used visualization tools is Datawrapper, a free-to-use platform that makes it easy to transfer your data into a chart. It is easy to learn and can be used to create everything from simple graphics to more complex maps. Datawrapper uses basic HTML, but provides guides on how to write HTML code in its interface.
Bringing your Datawrapper graphics into Adobe software can also elevate them, but the process is more complicated. Adobe offers free trials, which can help you decide whether paying for a membership is worth it. Because Adobe is more difficult to learn than Datawrapper, it is better used as a supplement, and it is better suited to a finished project than to a pitch.
Colors matter in visualizations, even in drafts
That art class you took in high school is extremely beneficial when designing data visualizations.
Colors are a crucial tool in data visualization because of how our brains work. We understand colors as another piece of information that can either effectively portray information or completely distort the data. Everything, down to the opacity of the color, can affect a reader’s comprehension of the data they’re viewing.
Colors like gray, white or black should always be used in moderation or when trying to point the reader to a specific data point. Learning how to use color well is not difficult, and it is extremely important.
There are multiple ways to specify a color in code, such as hex codes, RGB values and HTML color names. These codes tell the computer exactly which one of the millions of colors and hues to choose, with hex codes being the most commonly used on the web. Memorizing the six-digit codes for every color is not necessary, but knowing that they exist is, especially if you’re trying to replicate an exact color or match colors across different data visualizations.
In order to get a color code, there are several online tools to use. HTMLColorCodes.com is a helpful color code generator as it is free, but there are many more available. Getting comfortable with a few schemes and having a few presets you enjoy will be helpful in the long run.
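To make the relationship between color codes concrete, here is a small Python sketch that converts between a six-digit hex code and its RGB values. The helper names and the two-color preset are invented for this example. Each pair of hex digits is simply one of the red, green and blue channels written in base 16.

```python
def hex_to_rgb(hex_code):
    """Convert a six-digit hex color such as '#D95F02' to an (R, G, B) tuple."""
    digits = hex_code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hex digits, for example '#D95F02'")
    # Each pair of digits is one channel, from 00 (0) to FF (255).
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(red, green, blue):
    """Convert three 0-255 channel values back to a six-digit hex code."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)


# A hypothetical preset: one accent color and one muted gray for context.
PALETTE = {"highlight": "#D95F02", "muted": "#999999"}

for name, code in PALETTE.items():
    print(name, code, hex_to_rgb(code))
```

Keeping presets like this in one place makes it easier to reuse exactly the same colors across different charts.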
How you approach a story means everything
When trying to pitch a story, think about the finished product before the data. Unlike a typical story, where you can build the article from the ground up, data reporting relies more on your ability to work with the available data than on how much information a set has.
While there are thousands of databases at your disposal, you will want to know how you plan to use a dataset before committing to it. Planning first helps you assess your capabilities, so you do not fall for a brilliant dataset you want to do amazing things with, only to find it too troublesome or difficult to work with.
To do so, set a thesis you plan on trying to test and the ways you will go about doing so. Create a storyboard that maps out how you want to test this and what possible data is available. Then, find your data and start your interviews.
Using a tool like Trello will help with this. It is free for personal use, but not for teams.
Requesting data can be your best friend, or biggest enemy
Requesting data from a government entity or organization can be a great way to get a story, but it comes with all of the added issues that any typical document request would. This is especially true with datasets that can be unnecessarily messy.
It's important to realize that while you may request very specific data in a specific format, the data may still not be completely usable. Spending time to request the data, then waiting to receive it, can be costly, especially when freelancing.
Instead of requesting data, finding open data online may be more worthwhile. In the United States, most states have an open data website that has nearly everything you need without having to go through a FOIA request.
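Many open data sites let you download a dataset as a CSV file, which can then be read directly with code. The sketch below is one possible way to do that in Python. The URL, the column names and the sample rows are placeholders invented for this example, not a real dataset, so the parsing step runs here on a small built-in sample instead of a live download.

```python
import csv
import io
from collections import Counter
from urllib.request import urlopen

# Placeholder address: swap in the CSV download link from an open data site.
DATASET_URL = "https://example.com/open-data/inspections.csv"

# Hypothetical sample, shaped like a file such a site might provide.
SAMPLE_TEXT = """city,result
Springfield,pass
Springfield,fail
Riverton,pass
"""


def fetch_text(url):
    """Download a file and return its contents as text (assumes UTF-8)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_csv(text):
    """Turn CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def count_by(rows, column):
    """Count how many rows share each value in the given column."""
    return Counter(row[column] for row in rows)


# With a real link, this would be: rows = parse_csv(fetch_text(DATASET_URL))
rows = parse_csv(SAMPLE_TEXT)
print(count_by(rows, "city"))
```

A quick count like this is often enough to tell whether a dataset is worth a closer look before you build a pitch around it.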
Having a backlog of data is important, even if you do not plan on continuing with data reporting
For journalists who are not well-seasoned data gurus, having datasets in hand and ready to go is key. Most data journalists keep a collection of datasets, and having your own personal library is valuable.
Creating a free GitHub account to use as a library for your data gives you a handy reference. It can also build up your portfolio, as well as serve as a separate website for possible employers to see what you have done.