Updated Dec. 13, 2013, at 7:00 a.m. Eastern.
Journalists who work with recorded sound know just how painstaking it can be to sift through hours upon hours of audio to find the right clip for a story. Then comes the time-consuming task of transcribing and, later, figuring out where you've stored everything.
A new project called Pop Up Archive makes the job of the radio journalist in the digital age a little easier. The sound library makes it possible for users to store, search, access and reuse audio files on the web.
The Knight News Challenge-winning project’s founders, Anne Wootton and Bailey Smith, met in 2011 at the UC Berkeley School of Information, where they set out to help radio producers digitize and access old content. Their project has grown into a free, open-source, scalable archive system for broadcast content. It includes a growing archive of thousands of hours of searchable sound from media producers and oral history collections. The site will soon work with Spanish-language audio, too.
Wootton recently talked with IJNet about how journalists can use Pop Up Archive.
IJNet: What is the key problem most journalists face in terms of audio storage?
AW: It’s not a question of what to save, because digital storage is only getting cheaper. It’s a question of how you find content once you’ve digitized it.
No one likes to organize up front, so audio files end up forgotten on hard drives and servers. And then to what extent can we protect ourselves from a hard drive that becomes defunct? Today, it’s easier to lose a WAV file than a cassette tape. So, how do we better access these media formats? Where is raw material going and how can we reuse that material?
IJNet: How can Pop Up Archive help?
AW: If you have an Internet connection you can drag and drop any file. If you want to store info publicly, it’s all free. If you need to store material privately you can do that, too. [Every user gets two hours free, with paid plan options.] Beyond that, the storage is made meaningful through a variety of services and tools.
Our technology takes an audio file and “listens” to it, tagging and time-stamping it, thus making it searchable by keywords. We send it to a web service that automatically transcribes the sound, using complicated algorithms to recognize [words] and give them a confidence rating, and then we send the transcript and “bag of words” to other services that pull out more semantic meaning--like names, places, subject topics or names of newspapers. We recommend those to you as tags you can either approve or reject.
It’s worth noting that auto-transcript is good for baseline transcription, in the absence of paying significant fees, but accuracy can range widely. If you want some refinements to your transcript, especially to material that is culturally significant, we’ve partnered with Amara, a nonprofit that has done a lot of work with video captioning.
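For readers curious about what a "bag of words" with confidence ratings might look like in practice, here is a minimal Python sketch. It is not Pop Up Archive's code: the `Word` record, the stopword list, the 0.5 confidence threshold and the function names are all assumptions made for illustration, and a simple word-frequency count stands in for the semantic services Wootton mentions.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical record for one transcribed word; real services return richer data.
@dataclass
class Word:
    text: str
    start: float       # seconds from the beginning of the audio
    confidence: float  # 0.0 (a guess) to 1.0 (certain)

# A tiny illustrative stopword list, not a complete one.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to"}

def bag_of_words(words, min_confidence=0.5):
    """Count each word, skipping stopwords and low-confidence guesses."""
    counts = Counter()
    for word in words:
        token = word.text.lower().strip(".,!?\"'")
        if token and token not in STOPWORDS and word.confidence >= min_confidence:
            counts[token] += 1
    return counts

def suggest_tags(words, limit=3, min_confidence=0.5):
    """Recommend the most frequent confident words as candidate tags."""
    counts = bag_of_words(words, min_confidence)
    return [token for token, _ in counts.most_common(limit)]

def review_tags(suggested, approved):
    """Keep only the suggestions the user approved; the rest are rejected."""
    return [tag for tag in suggested if tag in approved]
```

The threshold reflects the point about accuracy: words the transcriber is unsure of never become tag suggestions, and the final say still rests with the person who approves or rejects each tag.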
IJNet: How might a journalist use Pop Up Archive?
AW: After you conduct interviews, you can upload and organize the audio by person or subject and create what we call a “collection.” You will get an automatic transcript back for each file, with finding aids around it. You can name each file and add additional data or fields, like the name of the interviewer or details about the location of the interview. The transcript is processed in real time (if it’s 30 minutes of audio, it will come back in 30 minutes), with the first few minutes [coming] back to you right away, and then an email when it finishes.
You can edit the transcript, search within it and then across it. You can also use the transcript and time stamps as logging tools to help you identify moments in the audio where you want to edit or cut something. And at the end of the process, you can upload a finished piece, so it’s living side-by-side with your raw material.
A newsroom can create an organizational account that offers team access, so a group of reporters could share a centralized workspace where they can add audio that any of them can access, search and edit. Teams can even get creative and edit transcripts concurrently.
Then, when it comes to presentation, tags and transcript can be useful for SEO purposes. We're working on tools so you can do things like embed small segments of your audio for people to hear on your website, or put it into SoundCloud.
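To illustrate how a time-stamped transcript can double as a logging tool, the following Python sketch searches transcript segments for a keyword and reports where in the audio each match occurs. The `Segment` structure and both function names are hypothetical and chosen for this example; they do not describe Pop Up Archive's actual data format.

```python
from dataclasses import dataclass

# Hypothetical transcript segment: a stretch of text and when it starts.
@dataclass
class Segment:
    start: float  # seconds from the beginning of the audio
    text: str

def format_timestamp(seconds):
    """Render a position in seconds as HH:MM:SS for an edit log."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

def find_moments(segments, keyword):
    """Return (timestamp, text) for every segment that mentions the keyword."""
    needle = keyword.lower()
    return [
        (format_timestamp(segment.start), segment.text)
        for segment in segments
        if needle in segment.text.lower()
    ]
```

A reporter looking for every mention of a source's name would call `find_moments(segments, "name")` and jump straight to the listed time codes in an audio editor.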
IJNet: But isn’t video, not audio, the new frontier?
AW: It’s true that video, not audio, has received the lion’s share of attention on the web. But audio has a powerful hold over a listener. And the spirit of radio still plays a big role in news and entertainment. Until recently, web technologies haven’t supported this -- we’ve been tied to devices that have big screens. With smaller devices and the Internet increasingly in our pockets, we have an amazing opportunity to benefit from a totally new audio experience.
To learn more, visit Pop Up Archive at www.popuparchive.org/.
Jessica Weiss, a former IJNet managing editor, is a Buenos Aires-based freelancer.
Photo courtesy of Flickr user Josh Lloyd under a Creative Commons license.