How AI can help analyze women's representation in the news

Nov 15, 2022 in Media Innovation

From automating content production to assisting fact-checking efforts, artificial intelligence (AI) has become more widely used in journalism in recent years. The opportunities that algorithms offer media organizations have become clearer along the way.

Among them, AI technologies can play a role in improving women’s representation in the news. The Financial Times’ bot, She said He said, was one of the first examples of this. Introduced in 2018, it helped the London-based newsroom identify the diversity of sources within its reporting.

News organizations around the world have similarly leveraged AI technologies in a variety of ways to help them analyze and identify biases in their reporting, detect hate speech against women, and more.

Here are some examples.

Analyzing biases in media

Sahiti Sarva, an engineer specializing in using data science to understand policy, co-authored a visual essay with Leonardo Nicoletti in 2021 examining the nature of women’s representation in news reporting, titled When Women Make Headlines. For the project, Sarva and Nicoletti's team analyzed more than 10 years’ worth of headlines from the top 50 publications in India, the U.S., the U.K. and South Africa.

Sarva explained their approach at a virtual workshop hosted by London School of Economics’ JournalismAI initiative earlier this year: “We scraped all of the headlines tagged with 20 keywords synonymous with women, girl, female, and so on, and that gave us about 382,139 English news headlines.”

This was, of course, a lot of data. “You think AI is going to make it easier,” Sarva said in an interview for this article. But before one can start applying any algorithm, there’s much work to do: cleaning the data, removing useless words, and finding the right code packages to use. Oftentimes, she said, “you need to be creative on what data you are going to use.”
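To make that preparation step concrete, here is a minimal Python sketch of keyword filtering and cleaning. It is an illustration only, not the team's code: the keyword list, the stop-word list, the sample headlines and the function names are all assumptions, since the article does not list the 20 keywords or the tools the project used.

```python
import re

# Hypothetical keyword list; the project used 20 keywords that the article does not list.
KEYWORDS = {"women", "woman", "girl", "girls", "female"}

# Tiny illustrative stop-word list ("useless words"); real projects use fuller lists.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "and", "is", "at"}


def tokenize(headline):
    """Lowercase a headline and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", headline.lower())


def mentions_women(headline):
    """Return True if any token matches one of the keywords."""
    return any(token in KEYWORDS for token in tokenize(headline))


def clean(headline):
    """Return the headline's tokens with stop words removed."""
    return [token for token in tokenize(headline) if token not in STOP_WORDS]


# Made-up sample headlines, for illustration only.
headlines = [
    "Woman elected to lead the city council",
    "Markets rally on strong earnings",
    "Teenage girl wins the national chess title",
]

selected = [h for h in headlines if mentions_women(h)]
cleaned = [clean(h) for h in selected]
```

Here `selected` keeps the first and third headlines, and `cleaned` holds their tokens without the stop words. A real pipeline would also need to handle possessives, punctuation and spelling variants, which this sketch ignores.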

“We went ahead and created our own dictionaries that calculate something called gender bias in headlines,” Sarva explained at the JournalismAI workshop. “[This included] a combination of gendered language like actress, daughter, wife, along with behavioral and social stereotypes around gender, like emotional support and care.”
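A dictionary-based score of this kind can be sketched in a few lines of Python. The word lists below are seeded only with the examples Sarva mentioned, and the scoring rule (the share of a headline's words that appear in either dictionary) is an assumption made for illustration; the article does not describe the team's actual formula.

```python
import re

# Hypothetical dictionaries built from the examples quoted above;
# the project's real word lists are not given in the article.
GENDERED_LANGUAGE = {"actress", "daughter", "wife"}
STEREOTYPE_TERMS = {"emotional", "support", "care"}


def bias_score(headline):
    """Share of a headline's words found in either dictionary (0.0 to 1.0).

    This scoring rule is an assumption, not the project's published method.
    """
    tokens = re.findall(r"[a-z]+", headline.lower())
    if not tokens:
        return 0.0
    hits = sum(
        1 for token in tokens
        if token in GENDERED_LANGUAGE or token in STEREOTYPE_TERMS
    )
    return hits / len(tokens)


# Made-up headlines, for illustration only.
score_a = bias_score("Actress and daughter open clinic")  # 2 of 5 words match
score_b = bias_score("Minister announces new budget")     # no words match
```

Counting single words is the simplest possible design. It treats "emotional support" as two separate terms, so a fuller version would probably need to match phrases as well.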

Once the team had these dictionaries, they used a machine learning method called “sentiment analysis” to understand what it looks like when women make headlines. “We found that the story — when women make headlines — is often very sensational. A lot more sensational than regular headlines that we read and, over time, the number has only gone up. This could probably be because when women make headlines the story is twice as likely to be violent than empowering,” Sarva said.

Monitoring misogynistic discourse

As public figures, women are frequently the target of attacks on social media. Who are the perpetrators? Journalists from Brazil’s AzMina, Argentina’s La Nación, and Latin America’s CLIP and DataCrítica turned to AI to find out. 

Together, they developed a web application to uncover hate speech against women on Twitter. “Aware of the escalation of hate speech, particularly against women, our project wants to be able to quickly and assertively monitor when any of these misogynistic attacks is initiated by a politician,” said AzMina’s Bárbara Libório at the same JournalismAI workshop.

The app found that the attacks came from politicians in particular, among other public figures. “[This] initiated the real waves of hate speech,” Libório explained, “because their supporters decided to attack these women at a higher rate.”

If you want a model to detect misogynistic messages, she continued, the AI technology must learn what hate speech against women is. As the first step in the process, Libório’s team created a database of examples, marking tweets as misogynistic or not. 
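The idea of learning from labeled examples can be illustrated with a very small text classifier. The sketch below uses a naive Bayes model written in plain Python, which is an assumption chosen for brevity: the article does not say which algorithm the team used, and the four labeled sentences are mild, made-up stand-ins for real tweets.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately mild examples; label 1 = misogynistic, 0 = not.
LABELED = [
    ("women do not belong in politics", 1),
    ("she should go back to the kitchen", 1),
    ("the senator presented a new budget plan", 0),
    ("she gave a strong speech on the budget", 0),
]


def tokenize(text):
    """Lowercase text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label and examples per label."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the higher naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][token] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(LABELED)
label_a = predict(model, "women belong in the kitchen")
label_b = predict(model, "a new speech on the budget")
```

With only four training sentences the model simply echoes its examples. A usable classifier would need thousands of carefully labeled tweets, which is why building the database comes first.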

Once the model had learned from these examples, her team evaluated how effectively it identified hate speech against women. They created a scoring system and tested it in Portuguese and Spanish.
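The article does not describe the team's scoring system. One common way to score a classifier like this is to compare its predictions with human labels and compute precision, recall and F1, as in the sketch below. The metric choice, the function name and the sample labels are all assumptions.

```python
def evaluate(y_true, y_pred):
    """Compare human labels with model predictions (1 = misogynistic, 0 = not)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Precision: of the tweets flagged, how many were really misogynistic?
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: of the misogynistic tweets, how many did the model catch?
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels for eight tweets, for illustration only.
human_labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_labels = [1, 1, 0, 0, 0, 1, 0, 1]
scores = evaluate(human_labels, model_labels)
```

In this example the model flags four tweets and three of them are correct, while it misses one misogynistic tweet, so precision and recall both come to 0.75. Running the same function separately on Portuguese and Spanish test sets would show whether the model performs differently across languages.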

With their model ready, they created a web application to help users analyze text and files. Libório hopes they can share their prototype with other initiatives that want to map gender violence on social media.

A starting point

“AI technologies can be of great help to support journalists to do their jobs better,” said Sabrina Argoub, program manager at JournalismAI. When it comes to women’s representation in news, she said, AI can be used to enhance transparency and accountability, while raising awareness. The AIJO project, which compares the rate at which men and women, respectively, are quoted in articles or depicted in visual news, is one example.

AI isn’t a cure-all, however, Argoub noted. Buy-in from newsrooms and journalists alike is necessary to realize true progress on the matter. 

“It’s good to keep in mind that the machine can provide the data and help review how well or not we’re doing,” she said. “To take action, the starting point and the intention to tackle women’s underrepresentation in news need to come from newsrooms and journalists themselves.”


Photo by Alexis Brown on Unsplash.