Newsrooms globally have begun exploring ways to convert their journalism into different formats using AI: for example, from text articles to videos, podcasts, infographics and more.
As they do so, the core challenge isn’t just accuracy – it’s rigor. Journalists strive to get facts right and attribute them clearly, avoid bias, verify claims, and maintain transparency. When AI is used to convert a work of journalism from one form to another, the same rigor may not carry over.
“One of the biggest challenges is that AI models and most of the tools they power do not come incorporated with the editorial standards that we as journalists are used to,” said Sannuta Raghu, the head of Scroll.in’s AI Lab and a former ICFJ Knight Fellow. “They are trained on broad, generalized data and often optimized for language fluency and form – not for accuracy, attribution or editorial intent.”
The obstacles Raghu encountered while versioning content with AI can be complicated to grasp, especially for someone without a technical background. For me, “flashbulb memory” – the phenomenon in which people’s vivid memories of critical life events reveal distortions when compared with the actual moment – became a useful analogy for the challenges journalists face when they use AI to adapt their content into new formats and designs. Researchers who studied the phenomenon after 9/11 found that their subjects recalled the attacks vividly and with high confidence, but as they retold and repackaged their experiences, the details strayed further from the truth.
Raghu is working to standardize the principle of “fidelity to source.” During her fellowship, she mapped how digital news is “structured, styled and surfaced,” and compiled her findings in what she calls a “Directory of Liquid Content.”
The directory is structured according to how users receive news and journalism online: Physical Infrastructure → Network & Transport Layer → Application Protocols → Channels of Delivery → Content Container → Content Format → Building Blocks → Devices & Interfaces. Focusing on the last five touchpoints, Raghu has begun building a modular system capable of converting one form of journalism into another.
In coming months, she will describe each entry in the directory, and codify answers to questions like:
- Is a text article a container or a format?
- Is a newsletter a channel of delivery, a container or a format?
- Is a timeline a format or a building block?
This map and its descriptions could serve as a foundational dataset for teaching a model journalistic form, preserving the editorial reasoning that informs news design.
I spoke with Raghu about fidelity to source, news design, and journalists’ responsibility to preserve editorial standards when working with AI tools.
Here’s our conversation.
How do you define “fidelity to source”?
Raghu: Fidelity differs across outputs, but the foundational concept remains the same: Does this format preserve the editorial intent and rigor of the original work? It’s about translating editorial logic so that a model can produce each format faithfully.
Let’s take text article-to-short video. Fidelity to source would mean:
- Facts correspond one-for-one.
- The logic of the original narrative is preserved (a text news report is versioned to a news video, a text explainer to a video explainer, etc.).
- Quotes are attributed accurately and not changed into statements in narration.
- Facts are not condensed or dramatized for emotional impact.
- Words and phrases like “reportedly” or “according to the civic authorities” are not erased.
- The tone matches the original.
For a text article-to-calculator (see, for example, this article converted to an interactive calculator), the definition would be different:
- Every variable and data point is tied to a reported fact in the article.
- Disclaimers and uncertainties are incorporated.
- Scope is preserved: the calculator won’t “hallucinate” results beyond the facts in the article.
- The calculator’s output, and the logic behind it, is verifiable.
Fidelity will need to be defined in this manner for each format we use.
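One way to operationalize such per-format definitions is to encode each rubric as data that both editors and automated checks can work from. The sketch below is hypothetical – the names and structure are illustrative, not a description of Scroll.in’s actual system:

```python
# A minimal sketch of per-format fidelity rubrics as data, based on the
# criteria described above. Names and structure are illustrative only.
FIDELITY_RUBRICS = {
    "text_to_short_video": [
        "facts correspond one-for-one with the source",
        "narrative logic is preserved (report -> video report, explainer -> video explainer)",
        "quotes are attributed and never turned into unattributed narration",
        "no condensing or dramatizing facts for emotional impact",
        "hedging phrases ('reportedly', 'according to ...') are retained",
        "tone matches the original",
    ],
    "text_to_calculator": [
        "every variable and data point maps to a reported fact",
        "disclaimers and uncertainties are carried over",
        "scope is preserved: no results beyond the facts in the article",
        "output and the logic behind it are verifiable",
    ],
}

def checklist(fmt: str) -> str:
    """Render a rubric as a review checklist for an editor (or an eval prompt)."""
    return "\n".join(f"[ ] {item}" for item in FIDELITY_RUBRICS[fmt])

print(checklist("text_to_short_video"))
```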
What elements of a story are most likely to get distorted or lost, in your experience?
Raghu: One of the most persistent challenges I have faced is with situating time accurately in a versioned output. This is a multi-layered problem.
At a basic level, consider this simple reported fact: “On Monday, the workers called off their 21-day-long strike.” The report was likely published either after “Monday” (one time situator) or on “Monday” (another). The phrase “21-day-long” also implicitly establishes the date the strike began. Most models would summarize this as “the workers called off the strike,” erasing the temporal context entirely.
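A simple automated guard for this failure mode might compare time expressions in the source against the versioned output. The sketch below is illustrative only – the pattern list is far from exhaustive, and a production check would need a proper temporal tagger:

```python
import re

# Illustrative check: flag summaries that drop time expressions present in
# the source. A real system would use a temporal tagger, not a short regex.
TIME_PATTERN = re.compile(
    r"\b(on\s+)?(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b"
    r"|\b\d+-day(-long)?\b"
    r"|\b(today|yesterday|last\s+\w+|this\s+(year|month|week))\b",
    re.IGNORECASE,
)

def missing_time_markers(source: str, summary: str) -> list[str]:
    found = {m.group(0).strip() for m in TIME_PATTERN.finditer(source)}
    kept = {m.group(0).strip() for m in TIME_PATTERN.finditer(summary)}
    return sorted(found - kept)

source = "On Monday, the workers called off their 21-day-long strike."
summary = "The workers called off the strike."
print(missing_time_markers(source, summary))  # ['21-day-long', 'On Monday']
```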
At a more complex level, things get messier when retrieving background context or constructing timelines from vectorized archival data (data stored as numerical vectors for machines to compare). Retrieval systems (a vector database linked to a large language model with a set of prompts) return semantically similar statements regardless of when they happened. If I prompt for articles about, for example, an “IMF bailout of country X in July 2025,” I am looking for context around that specific bailout, not others that may have occurred in different years. But what I typically get are results optimized for semantic similarity, not temporal precision. This becomes a serious issue when designing timelines, backgrounders or update-driven summaries.
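One common mitigation is to constrain retrieval by publication date before ranking by similarity. Here is a minimal “filter first, rank second” sketch; `Doc`, `retrieve` and the `embed` function they presume are hypothetical stand-ins, not any particular vector database’s API:

```python
from dataclasses import dataclass
from datetime import date
import numpy as np

@dataclass
class Doc:
    text: str
    published: date
    embedding: np.ndarray

def retrieve(query_emb: np.ndarray, docs: list[Doc],
             start: date, end: date, k: int = 5) -> list[Doc]:
    """Filter by publication date first, then rank by cosine similarity,
    instead of letting semantic similarity alone decide."""
    window = [d for d in docs if start <= d.published <= end]
    def score(d: Doc) -> float:
        return float(np.dot(query_emb, d.embedding) /
                     (np.linalg.norm(query_emb) * np.linalg.norm(d.embedding)))
    return sorted(window, key=score, reverse=True)[:k]

# e.g. context for the July 2025 bailout only, not earlier ones
# (embed() is a stand-in for whatever embedding model the newsroom uses):
# retrieve(embed("IMF bailout of country X"), archive,
#          date(2025, 6, 1), date(2025, 8, 31))
```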
Quotes are also tricky across versions. Flattening is a common problem: say a minister has said, “I wouldn’t rule it out entirely, but it is unlikely that we will pursue this this year.” This would most likely get paraphrased as, “The minister said the policy was unlikely.” Nested quotes are another problem: “The spokesperson told reporters the minister said he has not ruled it out entirely, but it is unlikely he will pursue it this year” would most likely get versioned to, “The minister has said he is unlikely to pursue it this year.”
In many cases, quotes are also presented as statements entirely without attribution. These distortions alter meaning and editorial emphasis.
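Because a quote either appears verbatim in the source or it doesn’t, this is one distortion that is cheap to check mechanically. A rough sketch (real matching would also need to normalize curly quotes, whitespace and ellipses):

```python
import re

def unverified_quotes(source: str, output: str) -> list[str]:
    """Return quoted spans in the output that do not appear verbatim in the
    source - a cheap guard against paraphrased or flattened quotes."""
    quoted = re.findall(r'"([^"]+)"', output)
    return [q for q in quoted if q not in source]

src = 'The minister said, "I wouldn\'t rule it out entirely but it is unlikely."'
out = 'The minister said, "the policy was unlikely."'
print(unverified_quotes(src, out))  # ['the policy was unlikely.']
```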
What challenges faced by journalists working with AI inform your work?
Raghu: Models are optimized for the fluency and form of the English language, not for journalistic rigor. If you look at FineWeb, a large training dataset commonly used for language models, you will notice that it contains news articles – but it captures only how well an article is written according to the rules of the English language. When journalists use tools and models built on such training data, there’s a risk of losing the underlying rigor that made the original “expression” trustworthy.
Another challenge is that the output looks polished and authoritative even when it misrepresents nuance. For example, most models will rephrase “allegedly” or “likely” into a definitive claim, oversimplify a quote, or strip out references to time that matter. Because the versioning is based on a journalist’s own work, there’s an assumption that it is safe. But AI-assisted transformations aren’t neutral; they carry their own logic.
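The same logic suggests an audit for dropped caution markers. The term list below is a hypothetical stand-in for a newsroom’s own style-guide vocabulary:

```python
# Illustrative audit for dropped caution markers: words like "allegedly" or
# "likely" that models tend to rewrite into definitive claims.
CAUTION_TERMS = {"allegedly", "reportedly", "likely", "unlikely",
                 "according to", "appears to", "claims"}

def dropped_caution(source: str, output: str) -> set[str]:
    src, out = source.lower(), output.lower()
    return {t for t in CAUTION_TERMS if t in src and t not in out}

print(dropped_caution(
    "The official allegedly approved the deal, according to court filings.",
    "The official approved the deal, court filings show.",
))  # {'allegedly', 'according to'}  (set order may vary)
```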
I think the challenge then isn’t just about preventing “hallucinations,” but also preserving meaning, intent and trust across every version of the story. This problem is exponentially exacerbated when a different language or cultural nuances are introduced, and as we know now, there aren’t enough rich datasets that capture the real-world nuances that are intuitive to us as people.
What does achieving “fidelity to source” look like in practice?
Raghu: I would answer this question with a question: How do you teach journalistic nuance to a model that is very good at the English language (and getting better at other languages)? I would start by examining a model’s logic around the journalistic rigor that is intuitive to most journalists.
Here are some starter questions I would ask an AI model:
- When summarizing a news article, how do you decide when to name a source and when to refer to them more generally?
- How would you summarize nested quotes? (How would you summarize indirect quotes?)
- When summarizing a quote in a news article, what are your dos and don’ts?
- How do you decide what to keep and what to leave out when rephrasing a complex news story?
- What is your understanding of the words “summarize,” “shorten,” “paraphrase,” “rewrite” and “rephrase”? (This one is my favorite.)
LMArena is a great resource for conducting side-by-side tests on models, and for turning the results into internal benchmarks around editorial caution.
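Those side-by-side results can be folded into an internal benchmark by logging the same probes against each model over time. A minimal sketch, assuming a caller-supplied `ask(model, prompt)` wrapper around whatever model APIs a newsroom uses:

```python
import json
from datetime import datetime, timezone

# A few of the starter probes above, as benchmark data.
PROBES = [
    "When summarizing a news article, how do you decide when to name a source "
    "and when to refer to them more generally?",
    "How would you summarize nested quotes?",
    'What is your understanding of the words "summarize," "shorten," '
    '"paraphrase," "rewrite" and "rephrase"?',
]

def run_benchmark(models, ask, path="editorial_benchmark.jsonl"):
    """Run each probe against each model and log responses as JSONL for
    side-by-side review. `ask` is a caller-supplied function wrapping the
    newsroom's actual model APIs (hypothetical here)."""
    with open(path, "a") as f:
        for model in models:
            for probe in PROBES:
                record = {
                    "ts": datetime.now(timezone.utc).isoformat(),
                    "model": model,
                    "probe": probe,
                    "response": ask(model, probe),
                }
                f.write(json.dumps(record) + "\n")
```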
The next step would be to create a versioning guide, similar to the editorial style guide every newsroom already has. This would be a dataset specifying how editorial caution and uncertainty must be handled (based on those benchmarks), how to maintain the logic and sequence of events, how to respect scope and limits (neither diminishing an event nor making it bigger than it is), and how to deal with compression of output (summary, paraphrase, rewrite in a set number of words).
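As a sketch of what one entry in such a versioning guide might look like expressed as data – the schema here is hypothetical, not an existing standard:

```python
# Hypothetical shape of one versioning-guide entry: an editorial style rule
# expressed as data that a pipeline (and a journalist-in-the-loop) can
# check outputs against.
VERSIONING_RULE = {
    "id": "caution-markers-01",
    "applies_to": ["summary", "paraphrase", "short_video_script"],
    "rule": "Caution and uncertainty markers must be carried into the output.",
    "examples": {
        "source": "The official allegedly approved the deal.",
        "acceptable": "The official allegedly approved the deal.",
        "unacceptable": "The official approved the deal.",
    },
    "compression": {"max_words": 120,
                    "must_keep": ["attribution", "time", "scope"]},
}
```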
From there we must consider how to do this at scale. What standard operating procedures and checklists will we need? At what stage do we introduce a trained-journalist-in-the-loop? These are some of the questions we will need to tackle next.
Photo by Immo Wegmann on Unsplash.