Japanese newspaper uses AI to increase speed and accuracy of summaries

By Tim Hornyak
Oct 30, 2018 in Specialized Topics

In another step forward for robo-journalism, a regional newspaper in Japan is rolling out an artificial intelligence system that automatically generates summaries of news articles for distribution across a range of media platforms.

The Shinano Mainichi Shimbun teamed up with Fujitsu, Japan’s largest IT services company, to create the software, which is based on technology developed by Fujitsu Laboratories. Staff at the broadsheet have been producing summaries manually, a task that takes up to five minutes per article. According to Fujitsu, the software creates summaries instantly and with greater accuracy than a simpler method that starts at the lead and copies text until the word limit is reached.

The system uses a combination of natural language processing and machine learning to pick out the most salient parts of the article, scoring each sentence in terms of importance.

During a trial, it was trained on a dataset of 2,500 articles from the newspaper as well as their manually compiled summaries.

“By pairing the original articles with the summaries and defining that as reference, or teacher data, we built an ‘important sentence extraction model’ that evaluates the content importance according to individual sentences, as well as a ‘sentence-shortening model’ that maintains sentence structure while deleting unnecessary words,” says Masato Yokota, a director at Fujitsu’s State Infrastructure and Finance Business Group.
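The sentence-scoring idea can be illustrated with a short Python sketch. This is not Fujitsu's system, which learns from paired articles and summaries. As a stand-in assumption, the sketch scores each sentence by the average frequency of its words across the article, then keeps the top-scoring sentences that fit a word limit. It also includes the lead-based baseline for comparison. All function names and the stopword list are hypothetical, and the sentence-shortening step is not covered.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the article).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it",
             "that", "for", "on", "with", "as", "was", "by"}

def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    # Lowercase words with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]

def score_sentences(sentences):
    # Score = average article-wide frequency of the sentence's words.
    # A trained model would replace this heuristic.
    freq = Counter(w for s in sentences for w in tokenize(s))
    scores = []
    for s in sentences:
        words = tokenize(s)
        scores.append(sum(freq[w] for w in words) / len(words) if words else 0.0)
    return scores

def lead_summary(sentences, max_words):
    # Baseline: take sentences from the top until the word limit is hit.
    chosen, used = [], 0
    for s in sentences:
        n = len(s.split())
        if used + n > max_words:
            break
        chosen.append(s)
        used += n
    return chosen

def extractive_summary(sentences, max_words):
    # Rank sentences by score, keep those that fit the word limit,
    # then restore the original article order.
    scores = score_sentences(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in ranked:
        n = len(sentences[i].split())
        if used + n <= max_words:
            chosen.append(i)
            used += n
    return [sentences[i] for i in sorted(chosen)]
```

On an article that opens with a scene-setting sentence, the lead baseline spends part of its word budget on that opener, while the scored version can skip it in favor of sentences that share the article's most frequent terms.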

The software can work with articles written in Japanese or English. It was built with a web API that can be easily integrated into the existing editorial workflow. A “summary” button that calls the API was added to the editing screen for the paper’s cable TV news, Yokota says.

A screenshot of the AI system from its trial period shows the original article in Japanese (left), an automatically generated ranking of sentences by importance (center), and the summarized text (right).

Robots vs. Journalists

First published in 1873, the Shinano Mainichi Shimbun is one of Japan’s oldest dailies. Headquartered in Nagano, northwest of Tokyo, it claims a morning-edition circulation of 487,000 copies and distribution to 61% of households in Nagano Prefecture.

“The third-wave AI is set to become a trend of great relevance, and now is the time to make concerted efforts in improving the newspaper production workflow as well,” says Hiroshi Misawa, the paper’s managing director.

The Shinmai, as it’s known, plans to roll out the system in April for its cable TV news summary service, with an eye to speeding up news updates.

The summarizing AI joins a host of other news applications sometimes described as automated or augmented journalism. Heliograf, the Washington Post’s own news bot, produced about 300 briefs on the 2016 Rio Olympics and has since covered U.S. elections and high school football games. It produced about 850 articles in its first year, according to Digiday. The Associated Press worked with AI firm Automated Insights to deploy software that covers earnings reports.

“Through automation, AP is providing customers with 12 times the corporate earnings stories as before (to over 3,700), including for a lot of very small companies that never received much attention,” AP global business editor Lisa Gibbs was quoted as saying in a 2017 report.

“With the freed-up time, AP journalists are able to engage with more user-generated content, develop multimedia reports, pursue investigative work and focus on more complex stories.”

This article originally appeared on The Splice Newsroom. It has been republished on IJNet with permission. 

Main image CC-licensed by Pixabay via geralt.