Marina Adami
Five AI experts weigh in: Francesco Marconi, Madhumita Murgia, Charlie Beckett and two startup founders discuss the impact of generative AI on the news industry
Since OpenAI’s AI-powered chatbot ChatGPT was launched in November 2022, journalists have been discussing its potential impact on the news industry.
How many journalists will be replaced by the rise of generative artificial intelligence? How fast will this process take place? Which journalists will be most vulnerable to this kind of disruption? And should we see ChatGPT as a challenge or as an opportunity to solve some of the problems the news industry faces?
As all of these questions and more are hotly debated, I spoke to three experts and two startup founders to gain a clearer idea of how generative AI and large language models are likely to affect journalism in the short and the medium term.
______
Francesco Marconi is a computational journalist and co-founder of the real-time information company AppliedXL. Previously, he was R&D Chief at The Wall Street Journal and AI and news automation co-lead at the Associated Press. Marconi is the author of “Newsmakers: Artificial Intelligence and the Future of Journalism”, a book on AI and journalism published in 2020.
Madhumita Murgia is the recently appointed AI editor at the Financial Times, a new position at the paper. Before this, she worked as a European technology correspondent at the FT.
Professor Charlie Beckett is the Head of JournalismAI, a project by Polis, the journalism think tank at the London School of Economics (LSE). As well as conducting research and publishing a report on journalism and AI, the initiative runs a fellowship programme for journalists and technologists and a training programme aimed at small newsrooms, and curates examples of AI applications in journalism for others to learn from.
______
Many outlets are already using AI to a limited extent to assist their operations. Others are imagining whole new models based on the technology. Among the latter group are Jenny Romano and Pedro Henriques, co-founders of The Newsroom, an app that offers its readers a daily brief with AI-generated summaries of the main news stories: the key facts, the context, and the main takes.
Not entirely new
The use of AI to support and produce pieces of journalism is something outlets have been experimenting with for some time. Francesco Marconi categorises AI innovation in the past decade into three waves: automation, augmentation and generation.
During the first phase, “the focus was on automating data-driven news stories, such as financial reports, sports results, and economic indicators, using natural language generation techniques,” he says. There are many examples of news publishers automating some content, including global agencies like Reuters, AFP and AP, as well as smaller outlets.
According to Marconi, the second wave arrived when “the emphasis shifted to augmenting reporting through machine learning and natural language processing to analyse large datasets and uncover trends.” An example of this can be found at the Argentinian newspaper La Nación, which began using AI to support its data team in 2019, and then went on to set up an AI lab in collaboration with data analysts and developers.
The third and current wave is generative AI. It’s “powered by large language models capable of generating narrative text at scale,” Marconi says. This new development offers applications to journalism that go beyond simple automated reports and data analysis. Now, we could ask a chatbot to write a longer, balanced article on a subject or an opinion piece from a particular standpoint. We could even ask it to do so in the style of a well-known writer or publication.
Ideas for possible uses of this technology have multiplied since November, with journalists themselves often testing the capabilities of chatbots to write and edit.
Part of the reason why ChatGPT and other tools have generated so much excitement may be the fact that they are so consumer-friendly and can communicate in natural language, says Madhumita Murgia from the FT. “It feels like there’s an intelligence there, even though it is really still just a very powerful sort of predictive technology,” she says.
Because of the way the underlying language models work, these tools respond to our prompts when generating new content rather than coming up with ideas themselves. The model is trained on a set of content and data and generates new output based on what it was trained on.
This means that, while it could be helpful in synthesising information, making edits and informing reporting, Murgia believes generative AI as we see it today is missing some key skills that will prevent it from taking on a more significant role in journalism. “Based on where it is today, it’s not original. It’s not breaking anything new. It’s based on existing information. And it doesn’t have that analytic capability or the voice,” Murgia says.
Because of this, she explains, generative AI can’t meet the demand for more analysis or a more developed take on a subject, something readers look for when they go to outlets like the Financial Times. ChatGPT itself seems to agree.
A screenshot of a chat exchange with ChatGPT. The question is: “Will you replace journalists in publishing breaking news?” The answer is that, while tools like ChatGPT can assist journalists in their work, they cannot completely replace them.
“That isn’t to say that [generative AI] can’t become more powerful or advance as the underlying technology evolves,” Murgia says. “I would like to be really optimistic about the original human voice, that nothing can ever replace us. And I definitely believe that, where language models are today, they are not creative or original or generating anything new in any way. But I think that they’re mimicking it pretty well.”
Factual mistakes are another obstacle to a greater role for generative AI in journalism. ChatGPT often makes them, and such errors have surfaced even in public demos, as seems to have happened with both Google’s and Microsoft’s new AI-powered tools. ChatGPT may also have pointed a reader to a reference that doesn’t exist.
“These models often have difficulty generating accurate and factual information regarding current events or real-time data,” Marconi says. This suggests that the AI tools currently available are unsuited to breaking news reporting, a complex and expensive operation that requires careful fact-checking and cross-referencing of information.
Generative AI models have also struggled with numbers. “The new crop of generative AI is not accurate when it comes to computing exact calculations. Unchecked algorithmic creation presents major risks as it relates to a healthy information ecosystem,” Marconi says. This doesn’t mean that generative AI has no role in journalism, but that we can’t rely on it alone.
Professor Charlie Beckett, head of the Polis/LSE JournalismAI research project, also advises caution and discourages journalists from using new tools without human supervision: “AI is not about the total automation of content production from start to finish: it is about augmentation to give professionals and creatives the tools to work faster, freeing them up to spend more time on what humans do best,” he says. “Human journalism is also full of flaws and we mitigate the risks through editing. The same applies to AI. Make sure you understand the tools you are using and the risks. Don’t expect too much of the tech.”
Marconi also argues that the media should be working with the technology in a way that acknowledges and counters its current pitfalls. “The limitations of large language models such as GPT signal where journalistic innovation should be focused on – towards the development of event detection systems that can capture and compute real-time information. Combining these event detection systems with large language models will pave the way for an entirely new approach to journalism,” he says.
An example of an event detection system is found in Marconi’s own company AppliedXL, which he describes as “an event detection company where journalistically-minded people work together to anticipate the news.” Through machine learning and the principles of investigative journalism, his team aims to anticipate news relating to clinical trials, such as flagging early irregular signals in data well before companies go public with problems.
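To make the idea of flagging "early irregular signals" more concrete, here is a minimal Python sketch of one generic approach: comparing each new data point with the recent past and flagging large deviations. It is an illustration only and is not based on how AppliedXL's system works. The function name, the window size, the threshold and the sample numbers are all assumptions made up for this example.

```python
from statistics import mean, stdev


def flag_irregular(values, window=5, threshold=3.0):
    """Return the indices of values that deviate strongly from the preceding window.

    Hypothetical helper: a rolling z-score check, chosen here only as a
    simple stand-in for "irregular signal" detection.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip flat histories, where a z-score is undefined.
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Made-up series, e.g. weekly counts of some reported quantity.
weekly_counts = [10, 11, 9, 10, 11, 10, 30, 10]
irregular_weeks = flag_irregular(weekly_counts)
```

In this made-up series only the jump to 30 is flagged. A real system would need domain knowledge and editorial judgement to decide which deviations are newsworthy.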
Generative AI in action
Several well-known outlets have announced their plans to use generative AI or are already incorporating it into their content. BuzzFeed announced it will use AI to power its famous personality quizzes, and the New York Times used ChatGPT to create a Valentine’s Day message generator with a combination of prompts.
Others are exploring possible uses, including German publishing giant Axel Springer and UK publisher Reach, which recently published its first articles written by AI on a local news site. The Italian newspaper Il Foglio announced a challenge for its readers: for 30 days starting in the second week of March, it will publish short texts written by AI in its daily edition, and readers who can correctly identify each text over a week are eligible to win a free subscription and a bottle of champagne.
For Pedro Henriques and Jenny Romano, the application of AI to journalism is at the core of the business of The Newsroom, the company they founded in 2021. They’ve built an app that offers AI-generated daily summaries of the main news stories. These are not breaking stories, but news that has already been widely reported by a variety of outlets. The point of the app, its founders told me, is not necessarily to bring totally new information to the user, but to paint a picture of the facts all outlets agree on, and then to highlight different perspectives.
The first step in the process is gathering data from a variety of publishers to understand what news events are being discussed and by whom, Henriques explained. The next step is to run these articles through a model the founders worked with journalists to build. The model assesses the quality of the pieces based on criteria such as the presence of facts and elements of bias.
“Once we have that set of articles about the same events and a certain quality bar on our end, we have two other models that essentially break the articles into pieces,” Henriques says. “We identify the elements of consensus around what’s being reported on. So what are the main things that all newspapers agree on? Which are the basic facts that everyone is reporting on? And on the other hand, what are the elements of divergence? So what are different views on the same topic that are emerging? Based on that, we write a new article that essentially packages that, so you start from the consensus elements, the basic facts of what’s happening, and then you can explore what we call the multiple perspectives.”
The ‘Multiple perspectives’ tab also lists the publishers reporting on the topic being summarised, with links to their coverage.
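The steps Henriques describes (filter articles by a quality bar, then separate what all outlets agree on from where they diverge) can be sketched in a few lines of Python. This is a simplified illustration, not The Newsroom's actual models: it assumes each article already comes with a quality score and a set of extracted claims, and the Article class, the split_claims function, the 0.5 quality bar and the sample outlets are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Article:
    publisher: str
    quality: float     # hypothetical score in [0, 1] from a quality model
    claims: frozenset  # hypothetical: claims already extracted from the text


def split_claims(articles, quality_bar=0.5):
    """Return (consensus, divergence) for articles covering the same event."""
    kept = [a for a in articles if a.quality >= quality_bar]
    if not kept:
        return [], {}
    # Each article contributes a claim at most once, because claims is a set.
    counts = Counter(claim for a in kept for claim in a.claims)
    # Consensus: claims made by every article that passed the quality bar.
    consensus = {c for c, n in counts.items() if n == len(kept)}
    # Divergence: each publisher's remaining claims.
    divergence = {
        a.publisher: sorted(a.claims - consensus)
        for a in kept
        if a.claims - consensus
    }
    return sorted(consensus), divergence


# Made-up coverage of a single event.
articles = [
    Article("Outlet A", 0.9, frozenset({"summit held", "deal signed", "deal is historic"})),
    Article("Outlet B", 0.8, frozenset({"summit held", "deal signed", "deal is weak"})),
    Article("Outlet C", 0.2, frozenset({"summit cancelled"})),
]
consensus, divergence = split_claims(articles)
```

Here the low-scoring third article is dropped, the two shared claims form the consensus, and each remaining outlet keeps its own take as a separate perspective. Extracting and matching claims across differently worded articles is the hard part, and this sketch leaves it out entirely.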
The Newsroom’s pieces are written by AI and manually reviewed by humans. While humans will always be part of the oversight process, Romano says, the company is looking to streamline that process further.
“We’re planning on having different tiers in terms of the amount of manual review that goes into it, depending on the topic,” Henriques says. “So for example, right now most of the topics we do are global topics on geopolitics, climate etc. Once we start evolving to other topics that are lower risk, like sports, for example, we plan to then have different levels of review to go into those.”
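A tiered review scheme of the kind Henriques outlines could be expressed as a simple lookup from topic to review level. The Python sketch below is purely illustrative: the tier names, what each tier involves and the default for unknown topics are assumptions, and only the example topics come from the quote above.

```python
from enum import Enum


class ReviewTier(Enum):
    # Hypothetical tiers; a human is involved in both.
    FULL = "an editor reads every summary before publication"
    LIGHT = "an editor carries out a lighter check"


# Hypothetical mapping, using the topics mentioned in the quote.
TIER_BY_TOPIC = {
    "geopolitics": ReviewTier.FULL,
    "climate": ReviewTier.FULL,
    "sports": ReviewTier.LIGHT,
}


def review_tier(topic: str) -> ReviewTier:
    """Look up the review tier for a topic, defaulting to the strictest one."""
    return TIER_BY_TOPIC.get(topic.strip().lower(), ReviewTier.FULL)
```

Defaulting unknown topics to the strictest tier is a design choice made for this sketch, so that a new subject area never receives less scrutiny by accident.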
For the moment, they are only using English-language source pieces and publishing summaries in English, but they plan to include articles in other languages in their model, in order to also improve the geographic diversity of their output. The focus on English reflects a wider pattern in the industry: although ChatGPT can be used in several languages, the quality of its output is not the same in all of them.
Asked if they have experienced some of the issues other models encounter, Henriques and Romano say they haven’t. They say their models haven’t produced any ‘hallucinations’, which occur when AI generates a statement not supported by the data, and that their manual review of the text counters any factual inaccuracies.
“We don’t deal with things like breaking news. When news is breaking, there’s still not enough information for us to be able to really validate it properly. And so information on The Newsroom is always a bit delayed, purposefully,” Henriques says.
The app is currently at the minimum viable product stage and is still being developed. It has around 1,000 users in various countries, mostly in Europe, with the overwhelming majority under 35, the founders say. According to Romano, The Newsroom’s current users can be split into two main groups: people who already consume a lot of news from other sources and former news avoiders, a group they found among their audience by reaching out and speaking to some of their users.
However, Henriques stresses that the app is not designed to be the only source of news for its users. “We see ourselves very much as a way to help users navigate the news in general. We don’t see it as a fully self-contained platform, the only place you go to read the news. It’s an access point, so you go there to navigate it, but then it doesn’t end there. Then you go to other players to keep diving into things that really interest you,” he says.
A look to the future
Murgia and Marconi both mentioned journalists’ role in synthesising information, contextualising it and identifying the story. For Marconi, this is going to get harder.
“The explosion of data from sources such as the web, sensors, mobile devices, and satellites has created a world where there is simply too much information. We are now producing more information than at any other point in history, making it much more challenging to filter out unwanted information,” he says.
Marconi thinks this is a side of journalism in which AI can play a crucial role in lessening the workload for humans. “AI should not only be seen as a tool to generate more content but also to help us filter it,” he says. “Some experts predict that by 2026, 90% of online content could be machine-generated. This marks an inflection point, where we now must focus on building machines that filter out noise, distinguish fact from fiction, and highlight what is significant.”
Marconi believes journalists should play a role in the development of new AI tools, for example by writing editorial algorithms and applying journalistic principles to the new technology. “The news industry must be actively engaged in the AI revolution,” he says. “In fact, media companies have an opportunity to become a major player in the space – they possess some of the most valuable assets for AI development: text data for training models and ethical principles for creating reliable and trustworthy systems.”