Media companies blocking training of artificial intelligence tools
Copyright and access to the vast amounts of data needed to train generative AI tools is a controversial topic for both content creators and politicians. Content creators want to be paid for this new use of their work. The latest example: the Guardian is joining other big media companies in blocking generative AI tool makers from using the newspaper’s content to power artificial intelligence. Amid the growing controversies around copyright, the US Copyright Office has opened a public comment period to collect more information on copyright and AI.
The Copyright Office says that “over the past several years, the Office has begun to receive applications to register works containing AI-generated material.”
Legal experts writing in Harvard Business Review say that “the legal implications of using generative AI are still unclear, particularly in relation to copyright infringement, ownership of AI-generated works, and unlicensed content in training data.”
“Courts are currently trying to establish how intellectual property laws should be applied to generative AI, and several cases have already been filed”, write Gil Appel, Assistant Professor of Marketing at the GW School of Business, Juliana Neelbauer, a partner at Fox Rothschild LLP, and David A. Schweidel, Professor of Marketing at Emory University’s Goizueta Business School.
A basic summary of experts’ comments so far is that copyright has never been granted to a work created without human involvement.
The Copyright Office says it is undertaking a study of the copyright law and policy issues raised by generative AI and is assessing whether legislative or regulatory steps are warranted.
The office will use the record it assembles to advise Congress; inform its regulatory work; and offer information and resources to the public, courts, and other government entities considering these issues.
With more information, the office believes it could advise on:
- How AI tools may use copyrighted data in training;
- Whether AI-generated content can be copyrighted when no human is involved;
- How copyright liability would apply to AI.
The deadline for written comments is October 18.
A spokesperson for Guardian News & Media, publisher of the Guardian and Observer, said:
“The scraping of intellectual property from the Guardian’s website for commercial purposes is, and has always been, contrary to our terms of service. The Guardian’s commercial licensing team has many mutually beneficial commercial relationships with developers around the world, and looks forward to building further such relationships in the future.”
The Guardian reports that according to Originality.ai, which detects AI-generated content, news websites now blocking the GPTBot crawler include CNN, Reuters, the Washington Post, Bloomberg, the New York Times and its sports site the Athletic. Other sites that have blocked GPTBot include Lonely Planet, Amazon, the job listings site Indeed, the question-and-answer site Quora, and dictionary.com.
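Such blocks are typically expressed in a site’s robots.txt file, which OpenAI says GPTBot respects. As a minimal sketch (the example.com domain and the exact rules below are illustrative, not taken from any of the publishers named above), Python’s standard-library urllib.robotparser shows how a compliant crawler would read such a directive:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows OpenAI's GPTBot crawler
# site-wide while leaving the site open to other user agents.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://example.com/some-article"
print(parser.can_fetch("GPTBot", url))        # False: GPTBot is blocked
print(parser.can_fetch("SomeOtherBot", url))  # True: other crawlers are unaffected
```

Because robots.txt is a convention rather than an enforcement mechanism, the block only works if the crawler chooses to honour it.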
X, the Elon Musk-owned platform formerly known as Twitter, this summer put a limit on how many posts a user can access per day in an attempt to stop AI tool makers from using X posts to train artificial intelligence.
US-based news agency the Associated Press (AP) has chosen a different route, announcing an agreement with OpenAI, the company behind ChatGPT, to share access to news content for generative AI in news products and services.
“The arrangement sees OpenAI licensing part of AP’s text archive, while AP will leverage OpenAI’s technology and product expertise. Both organizations will benefit from each other’s established expertise in their respective industries, and believe in the responsible creation and use of these AI systems”, AP said.
Referring to doubts about the trustworthiness of generative AI, the news agency stressed that “AP continues to look closely at standards around generative AI and does not use it in its news stories.”
“Generative AI is a fast-moving space with tremendous implications for the news industry. We are pleased that OpenAI recognizes that fact-based, nonpartisan news content is essential to this evolving technology, and that they respect the value of our intellectual property,” said Kristin Heitmann, AP senior vice president and chief revenue officer.
“AP firmly supports a framework that will ensure intellectual property is protected and content creators are fairly compensated for their work. News organizations must have a seat at the table to ensure this happens, so that newsrooms large and small can leverage this technology to benefit journalism.”
“OpenAI is committed to supporting the vital work of journalism, and we’re eager to learn from The Associated Press as they delve into how our AI models can have a positive impact on the news industry,” said Brad Lightcap, chief operating officer at OpenAI.
The Associated Press has used AI technology for nearly a decade to automate some rote tasks “and free up journalists to do more meaningful reporting”. AP began automating corporate earnings reports in 2014 and subsequently added automated stories previewing and recapping some sporting events.
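Automated earnings stories of this kind are typically generated by filling sentence templates from structured data feeds rather than by a language model. The snippet below is a simplified, hypothetical illustration of that general approach, not AP’s actual pipeline; the function, company name and figures are invented:

```python
# Hypothetical sketch of template-driven story generation from structured
# earnings data. All names and numbers are invented for illustration.

def earnings_story(company: str, quarter: str, eps: float,
                   eps_expected: float, revenue_m: float) -> str:
    """Render a short earnings recap from structured data fields."""
    beat_or_miss = "beat" if eps > eps_expected else "fell short of"
    return (
        f"{company} reported {quarter} earnings of ${eps:.2f} per share, "
        f"which {beat_or_miss} analyst expectations of ${eps_expected:.2f}. "
        f"Quarterly revenue was ${revenue_m:.0f} million."
    )

print(earnings_story("Example Corp", "second-quarter", 1.42, 1.35, 870))
```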
Additionally, AP uses AI technology to aid in the transcription of audio and video from live events like press conferences.