The News Media Alliance (NMA), a trade group of more than 2,000 newspaper publishers, has released research that it says shows AI developers rely more heavily on copyrighted newspaper stories in training their AIs than they do on generic content randomly taken from the web.

The research supports newspapers’ long-standing claim that AI developers routinely violate copyright laws, the group said.

To support the charge, NMA researchers examined publicly available information believed to be used to train major AIs such as ChatGPT. They compared that data set to generic, uncopyrighted data sets gleaned from the Internet.

The data sets believed to be used to train AIs drew on copyrighted articles five to 100 times more heavily than on generic information, the NMA concluded.

Also, the researchers found instances in which an AI’s response to a query lifted exact passages from newspaper articles.

That puts AIs in direct competition with newspapers that created and own the rights to the text, the NMA noted.

“You can see our articles taken and regurgitated verbatim,” NMA CEO Danielle Coffey told The New York Times. “It demonstrates that we would have a very good case in court.”

The NMA has submitted its findings to the U.S. Copyright Office for review.

The alliance has complained for years that search engines such as Google display newspaper articles to readers without permission or compensation to the articles’ original publishers.

TRENDPOST: AI developers seem to be delaying as long as possible any acknowledgement that they’ve infringed copyrights owned by tens or hundreds of thousands of creators. 

It still is possible that Congress will intervene to impose a settlement by law, but that would no doubt be challenged by content owners who feel the measure doesn’t give them enough.

Rather than fend off an endless stream of lawsuits in the years and decades ahead, developers are likely to try to find a sort of “class action” solution that will offer some payment and/or credit to copyright holders. How that would be formulated or agreed to has yet to be imagined.
