AI startup Anthropic has agreed to settle a lawsuit by paying authors $1.5 billion over accusations it illegally used copyrighted works to train its AI models.

Companies in the AI space have long been accused of scraping content made by others without permission, with the aim of training their large language models (LLMs). While Apple is among those that try to train AI as ethically as possible, some take other paths.

In one case, Anthropic is prepared to pay a considerable sum to end a major lawsuit.

Anthropic agreed to settle a lawsuit claiming it pirated content from authors to train the Claude chatbot, reports Reuters. The proposed agreement was released on Friday and will see Anthropic pay out at least $1.5 billion, plus interest.

The payout to authors works out to roughly $3,000 per book. The creators of approximately 500,000 works are expected to receive payouts, and if valid claims exceed that total, Anthropic will pay an additional $3,000 for each extra work.
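The headline figure lines up with the per-work rate; a quick sanity check using the approximate numbers above:

```python
# Rough check of the settlement math from the reported figures
works = 500_000     # approximate number of qualifying works
per_work = 3_000    # payout per book, in dollars
total = works * per_work
print(f"${total:,}")  # $1,500,000,000, i.e. $1.5 billion
```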

Anthropic has also agreed to destroy the data it had collected for training.

The agreement isn't final; it remains a proposal until the court approves it. That hearing is expected to take place on September 8.

Anthropic said that it is still committed to "developing safe AI systems" to advance scientific discovery and solve complex problems. However, the company did not admit liability in its statement.

An expensive mistake

The lawsuit dates back to 2024, with authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson taking on the company over accusations it used pirated materials.

The accusations covered not just downloading pirated works, but also creating digital copies of printed books.

In June, Anthropic caught a break when a judge ruled that training AI models on books constitutes fair use. However, at the same time, the authors managed to convince the court that building a library of pirated digital books, even if it isn't used to train models, is not fair use.

A trial had been scheduled for December to determine how much Anthropic owed authors over the piracy allegations. Had that court action continued, Anthropic could have stood to pay many times more than the settlement amount.

The lawsuit serves as a warning to companies like Apple that are working on their own AI projects. Sourcing training materials continues to be a challenge, and with the threat of hefty settlements, those materials need to be gathered more carefully.

Despite the complaints from copyright holders, some firms are still grabbing whatever they can.

In August, Perplexity was caught ignoring robots.txt, a file that websites use to tell crawlers which pages they may access. It was found to be using two bots: one adhering to the file, and another bypassing it.
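For illustration, a well-behaved crawler consults a site's robots.txt before fetching any page. A minimal sketch using Python's standard library (the rules and bot name here are hypothetical):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules a site might serve to limit crawling
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(rules)

# A compliant crawler checks the rules before each request
print(parser.can_fetch("ExampleBot", "https://example.com/private/page"))  # False
print(parser.can_fetch("ExampleBot", "https://example.com/public/page"))   # True
```

A bot that skips this check, or identifies itself differently to dodge a matching rule, is exactly the behavior Perplexity was accused of.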