Max Fairuse III, Esq.
A federal court in San Francisco has delivered a split decision in a closely watched copyright case between AI company Anthropic and a group of authors, marking a key moment in the legal battle over how generative AI systems may use copyrighted material.
Senior U.S. District Judge William Alsup ruled that Anthropic’s use of copyrighted books to train its large language model, Claude, qualifies as “fair use” under U.S. copyright law — provided the books were obtained legally. The court found that training a model on purchased or digitized books was “exceedingly transformative” and aligned with copyright’s purpose of fostering new creativity.
Anthropic had purchased physical books, digitized them, and incorporated them into a centralized training dataset. Alsup determined this practice did not constitute copyright infringement because it involved no redistribution and served a fundamentally different purpose than the original works.
However, the judge allowed the case to proceed to trial over Anthropic’s alleged use of pirated books. According to court documents, the company downloaded millions of copyrighted works from unauthorized sources to build what it reportedly called a “central library of all the books in the world.” Judge Alsup rejected the argument that this use was necessary or justified under fair use, opening the door to significant statutory damages — potentially up to $150,000 per infringed work.
The lawsuit was filed in 2024 by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who allege that Anthropic used complete copies of their books without permission. Their suit, brought as a class action, could set a precedent for other creators challenging how AI companies collect and use data in model training.
In a statement, Anthropic welcomed the court’s recognition that training on lawfully obtained material is transformative and legally protected. The company expressed disagreement with the decision to hold a trial over the pirated material, asserting that all data was acquired for the sole purpose of building AI models.
The Authors Guild, an advocacy organization supporting the plaintiffs, praised the court’s willingness to address the piracy issue and noted that damages for willful infringement could be considerable.
This ruling joins a growing list of judicial decisions exploring the contours of copyright law in the age of artificial intelligence. In a separate case decided the same week, Judge Vince Chhabria ruled in Meta's favor on claims over its training of AI models on copyrighted books, finding that the plaintiffs had failed to demonstrate market harm. That decision, however, was limited to the facts at hand and left room for future cases to succeed on stronger evidence.
Legal experts say these early rulings are beginning to establish a framework: using copyrighted material may be lawful when obtained properly and used in a genuinely transformative way, but reliance on pirated content remains a clear legal risk.
A trial to assess damages related to Anthropic’s use of pirated material is scheduled for December.
TLDR:
A federal judge ruled that Anthropic’s training of AI models on legally obtained books qualifies as fair use. However, the court will proceed to trial over claims the company used pirated books, which could lead to significant copyright damages. The ruling offers guidance for future AI copyright cases, affirming lawful, transformative training uses while rejecting unauthorized data acquisition.