In August 2024, bestselling author Andrea Bartz and two fellow writers filed a lawsuit in the Northern District of California against Anthropic, the artificial intelligence firm behind the Claude language model. Their allegation was straightforward but deeply consequential: Anthropic had copied their copyrighted books—along with millions of others—without permission to train its generative AI models.
The lawsuit, Bartz v. Anthropic, quickly became one of the most closely watched copyright cases in the AI sector. At its core was a novel and unresolved legal question: Does training a large language model on copyrighted books constitute fair use under the U.S. Copyright Act?
In June 2025, U.S. District Judge William Alsup issued a ruling that delivered clarity—at least in part. His decision held that AI training using lawfully purchased books qualifies as fair use, while simultaneously preserving claims against Anthropic for storing and utilizing pirated content. The ruling draws a sharp line between lawful innovation and copyright infringement, establishing a foundational precedent for the rapidly evolving relationship between generative AI and intellectual property law.
Why This Case Matters: The Collision Between Copyright Law and AI Innovation
As generative AI platforms scale rapidly, so does scrutiny over their data sources. The central legal question emerging across jurisdictions is whether copying copyrighted material—particularly books—to train AI models qualifies as “fair use” under U.S. copyright law. In Bartz v. Anthropic, Judge Alsup provided the first detailed judicial answer.
That precedent may guide future AI development, litigation strategy, and public policy—not only in the United States but globally.
The Facts: Authors Sued Anthropic Over AI Training on Copyrighted Books
The plaintiffs—led by bestselling author Andrea Bartz—claimed that Anthropic unlawfully copied their works to train its Claude language model, profiting without permission or compensation. The court, however, parsed the issue carefully:
- Fair use was recognized for books Anthropic had lawfully purchased and digitized.
- Fair use was rejected for over 7 million pirated works obtained from online shadow libraries.
A separate trial is scheduled for December 2025 to determine liability and damages for the pirated content.
Factor 1: The Use Was “Exceedingly Transformative”
The first fair use factor evaluates purpose and character of the use. Here, the court embraced a robust definition of transformation. Anthropic’s Claude model did not copy books verbatim but used them to learn language patterns, allowing the model to generate novel, non-derivative text.
Judge Alsup likened the process to a human reading literature to become a better writer—a use far removed from simply copying or commercializing the original works.
Legal Takeaway: High transformation strongly supports fair use, particularly for AI training that does not mimic or compete with the original works.
Factor 2: Creative Works Matter—But Not Enough to Tip the Scale
This factor examines the nature of the copyrighted works. Fictional books are traditionally granted strong protection due to their expressive nature. However, the court downplayed this factor’s weight, especially when transformative use is present.
The transformative intent and lack of substitutive impact lessened the force of the creative nature argument.
Legal Takeaway: Expressive works may weigh against fair use, but not decisively where the use is non-exploitative and serves a new function.
Factor 3: Entire Books Were Copied—But That Was “Reasonable”
Critics argued that copying entire books should doom Anthropic’s fair use claim. But the court emphasized proportionality over quantity: for a large language model to learn effectively, full-text ingestion was deemed reasonably necessary. The decision echoes similar logic in Google v. Oracle, where Google’s copying of the declaring code of the Java API—roughly 11,500 lines—was ruled fair use because it served a transformative reimplementation purpose.
Legal Takeaway: Quantity is not disqualifying if full use is integral to a transformative function.
Factor 4: No Market Harm, No Fair Use Violation
Perhaps the most pragmatic analysis came here: the court found no actual or plausible market harm. Anthropic’s Claude outputs did not replace the original books or reduce demand. Moreover, there was no existing market for AI training licenses for books at the time Anthropic acquired them.
Legal Takeaway: In the absence of market substitution or demonstrable harm, this factor tilts heavily toward fair use.
Where Fair Use Fails: Pirated Books and the Line of Legality
The court drew a bright-line rule: pirated content is never protected under fair use, no matter how transformative the end use. Anthropic’s acquisition of millions of unauthorized digital books for training Claude remains actionable, and a trial will determine whether statutory damages apply.
Legal Risk: Any AI company using unlawfully sourced data—regardless of intention—faces exposure to copyright claims.
Global Implications and Regulatory Crosswinds
While U.S. courts appear willing to extend fair use to AI training on lawful materials, other jurisdictions are diverging. The EU AI Act (in force since August 2024, with general-purpose AI obligations applying from August 2025) requires model providers to comply with EU copyright law, including the Text and Data Mining (TDM) opt-out regime established under the Copyright in the Digital Single Market Directive. Jurisdictions like the UK and Canada are actively reviewing their copyright regimes to address similar issues.
Strategic Insight: Multinational AI firms must develop jurisdiction-specific data governance strategies.
Key Takeaways for AI Companies and Legal Counsels
- Fair use applies to AI—but only if source material is lawfully obtained.
- Transformative intent, not output similarity, is critical in court analysis.
- Pirated content remains strictly off-limits.