In a nuanced decision that could shape the intersection of copyright law and artificial intelligence for the foreseeable future, Judge William Alsup of the U.S. District Court for the Northern District of California found that “the purpose and character of using copyrighted works to train LLMs to generate new text was quintessentially transformative.”
But while his decision in Bartz v. Anthropic PBC validates legitimate AI training as consistent with fair-use doctrine, Alsup draws firm boundaries against digital piracy—even when conducted for transformative purposes.
The Three-Part Decision
Alsup’s decision reflects a sophisticated three-part analytical framework that recognizes the distinct copyright implications of different AI-development practices. Rather than treating all copying as a monolithic “AI training” use, the court carefully parsed Anthropic’s conduct into separate analytical buckets:
- Copies used specifically to train large language models (LLMs), which the court found transformative under traditional fair-use analysis;
- The format conversion of legitimately purchased print books into digital library copies, which qualified as fair use under a distinct format-shifting rationale; and
- The creation of a general-purpose digital library using pirated materials, which failed fair-use scrutiny entirely.
This tripartite structure proves crucial, because each category of copying serves different purposes, involves different acquisition methods, and implicates different copyright concerns. Each therefore requires independent analysis under the four-factor fair-use framework.
Importantly, the court’s analysis was limited to input-side copying and did not address whether AI-generated outputs might themselves infringe copyright, noting that “[a]uthors do not allege that any LLM output provided to users infringed upon [a]uthors’ works,” and explicitly stating that “if the outputs were ever to become infringing, [a]uthors could bring such a case.”
The court’s methodical separation of these use cases demonstrates how courts will likely approach future AI copyright cases: not as broad categorical determinations about AI training, but as nuanced examinations of specific copying practices and their individual justifications.
The Transformative Nature of AI Training
The court’s analysis of the use of copies in AI training represents perhaps the most significant judicial endorsement to date of AI development under fair-use doctrine. Alsup grounded his reasoning in an analogy to human learning, noting that “like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them—but to turn a hard corner and create something different.”
This framing proved crucial in distinguishing the case from Thomson Reuters Enter. Centre GmbH v. Ross Intell. Inc., where the court found against fair use. The key distinction lay in the nature of the AI system: Ross involved training “a competing AI tool for finding court opinions in response to a given legal topic,” which was “not transformative.” By contrast, Anthropic’s Claude represents generative AI that creates new content, rather than merely replicating the function of existing databases.
The court’s analysis was bolstered by a critical factual finding: no infringing content ever reached users. Alsup emphasized that “[a]uthors do not allege that any infringing copy of their works was or would ever be provided to users by the Claude service.” This absence of direct downstream infringement proved dispositive, as the court noted that filtering software prevented any exact copies or substantial reproductions from reaching the public.
The decision also rejected the authors’ argument that computers should be treated differently from humans in learning contexts. In particular, in rejecting the claim that an LLM’s learning of a source work’s creative techniques fell outside fair use, the court observed that copyright “does not extend to ‘method[s] of operation, concept[s], [or] principle[s],’” thus analogizing a work’s intangible elements to a “method of operation.”
Legitimate Format Shifting as Fair Use
In a separate but equally important holding, the court found that Anthropic’s conversion of purchased print books to digital formats constituted fair use under a format-shifting theory. Drawing on precedents from Sony Betamax, Texaco, and the Google Books litigation, Alsup concluded that “storage and searchability are not creative properties of the copyrighted work itself but physical properties of the frame around the work or informational properties about the work.”
This analysis proved crucial, because it established that legitimate acquisition followed by format conversion differs fundamentally from unauthorized copying. The court emphasized that “every purchased print copy was copied in order to save storage space and to enable searchability as a digital copy. The print original was destroyed. One replaced the other.” Importantly, there was no evidence that digital copies were shared outside the company.
The decision carefully cabined this holding, noting that while the authors “might have wished to charge Anthropic more for digital than for print copies,” the U.S. Constitution’s language “nowhere suggests that [the copyright owner’s] limited exclusive right should include a right to divide markets or a concomitant right to charge different purchasers different prices for the same book.” This reasoning reinforces the first-sale doctrine, while accommodating technological needs for format conversion.
The Piracy Exception: Where Fair Use Fails
The court’s treatment of pirated library copies represents the decision’s most significant limitation on AI companies. Despite finding the ultimate training-use transformative, Alsup firmly rejected the notion that transformative downstream use can cure upstream piracy. As he colorfully noted, citing Anthropic’s oral arguments:
You can’t just bless yourself by saying I have a research purpose and, therefore, go and take any textbook you want. That would destroy the academic publishing market if that were the case.
The court identified the fundamental flaw in Anthropic’s approach: building “a central library of works to be available for any number of further uses” constituted a separate use from training LLMs. Critically, Anthropic retained pirated copies “even after deciding it would not use them or copies from them for training its LLMs ever again.” This retention for general purposes, rather than for a specific transformative use, proved fatal to Anthropic’s fair-use defense for its storage of the pirated materials.
Alsup’s analysis drew important distinctions from cases where intermediate copying was excused. Unlike Perfect 10 or Kelly v. Arriba Soft, where copies were “immediately transformed into a significantly altered form” and deployed directly into transformative uses, Anthropic’s pirated copies were “downloaded and maintained ‘forever’” for a general-purpose library.
The decision explicitly rejected the argument that eventual transformative use can retroactively justify initial piracy, emphasizing that “each use of a work must be analyzed objectively” under Warhol’s framework. This holding establishes that AI companies cannot rely on fair use as a blanket justification to acquire copyrighted materials through unauthorized means.
Practical Implications for AI Development
If the Anthropic decision gains traction in other courts, it would provide a clear roadmap for AI companies navigating copyright law. First, strategic sourcing beats piracy: companies should acquire training materials through legitimate channels—purchase, licensing, or authorized access. The court’s finding that format shifting of legitimately acquired materials constitutes fair use further provides a viable path for companies needing digital copies for computational purposes.
Second, output liability emerges as the next frontier. Because no infringing content reached users in this case, future litigation will likely focus on whether AI-generated outputs themselves infringe copyright. The court explicitly noted that “if the outputs were ever to become infringing, [a]uthors could bring such a case.”
Third, the decision signals potential legislative solutions. Generative AI may well create hurdles for creators seeking to monetize their work. But new markets could also emerge that allow artists to be remunerated through, for example, some new property right in “name, image, and likeness” or a similar “creative identity.” That would, however, likely require legislative enactment, perhaps by amending the Lanham Act to extend the concept of trademark.
The Road Ahead
The Bartz v. Anthropic case will proceed to trial on the pirated library copies, assuming no further motions resolve the claim first. More broadly, the District Court’s decision establishes a framework that courts will likely follow in future AI copyright cases. The emphasis on legitimate acquisition, the recognition of training as transformative use, and the firm rejection of a piracy library as a shortcut provide clear guideposts for both AI developers and copyright holders.
As the first comprehensive judicial analysis of AI training under fair-use doctrine, Bartz v. Anthropic represents a landmark moment in technology law. In validating transformative AI training, while maintaining copyright’s core protections, Judge Alsup has charted a course that promotes innovation while preserving creators’ fundamental rights—a balance that will prove essential as AI technology continues to evolve.