In Part 1 of this series on fair use in training large language models (LLMs), we discussed Judge Alsup’s decision in Bartz v Anthropic, which found that copying books to train an LLM was fair use, but using pirated books to create a central library was not.1

Mere days after Anthropic was released, another decision considered fair use in training LLMs. In Part 2 of our series, we summarize the key findings of Judge Chhabria’s decision in Kadrey v Meta.2 Stay tuned for Part 3, where we contrast the two decisions and draw connections to Canada.


Background

In Meta, 13 authors accused Meta of using their works to train its LLM, Llama. To do so, Meta downloaded “shadow libraries,” i.e., repositories of books (some pirated) that included the plaintiffs’ works.

The plaintiff authors moved for summary judgment on infringement and Meta brought a cross-motion for summary judgment on fair use.

Ultimately, Judge Chhabria found that: (i) Meta’s use of the works to train an LLM was fair use given the transformative nature of the use and the lack of impact on the authors’ market; (ii) the authors’ potential market for licensing their works to train AI was irrelevant to the fair use analysis; and (iii) AI innovation would not be impeded by a finding against fair use.

The use was transformative and would not impact the authors’ market

As in Anthropic, two key factors animated Judge Chhabria’s finding that the use of the books to train Llama was fair use: (i) training LLMs is highly transformative; and (ii) the AI software would not impact the market for the plaintiffs’ books because it could reproduce no more than 50 words from any of them.

Notably, as part of its cross-motion, Meta led evidence that its use of the plaintiffs’ books in training Llama had no effect on the plaintiffs’ book sales.3 The plaintiffs led no evidence to the contrary and did not meaningfully address this argument. Judge Chhabria noted the concept of market dilution was highly relevant to the fair use analysis given LLMs’ ability to flood the market with competing works. Based on the record before him, Judge Chhabria found in favour of Meta; however, he acknowledged the result might be different with stronger evidence:

“Because the issue of market dilution is so important in this context, had the plaintiffs presented any evidence that a jury could use to find in their favor on the issue, factor four would have needed to go to a jury. Or perhaps the plaintiffs could even have made a strong enough showing to win on the fair use issue at summary judgment. But the plaintiffs presented no meaningful evidence on market dilution at all. Absent such evidence and in light of Meta’s evidence, the fourth factor can only favor Meta. Therefore, on this record, Meta is entitled to summary judgment on its fair use defense to the claim that copying these plaintiffs’ books for use as LLM training data was infringement.”4 

The court rejected the plaintiffs’ claim that Meta’s use affected their ability to license their works to train AI

In Judge Chhabria’s view, the plaintiffs’ potential market for licensing their books to train AI was irrelevant given this was the very use that Meta claimed was fair. In other words, the plaintiffs’ argument was circular – whether the plaintiffs had a right to license their books to train LLMs depended on whether this use was fair. This is the same finding that Judge Alsup made in Anthropic when he held “such a market for that use is not one the Copyright Act entitles Authors to exploit.”5

As we reported here, the US Copyright Office’s recent report on Copyright and AI encouraged the continued development of voluntary licensing of copyrighted works to train AI models. Judge Alsup’s and Judge Chhabria’s findings may reduce incentives to enter these types of licences; however, the law is still sufficiently unsettled that licensing may make sense if only to reduce legal risk and uncertainty.

The court was unpersuaded that AI innovation would be stifled 

While Judge Chhabria found in favour of Meta, he rejected its submission that preventing Meta from using copyrighted books to train its LLMs would impede technological progress. Judge Chhabria noted that these AI products are expected to generate potentially “billions, even trillions, of dollars” for the companies developing them and found it “ridiculous” to suggest that copyright enforcement would “stop this technology in its tracks.” He further clarified that a ruling against fair use does not mean AI companies must cease developing their technology. Instead, it means they need to pay for licences to use copyrighted materials as training data.

Conclusion

As in Anthropic, Meta found that using copyright-protected works to train LLMs is fair use based on the transformative nature of the LLMs and the lack of impact on the authors’ market. While the ultimate holdings in these two decisions are similar, there are important differences. In Part 3 of our series, we explore these differences and consider potential connections with Canada’s law on fair dealing.


Footnotes

1 Bartz et al v Anthropic, 3:24-cv-05417-WHA.

2 Kadrey v Meta Platforms, Inc., 3:23-cv-03417-VC.

3 Meta, p. 33.

4 Meta, p. 40.

5 Anthropic, p. 28.


