
Canada | Publication | June 5, 2025
The US Copyright Office (Office) recently released a pre-publication draft of Part 3 of its Report on Copyright and AI (Report). Part 3 addresses issues arising from the use of copyrighted works in developing and training generative AI systems, including infringement, fair use, and licensing.
We previously commented on Part 2 of the Report.
Key conclusions by the Office include that generative AI training engages various rights, fair use is flexible enough to address issues raised by generative AI training, and voluntary licensing arrangements are the preferred mechanism for authorizing access. Below we summarize Part 3 of the Report and outline some practical tips and considerations in view of the Report.
Generative AI models, including large language models (LLMs) and image generators, are developed through machine learning techniques. Central to improving their functionality is training on vast datasets. Many of these datasets may comprise copyrighted works. The training process involves multiple phases, including pre-training to learn general patterns, and post-training or fine-tuning for specific tasks.
Developers scrape, filter, clean, curate, and aggregate data into vast training datasets on which generative AI models learn to predict different outputs based on the data inputs and model “weights” optimized to achieve desired outputs.
The Office identifies several aspects of AI development that may implicate copyright owners’ rights:
The fair use doctrine is a defence to copyright infringement. Multiple factors are considered in a fair use assessment. The pre-publication guidance explores the factors in depth under Section IV.
Factor One (whether the allegedly infringing use has a further purpose or different character, which turns on transformativeness and commerciality) was considered in great detail in view of a recent Supreme Court decision. The Office noted that while adding new expression can be relevant to evaluating whether a use has a different purpose and character, it does not necessarily make the use transformative. Transformative use is not in itself an exception to infringement, but it is an important consideration in the fair use analysis and is closely linked to commerciality.
The Office rejects two common arguments about the transformative nature of AI training, that: (1) using copyrighted works to train AI models is inherently transformative, and (2) AI training is inherently transformative because it is like human learning.
At the same time, in the Office’s view, “training a generative AI foundation model on a large and diverse dataset will often be transformative,” as “[t]he process converts a massive collection of training examples into a statistical model that can generate a wide range of outputs across a diverse array of new situations,” that is “meant to perform a variety of functions, some of which may be distinct from the purpose of the copyrighted works they are trained on.”
However, the Office also noted that “although transformativeness often leads to a finding of fair use, not every transformative use is a fair one,” and that “[u]ses that merely change the medium, or spare the user inconvenience, are not transformative.” The Office took a nuanced view of transformativeness for generative AI, noting that “generative AI models may simultaneously serve transformative and non-transformative purposes.”
The Office considered retrieval-augmented generation (RAG) separately and noted that “use of RAG is less likely to be transformative where the purpose is to generate outputs that summarize or provide abridged versions of retrieved copyrighted works, such as news articles, as opposed to hyperlinks.”
The Office focused most on whether the output would compete with the original work. Training a model for research may be highly transformative, while training models to generate outputs that compete with or closely resemble copyrighted works is not.
The commerciality of the use and whether access to the training data was lawful or authorized are also relevant considerations. Ultimately, the Office noted that some uses of copyrighted works in generative AI training will qualify as fair use while others will not, and courts will have to weigh all relevant factors in any given case. The Office identified issues with “data laundering” by non-commercial entities (e.g., academic and non-profit researchers), and noted that “commerciality does not turn solely on whether an organization is designated as ‘profit’ or ‘nonprofit,’ but whether the use itself furthers commercial purposes” and “the nonprofit status of an organization should not in itself preclude a finding of commerciality.”
For pirated or illegally accessed works, the Office’s view was that “knowing use of a dataset that consists of pirated or illegally accessed works should weigh against fair use without being determinative.”
Factor Two (nature of the copyrighted works) was considered only briefly. The Office simply noted that “the facts will vary depending on the model and works at issue,” contrasting highly creative works like novels with works that have more factual or functional content, like computer code or scholarly articles.
Factor Three (amount and substantiality of the portion used in relation to the copyrighted work as a whole) is relevant in relation to the nuances of how the training set is curated or accessed. The Office noted that “[d]ownloading works, curating them into a training dataset, and training on that dataset generally involve using all or substantially all of those works. Such wholesale taking ordinarily weighs against fair use,” and that “[t]he Office agrees that the use of entire copyrighted works is less clearly justified in the context of AI training than it was for Google Books or a thumbnail image search.” Finally, the Office concedes “the use of entire works appears to be practically necessary for some forms of training for many generative AI models.”
For Factor Three, the Office also noted that “the third factor may weigh less heavily against generative AI training where there are effective limits on the trained model’s ability to output protected material from works in the training data,” and “[a]s in the intermediate copying cases, generative AI typically do not make all of what was copied available to the public.”
Factor Four (effect on potential market - lost sales (e.g., market substitution), market dilution, and lost licensing opportunities) was also considered in detail. For RAG in particular, it was noted that “retrieval of copyrighted works by RAG can also result in market substitution,” and “RAG augments AI model responses by retrieving relevant content during the generation process, resulting in outputs that may be more likely to contain protectable expression, including derivative summaries and abridgments.”
The Office took a wide perspective in terms of defining the “effect” on the market, rejecting a narrower view proposed by some commenters that the fourth factor analysis considers only harm to markets for the specific copyrighted work.
In terms of lost licensing opportunities, the Office noted that “[m]any commenters stated that individual and collective licenses for AI use were already in existence or under development” (in respect of public licensing deals), taking the position that “[w]here licensing markets are available to meet AI training needs, unlicensed uses will be disfavored under the fourth factor.”
The Office considered both theoretical and practical issues relating to voluntarily licensing copyright works to train AI models and potential mechanisms for government intervention, including collective and compulsory licensing.
Overall, the Office suggested voluntary licensing was preferred and should continue to develop. The Office noted some AI models have been trained exclusively on licensed or public domain works, but recognized there remained challenges and contexts where voluntary licensing may prove infeasible. These challenges include identifying individual rightsholders and negotiating terms for large, diverse datasets.
The Office also noted that collective management organizations (CMOs) can reduce transaction costs and facilitate bulk licensing. Extended collective licensing (ECL) could be an avenue to address market failures in voluntary licensing. ECL would operate through CMOs and resemble voluntary collective licensing but with more oversight and an opt-out mechanism (rather than an opt-in mechanism). This would ensure a license could be obtained for all works of a particular class and resolve issues relating to identifying and licensing disparate works/owners.
The Office did not consider statutory compulsory licensing to be a workable solution, suggesting it should be deployed only as a last resort if market-based solutions prove unworkable.
The Office recognized that voluntary licensing may not be feasible at scale, but suggested that it should continue to develop without government intervention and that the proposed alternatives should be considered only after market failures in voluntary licensing have been identified.
Generative AI developers should consider licensing where feasible. Where it is not feasible, a careful assessment of the source, purpose, and use of copyrighted works in training AI will provide a sense of risks and whether fair use may be available. Developers should also exercise caution in sharing model weights, particularly if the model can reproduce substantial portions of copyrighted works as a result.
For more information, please contact your IP professional at Norton Rose Fulbright Canada LLP.
© Norton Rose Fulbright LLP 2025