This is a cross-post of my guest blog for Harvard’s Fair Use Week Series.

The last six months or so have seen the seemingly sudden appearance of several startlingly powerful tools that create complex new textual and visual works in response to relatively simple prompts. You probably know at least a couple by name: chatGPT (for text) and Stable Diffusion (for images) are the ones that seem to have taken over my social feeds. These tools are creating a buzz in part because the works they generate are of sufficient quality that they could pass for or replace the work of humans, at least in some contexts. This raises a laundry list of policy questions, some as old as the story of John Henry (will machines put humans out of work?), others as 21st Century as data sovereignty (how can nations govern data pertaining to their citizens when it flows seamlessly around the globe?).

In copyright world - including in some of the inevitable raft of lawsuits - the question has been put more narrowly: do these computer tools violate the copyrights of the works that are used to “train” them? Lots of smart people have opined on this already, so I don’t want to go too deeply down this rabbit hole myself. The technical legal answer I favor is straightforward, and the very short version is that there’s no meaningful difference between these tools and the other “non-consumptive”/computational uses that courts have already blessed as fair use many times over. These uses are fair because precedent pretty clearly says they are. Maybe I’m being too glib about the technical legal answer, but in any case, I want to answer a different question.

Why should we embrace this (IMO) fact about the law, that fair use generally protects tools like chatGPT and Stable Diffusion against copyright liability? Even if we have legitimate concerns about the impacts of these technologies, we should recognize these are not copyright concerns and stand by fair use and the robots’ right to read. I think the answer is rooted in copyright’s purpose, and the corresponding limits in its scope.

In a nutshell, my argument is this: The exclusive rights in copyright law are not well-tailored to the law’s public interest purpose. Applied broadly and literally (I’ll call this “copyright literalism”), the exclusive rights in the law threaten to chill uses that benefit the public and that do not result in the kind of unfair competition that copyright was meant to prevent. Fair use exists in part to shield legitimate uses from copyright literalism and contain copyright to its intended domain. The application of copyright’s exclusive rights to computer-generated works is copyright literalism par excellence—it punishes literal copying even though the final result is non-infringing and the putative harm to the copyright holder (the creation of new non-infringing works that are cheaper and easier to produce) is not the kind of harm that copyright exists to prevent.

(NB: I realize that in some cases these technologies can be tricked into reproducing their training materials, and of course in these cases the outputs likely are infringing. I’m addressing here the argument that computer-generated works that are the result of a process involving “training” with in-copyright works are per se infringing.)

Article I, Section 8, clause 8 of the US Constitution gives Congress the power to create copyrights (and patents). Crucially, the clause specifies the purpose of this power: “To promote the Progress of Science and useful Arts.” Granting copyrights “for limited times” (a term of 14 years at the time that clause was written) is a means to an end, which ideally Congress and the courts should bear in mind as they consider how to modify or apply the law.

Congressional action has not always been guided by this principle (witness the extension of copyright term by more than a century despite little evidence of any public benefit), but courts, especially the Supreme Court, acknowledge copyright’s public interest purpose all the time. For example, here’s Justice Kagan in Kirtsaeng v. John Wiley & Sons, Inc., 136 S. Ct. 1979, 1986 (2016):

“[C]opyright law ultimately serves the purpose of enriching the general public through access to creative works.”

And Justice O’Connor in one of my personal favorites, Feist Pubs., Inc. v. Rural Tel. Svc. Co., Inc., 499 U.S. 340, 349 (1991):

“The primary objective of copyright is not to reward the labor of authors, but ‘[t]o promote the Progress of Science and useful Arts.’”

And Twentieth Century Music Corp. v. Aiken, 422 U.S. 151, 156 (1975):

“[P]rivate motivation must ultimately serve the cause of promoting broad public availability of literature, music, and the other arts.”

And Fox Film Corp. v. Doyal, 286 U. S. 123, 127 (1932):

“The sole interest of the United States and the primary object in conferring the monopoly lie in the general benefits derived by the public from the labors of authors.”

The consequences of all this for fair use become clear in a pair of Supreme Court cases that enshrine fair use (alongside the idea/expression dichotomy) as a core, constitutionally mandated element of copyright law.

Public Interest Safety Valve(s)

Two cases sought to challenge the unprecedented expansion of copyright’s length and strength at the end of the 20th century. Eldred v. Ashcroft challenged the retroactive addition of 20 years to existing copyright terms, then Golan v. Holder challenged the restoration of copyright for works that had previously entered the public domain. In both cases the challengers argued that the law had intruded impermissibly on the public’s constitutional interests by starving the public domain, but in both cases the Supreme Court declined to second-guess Congress’s judgment.

To soften these blows to the public’s constitutional interest in copyright, the Court highlighted in Eldred (and reiterated in Golan) the presence of two key “First Amendment accommodations” in the law: fair use and the idea/expression dichotomy (the principle that copyright does not protect abstract ideas, only particular creative expressions). These doctrines ensure that even during the term of copyright, the public has some leeway to use copyright-encumbered works.

This is important because the literal scope of the exclusive rights in copyright is breathtakingly broad - reproduction, distribution, adaptation - there is hardly anything you can do with a copyrighted work that doesn’t involve one of these activities, especially in a digital context. And copyright infringement is what’s called a “strict liability” offense: there is no requirement that the alleged infringer have a bad intent in engaging in any of these acts. If not for fair use (and the body of other limitations and exceptions, including the idea/expression dichotomy), copyright would be a breathtakingly powerful private right to control others’ engagement with culture and knowledge.

One more thread bears surfacing in this conversation: the role of copyright and fair use in fostering competition. The Supreme Court emphasized this role in its most recent fair use opinion, Google v. Oracle. In that case, Justice Breyer describes fair use’s role in the context of software copyrights:

fair use can play an important role in determining the lawful scope of a computer program copyright… It can distinguish between expressive and functional features of computer code where those features are mixed. It can focus on the legitimate need to provide incentives to produce copyrighted material while examining the extent to which yet further protection creates unrelated or illegitimate harms in other markets or to the development of other products. In a word, it can carry out its basic purpose of providing a context-based check that can help to keep a copyright monopoly within its lawful bounds.

As examples of how fair use has played this role in the past, Justice Breyer cited cases like Sony v. Connectix and Sega v. Accolade, cases where software engineers made copies of protected works in a process that resulted in the development of new, non-infringing software. Yes, these cases say, there is literal copying involved in this process, but the end result (and the only thing offered to the public in competition with the works that were copied “behind the curtain”) is something new and non-infringing - exactly the kind of creativity copyright is meant to promote, not discourage. So, fair use acts as a context-based check on the otherwise overly broad literal scope of copyright’s exclusive rights, shielding these intermediate, back-room, pro-competitive copies from liability and enabling the creation of valuable new works.

Similarly, in the Oracle case, Justice Breyer, writing for the Court, held that Google had created a valuable new work in the Android mobile operating system, and that the use of elements of Oracle’s Java language to enable programmers to interact more easily with Android was fair.

If you’re with me so far, you’ve seen:

  1. That copyright is for the public, and it grants rights to authors only as a means to the end of public enrichment.
  2. That copyright’s exclusive rights are so broad on their face that they increasingly (and, in a digital age, inevitably) apply to uses that don’t harm, and in many cases affirmatively advance, copyright’s public purpose.
  3. That fair use is one of the most important safeguards against this outcome, and in particular,
  4. That fair use applies where literal copying happens “behind the scenes” as part of a process whose result advances the purpose of copyright, as in reverse engineering, in the search engine cases, and in Oracle.

It shouldn’t be hard to see, then, why copyright law should be positively disposed toward the application of fair use to enable the creation of tools that generate new, non-infringing works. Yes, there is literal copying involved in “training” these computer tools, but the end result is increased production of new creative works, the very public benefit that copyright is meant to encourage. These back-end uses should not be cognizable as violations of copyright. The alternative is to turn copyright on its head, discouraging the creation of new works in order to vindicate a private monopoly.

Mandatory chatGPT postscript: I asked the robot “What is the purpose of copyright?” Here is its answer - not bad!

The purpose of copyright is to provide legal protection and exclusive rights to creators and owners of original works of authorship, such as literary, artistic, musical, and other creative works. The aim of copyright is to encourage and reward creativity and innovation by giving creators the exclusive right to control the use and distribution of their works for a limited time.

Copyright gives creators the ability to prevent others from reproducing, distributing, performing, displaying, or creating derivative works based on their original works without their permission. By providing this protection, copyright allows creators to benefit financially from their works and to control the use and distribution of their works.

In addition to providing financial benefits for creators, copyright also helps to promote cultural and educational development by ensuring that creators have the incentive and means to produce new works. Copyright law also helps to ensure that creators receive recognition for their works and that their works are not misrepresented or misused in ways that could harm their reputation or integrity.