Biblioracle: Authors sue ChatGPT creator OpenAI

Under the umbrella of the Authors Guild, a group of 17 prominent and bestselling authors including Jonathan Franzen, Jodi Picoult, George Saunders, Elin Hilderbrand and George R.R. Martin have sued OpenAI, the developer of large language model ChatGPT, for copyright infringement.

They allege that OpenAI has “copied” their work “wholesale” by using that work to train its models and that this copying may allow for derivative works to be generated without permission or compensation.

This lawsuit follows previous actions by other authors, including comedian Sarah Silverman and Pulitzer-winner Michael Chabon, as well as similar suits against image-generating platforms such as OpenAI’s DALL-E, alleging that the work of writers and artists has been a vital component of the development of this technology, something for which permission should have been sought.

Here’s my feeling about these suits.

Good.

Now, I don’t presume to know what the outcomes of these lawsuits will be, and while my sympathies are with authors (for obvious reasons), I don’t necessarily think these AI companies must be stopped in their tracks.

But I do believe we’re probably past the time to figure out how the rights of creators can be appropriately protected in this new technological era. The U.S. copyright system is a decidedly imperfect vehicle for balancing the interests of publishers, writers, scholars and readers, but it’s what we have. Sometimes when something new comes on the scene, the only way to decide what’s what is to have a fight about it. In our system, a lawsuit is a way to have that fight.

There are several different issues at work here. One is that the plaintiffs allege the AI models have been trained on pirated copies of the original texts, a form of receiving (or repurposing) stolen goods.

The biggest issue concerns what constitutes “fair use” under copyright law. Fair use allows for the use of “limited portions” of the work of others for “purposes such as commentary, criticism, news reporting, and scholarly reports.” How much are people allowed to use?

Insert shruggy emoticon. There are no hard and fast guidelines in terms of a number of words or percentage of another work. The law says fair use “depends on all the circumstances.”

You might notice that in that list of activities covered by fair use, “training a large language model” isn’t among them, but the inclusion of “such as” suggests the list isn’t exhaustive, and OpenAI will fall back on a defense that any text ChatGPT or its more advanced cousins generate will be “original” rather than a copy of any author’s specific copyrighted work.

Needless to say, these are complicated matters, and the initial response from OpenAI suggests they’d like to reach some kind of compromise with the authors and the Authors Guild, perhaps in order to avoid a worst-case scenario where they must identify, notify and compensate any author whose work has wound up in the training data.

That number runs into the millions.

Perhaps they also fear the discovery process of litigation, which may require them to reveal the content of the training data, something OpenAI and other developers of LLMs have resisted thus far.

I honestly don’t know if what OpenAI has done in creating ChatGPT constitutes fair use, and I’m not sure anyone else does either. The potential existence of this type of technology was not a consideration when the laws and guidelines were written.

What I do know is that some transparency would be welcome. Allowing powerful tech companies to operate with impunity simply because they declare themselves innovative is no way to protect ourselves from exploitation.

John Warner is the author of “Why They Can’t Write: Killing the Five-Paragraph Essay and Other Necessities.”

Twitter @biblioracle

Book recommendations from the Biblioracle

John Warner tells you what to read based on the last five books you’ve read.

1. “Lillian Boxfish Takes a Walk” by Kathleen Rooney

2. “Prom Mom” by Laura Lippman

3. “The Dive from Clausen’s Pier” by Ann Packer

4. “Stones for Ibarra” by Harriet Doerr

5. “Captain Corelli’s Mandolin” by Louis de Bernieres

— Beth P., Chicago

Interesting array of books here. I’m hoping the sweet and spare story of two people finding each other late in life will be a good fit. The book is “Our Souls at Night” by Kent Haruf.

1. “When Breath Becomes Air” by Paul Kalanithi

2. “The Covenant of Water” by Abraham Verghese

3. “Demon Copperhead” by Barbara Kingsolver

4. “Tom Lake” by Ann Patchett

5. “The Lincoln Highway” by Amor Towles

— Mary M., Lake Forest

I see it’s been quite some time since I recommended “Mrs. Bridge” by Evan Connell. I’ve never met someone who appreciates carefully written books like those above who has not found the novel amazing.

1. “Holly” by Stephen King

2. “American Prometheus” by Kai Bird and Martin J. Sherwin

3. “I’m Glad My Mom Died” by Jennette McCurdy

4. “Pageboy” by Elliot Page

5. “Tomorrow, and Tomorrow, and Tomorrow” by Gabrielle Zevin

— Jillian N., Burlington, Vermont

“Perfect Tunes” by Emily Gould is a mother/daughter story set against a backdrop of music and art and I think it’s a good fit for Jillian.

Get a reading from the Biblioracle

Send a list of the last five books you’ve read and your hometown to biblioracle@gmail.com
