The New York Times recently filed a lawsuit against OpenAI and Microsoft, alleging that the companies infringed its copyrights by using the Times’ content to train generative AI models such as GPT-4 and DALL-E 3. OpenAI has responded that such use of publicly available data constitutes fair use.
OpenAI’s defense centers on the concept of regurgitation, arguing that generative AI models are unlikely to replicate content verbatim from a single source like The New York Times. The company suggests that the examples cited in the lawsuit could result from users deliberately manipulating prompts (TechCrunch).
This lawsuit has amplified the ongoing copyright debate around generative AI. Critics like Gary Marcus and Reid Southen argue that AI systems can produce plagiaristic content even without specific prompts, challenging OpenAI’s position (TechCrunch).
The Times is not the first to sue OpenAI over alleged IP violations. Notable cases include lawsuits from comedian and author Sarah Silverman, novelists such as Jonathan Franzen and John Grisham, and several programmers against OpenAI and its partners (TechCrunch).
Interestingly, some news outlets have opted for licensing agreements with AI vendors instead of legal confrontations. The Associated Press and Axel Springer, for instance, have struck deals with OpenAI. However, the compensation from these agreements is relatively small compared to OpenAI’s revenues (TechCrunch).
The Times had itself been in talks with OpenAI about a partnership, but the negotiations fell through. Public opinion appears to favor the publishers, with a majority agreeing that AI companies should not use copyrighted content without compensation (TechCrunch).
As the lawsuit progresses, it will be pivotal in setting precedents for how generative AI can use publicly available content. It is a complex issue that balances innovation against the rights of content creators, and its outcome could have far-reaching implications for the future of both AI and journalism.