Parties: The New York Times v. OpenAI and Microsoft
Court: U.S. District Court, Southern District of New York
Claim: Copyright infringement. NYT alleges that OpenAI used millions of NYT articles to train ChatGPT without permission.
Damages sought: Billions of dollars (specific amount not disclosed)
New York Times v. OpenAI & Microsoft — The Biggest AI Copyright Case
Case Summary
The New York Times sued OpenAI and Microsoft in December 2023, alleging that ChatGPT was trained on millions of NYT articles without permission, creating a product that directly competes with and substitutes for NYT journalism. This is widely considered the most significant AI copyright case currently active.
Timeline
| Date | Event |
|---|---|
| Dec 27, 2023 | NYT files lawsuit in SDNY |
| Jan 2024 | OpenAI responds, claims fair use |
| Mid 2024 | Discovery phase begins |
| 2025 | Motions and depositions ongoing |
| Apr 2026 | Case remains in discovery/pre-trial |
Key Legal Issues
Is AI Training on News Articles Fair Use?
The central question: Can OpenAI legally copy millions of copyrighted articles to train a model that generates competing content?
NYT's Arguments:
- ChatGPT can reproduce NYT articles nearly verbatim
- The AI substitutes for NYT subscriptions (market harm)
- Training involved wholesale copying of copyrighted works
- OpenAI profits commercially from NYT's investment in journalism
OpenAI's Arguments:
- Training is transformative use (learning patterns, not copying)
- Similar to Google Books scanning, which the Second Circuit found to be fair use in Authors Guild v. Google (2015)
- AI outputs are original, not copies
- NYT benefits from being referenced by AI
Why This Case Matters
1. Scale: Millions of articles, billions in potential damages
2. Precedent: Could define whether AI training on copyrighted content is legal
3. Industry impact: Most major AI developers train on web content, so a ruling would reach well beyond OpenAI
4. Media implications: Could reshape how news organizations interact with AI
The Verbatim Reproduction Issue
According to NYT's complaint, ChatGPT can reproduce substantial portions of NYT articles word-for-word when prompted. This matters because:
- It suggests the model memorized copyrighted content
- Verbatim reproduction is harder to defend as "transformative"
- It supports NYT's claim of direct market substitution
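To make "word-for-word" concrete, here is a minimal sketch of one way to measure verbatim overlap between a source text and a model output: find the longest run of consecutive words the two share. This is an illustration only, not the method used by either party. The function names `longest_shared_run` and `verbatim_ratio` are hypothetical, and the sample texts are invented, not taken from the case record.

```python
from difflib import SequenceMatcher


def longest_shared_run(source: str, output: str) -> list:
    """Return the longest run of consecutive words present in both texts."""
    src = source.lower().split()
    out = output.lower().split()
    # autojunk=False disables difflib's "popular element" heuristic so that
    # common words are still allowed to take part in the match.
    matcher = SequenceMatcher(None, src, out, autojunk=False)
    match = matcher.find_longest_match(0, len(src), 0, len(out))
    return src[match.a : match.a + match.size]


def verbatim_ratio(source: str, output: str) -> float:
    """Fraction of the output's words covered by the longest shared run."""
    out_words = output.split()
    if not out_words:
        return 0.0
    return len(longest_shared_run(source, output)) / len(out_words)


# Invented sample texts (assumption: purely illustrative).
source = "the quick brown fox jumps over the lazy dog near the river bank"
output = "a model wrote that the quick brown fox jumps over the lazy dog today"

run = longest_shared_run(source, output)
print(len(run), "shared consecutive words:", " ".join(run))
print("share of output covered:", round(verbatim_ratio(source, output), 2))
```

A long shared run covering most of an output points toward memorization, while short runs are expected whenever two texts discuss the same subject. Where to draw that line is a legal and factual question this sketch does not answer.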
Potential Outcomes
If NYT Wins:
- AI companies may need to license training data
- Could trigger a wave of similar lawsuits from publishers
- Training data costs could reshape the AI industry
- Sets a precedent that other content creators could rely on
If OpenAI Wins:
- Validates AI training as fair use
- Content creators would have limited recourse
- Accelerates AI development without licensing friction
- The ruling may still be limited to the specific facts of this case
Related Developments
- U.S. Copyright Office report on Copyright and AI, Part 3 (pre-publication version, May 2025): concluded that AI training is not always fair use
- Judge expressed skepticism about blanket fair use defense
- Multiple other publishers have filed similar suits
Current Status
ACTIVE — In discovery phase as of April 2026. Trial date not yet set. Settlement discussions reportedly ongoing but no agreement reached.