Anthropic Offers $1.5 Billion Settlement to Authors Over AI Training Data

In what would be the largest copyright infringement payout in United States history, Anthropic has offered to settle claims from approximately 500,000 authors whose works were allegedly used without permission to train the Claude AI model.

The Numbers

  • Settlement amount: $1.5 billion total
  • Per author: Approximately $3,000 each
  • Authors affected: ~500,000
  • Dataset in question: The Pile (open-source dataset that included copyrighted works)
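
The per-author figure follows from dividing the total by the number of authors. A minimal sketch of that arithmetic, using only the approximate figures reported above; the function name is illustrative, and the calculation assumes an even split with no deduction for legal fees or administrative costs:

```python
# Figures as reported in this article; both are approximate.
SETTLEMENT_TOTAL_USD = 1_500_000_000
AUTHORS_AFFECTED = 500_000


def per_author_share(total_usd: int, authors: int) -> float:
    """Average gross payout per author, assuming an even split with no fees."""
    if authors <= 0:
        raise ValueError("authors must be positive")
    return total_usd / authors


share = per_author_share(SETTLEMENT_TOTAL_USD, AUTHORS_AFFECTED)
print(f"${share:,.0f} per author")
```

Any fees or costs taken out of the fund would lower the amount an individual author actually receives.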

Background

In August 2024, authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson sued Anthropic for using their copyrighted books to train Claude. The works had been included in "The Pile," a dataset that was intended to collect open-source and public-domain works but inadvertently included copyrighted material.

The Legal Journey

1. August 2024: Authors file lawsuit against Anthropic

2. Early 2025: Judge William Alsup rules that Anthropic's use of purchased books for training was fair use

3. Mid 2025: Alsup rules that using unlicensed works from The Pile was NOT fair use

4. August 2025: Anthropic offers $1.5 billion settlement

5. Late 2025: Judge Alsup rejects the settlement

Why the Judge Rejected the Settlement

Judge Alsup rejected the proposed settlement, citing shortcomings in its terms and a concern that the deal would be forced "down the throat of authors." Key concerns included:

  • Inadequate per-author compensation: $3,000 per author was deemed insufficient given the scale of use
  • Overly broad release: The settlement would have released Anthropic from future claims too broadly
  • Opt-out complexity: The process for authors to opt out was unnecessarily complicated
  • Lack of transparency: Insufficient disclosure about which specific works were used

The Dual Ruling on Fair Use

Judge Alsup's earlier rulings created an important distinction:

Fair Use (Purchased Books)

Anthropic legally purchased physical books and digitized them for training. The court found this transformative enough to qualify as fair use because:

  • The books were legally acquired
  • The use was transformative (training, not reproduction)
  • The AI does not reproduce the books verbatim

NOT Fair Use (The Pile)

Using copyrighted works from The Pile dataset was ruled NOT fair use because:

  • The works were not licensed or purchased
  • The dataset creators had not obtained permission
  • Some works were used even though they had been removed from The Pile after copyright concerns were raised

What This Means

For Authors and Creators

  • Your copyrighted works have value in the AI training context
  • Unlicensed use of your work for AI training may constitute infringement
  • Class action lawsuits can result in significant compensation
  • But settlement terms matter: a court can reject a deal it finds inadequate for authors

For AI Companies

  • Simply using open-source datasets does not guarantee legal safety
  • Due diligence on training data provenance is essential
  • Licensing copyrighted content is becoming a business necessity
  • Settlement costs can be enormous — prevention is cheaper

For the Industry

  • This case establishes that AI training on unlicensed copyrighted works carries real financial risk
  • The $1.5 billion figure (even if rejected) signals the scale of potential liability
  • AI companies are increasingly pursuing licensing deals proactively

Current Status

As of early 2026, the case remains unresolved. Anthropic and the authors are negotiating revised settlement terms that address Judge Alsup's concerns. A new settlement proposal is expected in 2026.

Comparison to Other AI Copyright Settlements

| Case | Amount | Per Creator | Status |
|---|---|---|---|
| Bartz v. Anthropic | $1.5B | ~$3,000 | Rejected, renegotiating |
| Getty v. Stability AI | TBD | TBD | Active |
| NYT v. OpenAI | TBD | TBD | Active |

The Anthropic case, even in its rejected form, sets a benchmark for what AI companies may need to pay for unauthorized use of copyrighted training data.


This article is for informational purposes only and does not constitute legal advice. Last updated: April 2026