Kleiner v. Adobe is another step in a pattern: the legal system is increasingly treating “training data governance” as a compliance domain, not a research footnote.
Summary
The "Kleiner v. Adobe SlimLM" lawsuit, filed February 9, 2026, in the Northern District of California, is a proposed class action alleging that Adobe Inc. trained its SlimLM small language models on large-scale, unlicensed copies of copyrighted books, including a registered work by author Arthur Kleiner. The complaint advances a "dataset supply chain" theory of infringement: SlimLM was allegedly trained on SlimPajama-627B, a dataset derived from RedPajama, which in turn allegedly incorporated "Books3," a corpus associated with pirated books from shadow libraries. The suit alleges direct copyright infringement under 17 U.S.C. § 501 and seeks damages, attorneys' fees, injunctive relief, and destruction of infringing copies under 17 U.S.C. § 503(b). The case also contrasts Adobe's "ethical AI" marketing with its alleged reliance on known unlicensed datasets.
Key takeaway
For CTOs and VPs of Engineering integrating language models, this lawsuit signals that you cannot outsource legal risk to open dataset supply chains. Your teams need data governance and provenance tracking for all training data, especially for models that are commercialized. Verifying the licensing and origin of each dataset, including upstream sources such as Books3, before training is crucial to mitigating copyright infringement liability and reputational damage.
Key insights
AI training data governance is evolving into a critical legal compliance domain, not merely a research consideration.
Principles
- Dataset supply chain liability is emerging.
- Fair use defense is fact-intensive.
- Piracy-tainted sources increase litigation risk.
In practice
- Audit dataset provenance rigorously.
- Document internal data governance.
- Filter out known pirated content.
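
A provenance audit of this kind can be sketched as a walk over a dataset lineage graph that reports every derivation path leading to a flagged source. This is a minimal illustration, not a compliance tool: the `LINEAGE` map, the `FLAGGED_SOURCES` denylist, and the `flagged_paths` helper are hypothetical names, and the chain shown simply mirrors the one alleged in the complaint rather than a verified record.

```python
from collections import deque

# Hypothetical lineage map: dataset -> datasets it was derived from.
# The chain mirrors the allegations in the complaint; it is not a verified record.
LINEAGE = {
    "SlimPajama-627B": ["RedPajama"],
    "RedPajama": ["Books3"],
    "Books3": [],
}

# Hypothetical denylist of upstream sources flagged by legal review.
FLAGGED_SOURCES = {"Books3"}


def flagged_paths(dataset, lineage, flagged):
    """Return every derivation path from `dataset` to a flagged source."""
    paths = []
    queue = deque([[dataset]])
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current in flagged:
            paths.append(path)
        for parent in lineage.get(current, []):
            if parent not in path:  # guard against cycles in the lineage map
                queue.append(path + [parent])
    return paths


if __name__ == "__main__":
    for path in flagged_paths("SlimPajama-627B", LINEAGE, FLAGGED_SOURCES):
        print(" <- ".join(path))
```

Recording the full path, rather than a yes/no flag, gives legal and engineering teams the derivation chain they would need to document when deciding whether to filter, replace, or license a source.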
Topics
- AI Training Data
- Copyright Infringement
- Fair Use
- Data Governance
- Small Language Models
Best for: CTO, VP of Engineering/Data, Executive, Legal Professional, AI Ethicist, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Pascal’s Substack.