Britannica and Merriam-Webster: OpenAI is turning the web’s best reference content into an answer engine that (1) copies, (2) competes, and (3) sometimes lies...
Summary
Britannica and Merriam-Webster have filed a lawsuit against OpenAI, alleging both copyright infringement and trademark violations. The complaint asserts that OpenAI copied their copyrighted content for training GPT models and for retrieval-augmented generation (RAG) at runtime, leading to verbatim or near-verbatim outputs and derivative works. Furthermore, the plaintiffs claim OpenAI's ChatGPT generates made-up material attributed to their brands, uses partial reproductions without disclosing omissions, and cannibalizes their traffic and revenue. They emphasize OpenAI's alleged willful disregard for licensing, despite a known market. The lawsuit presents strong evidence through concrete output exhibits and registered copyrights, while relying on inference for training corpus provenance. The case is strategically built to argue substitution harm and weaponize trademark claims related to hallucinations and omissions.
Key takeaway
CTOs and VPs of Engineering building AI products should have their teams prioritize robust anti-memorization safeguards and transparent UI design. Displaying publisher brands alongside AI-generated content, especially content that contains hallucinations or undisclosed omissions, creates significant trademark liability. Proactive licensing and careful compliance with terms of use and robots.txt are now critical components of a litigation defense strategy.
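As a rough illustration of the robots.txt point, the sketch below checks whether a crawler may fetch a URL before any content is collected. It uses Python's standard `urllib.robotparser`. The robots.txt body, the `ExampleAIBot` user agent, and the `is_fetch_allowed` helper are hypothetical, made up for this example. A passing check says nothing about terms of use or licensing.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body for illustration; a real crawler would
# download this file from the publisher's site before fetching pages.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_fetch_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some-article"
    for agent in ("ExampleAIBot", "OtherBot"):
        print(agent, is_fetch_allowed(SAMPLE_ROBOTS_TXT, agent, url))
```

Logging each decision with a timestamp is one way to keep a record of compliance, though what a court would accept as evidence is a question for counsel.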
Key insights
The lawsuit against OpenAI combines copyright and trademark claims, focusing on content copying, brand confusion, and market substitution.
Principles
- Verbatim outputs strengthen copyright infringement claims.
- Trademark claims are potent when AI hallucinates using trusted brands.
- Market harm narratives resonate with courts.
Method
Plaintiffs should present side-by-side output comparisons, plead multiple infringement moments (training, retrieval, output), and frame AI as a market substitute to strengthen legal arguments.
In practice
- Document anti-memorization testing and output filters.
- Design RAG for minimal retention and clear permissions.
- Use clear UI cues for partial or uncertain AI answers.
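One possible shape for the output filter mentioned above is a word n-gram overlap check between a model answer and a set of protected passages. This is a minimal sketch under stated assumptions, not a method described in the article or the complaint: the function names, the 8-word window, and the 0.5 threshold are all illustrative choices that would need tuning and legal review.

```python
import re


def word_ngrams(text: str, n: int) -> set:
    """Return the set of lowercase word n-grams in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(output: str, protected: str, n: int = 8) -> float:
    """Fraction of the output's n-grams that also occur in the protected text."""
    output_grams = word_ngrams(output, n)
    if not output_grams:
        return 0.0  # output shorter than n words: nothing to compare
    shared = output_grams & word_ngrams(protected, n)
    return len(shared) / len(output_grams)


def label_answer(output: str, protected_passages: list,
                 n: int = 8, threshold: float = 0.5) -> str:
    """Return 'blocked' if any passage overlaps at or above threshold, else 'ok'."""
    worst = max(
        (overlap_ratio(output, passage, n) for passage in protected_passages),
        default=0.0,
    )
    return "blocked" if worst >= threshold else "ok"
```

A filter like this only catches verbatim or near-verbatim reuse; close paraphrase would pass through. Storing the score alongside each answer also gives teams a record of the anti-memorization testing, and the same score could drive a UI cue when an answer is partial or uncertain.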
Topics
- AI Copyright Litigation
- Large Language Models
- Retrieval-Augmented Generation
- Trademark Infringement
- Content Licensing
Best for: CTO, VP of Engineering/Data, Director of AI/ML, Legal Professional, AI Engineer, AI Product Manager
Editorial summary, takeaway, and curation by AIssential. Original article published by Pascal’s Substack.