Britannica and Merriam-Webster: OpenAI is turning the web’s best reference content into an answer engine that (1) copies, (2) competes, and (3) sometimes lies...

2025-11-28 · Source: Pascal’s Substack · Field: Legal & Regulatory — Intellectual Property & Patents, Litigation & Dispute Resolution, Compliance & Risk Management · Depth: Advanced, medium

Summary

Britannica and Merriam-Webster have filed a lawsuit against OpenAI, alleging both copyright infringement and trademark violations. The complaint asserts that OpenAI copied their copyrighted content to train GPT models and again at runtime for retrieval-augmented generation (RAG), producing verbatim or near-verbatim outputs and derivative works. The plaintiffs further claim that ChatGPT generates fabricated material attributed to their brands, presents partial reproductions without disclosing omissions, and cannibalizes their traffic and revenue. They emphasize OpenAI's alleged willful disregard for licensing despite a known licensing market. The complaint's strongest evidence is its concrete output exhibits and registered copyrights; its allegations about training corpus provenance rest on inference. The case is strategically built to argue substitution harm and to weaponize trademark claims tied to hallucinations and omissions.

Key takeaway

CTOs and VPs of Engineering building AI products should have their teams prioritize robust anti-memorization safeguards and transparent UI design. Displaying publisher brands alongside AI-generated content, especially content that contains hallucinations or undisclosed omissions, creates significant trademark liability. Proactive licensing strategies and meticulous compliance with terms of use and robots.txt are now critical components of a litigation defense strategy.
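
As one illustration of the robots.txt point, a crawler or retrieval component can check a site's rules before fetching a page. The sketch below uses Python's standard urllib.robotparser module; the bot name ExampleAIBot, the sample rules, and the helper is_fetch_allowed are hypothetical, and passing this check does not by itself establish compliance with a site's terms of use or with copyright law.

```python
from urllib.robotparser import RobotFileParser


def is_fetch_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: block one AI crawler, allow everyone else.
EXAMPLE_ROBOTS = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    page = "https://example.com/dictionary/word"
    print(is_fetch_allowed(EXAMPLE_ROBOTS, "ExampleAIBot", page))
    print(is_fetch_allowed(EXAMPLE_ROBOTS, "OtherBot", page))
```

Logging each decision with a timestamp would give a team a record of what the crawler was permitted to fetch at the time, which is the kind of documentation the takeaway recommends.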

Key insights

The lawsuit against OpenAI combines copyright and trademark claims, focusing on content copying, brand confusion, and market substitution.

Principles

Verbatim outputs strengthen copyright infringement claims.
Trademark claims are potent when AI hallucinates using trusted brands.
Market harm narratives resonate with courts.

Method

Plaintiffs should present side-by-side output comparisons, plead multiple infringement moments (training, retrieval, output), and frame AI as a market substitute to strengthen legal arguments.

In practice

Document anti-memorization testing and output filters.
Design RAG for minimal retention and clear permissions.
Use clear UI cues for partial or uncertain AI answers.
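
A minimal sketch of what an output filter for the first item might look like, assuming a simple word n-gram overlap check against a set of protected reference texts. The function names, the n-gram length of 8, and the 0.2 threshold are illustrative assumptions, not values taken from the complaint or from any vendor's system; a production filter would need tuning and a far more scalable index.

```python
import re


def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Lowercase the text, split it into words, and return the set of word n-grams."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(output: str, reference: str, n: int = 8) -> float:
    """Fraction of the output's n-grams that also appear in the reference text."""
    out_grams = word_ngrams(output, n)
    if not out_grams:
        return 0.0  # output shorter than n words: nothing to compare
    return len(out_grams & word_ngrams(reference, n)) / len(out_grams)


def guard_output(output: str, references: list[str],
                 n: int = 8, threshold: float = 0.2) -> dict:
    """Flag an output whose overlap with any reference meets or exceeds the threshold."""
    worst = max((overlap_ratio(output, ref, n) for ref in references), default=0.0)
    return {"allowed": worst < threshold, "max_overlap": worst}


if __name__ == "__main__":
    protected = ["the quick brown fox jumps over the lazy dog near the river"]
    print(guard_output(protected[0], protected, n=4))
    print(guard_output("an unrelated answer written in entirely new words", protected, n=4))
```

Recording the max_overlap score for every response, including the ones that pass, is one way to produce the documented testing trail that the first item calls for.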

Topics

AI Copyright Litigation
Large Language Models
Retrieval-Augmented Generation
Trademark Infringement
Content Licensing

Best for: CTO, VP of Engineering/Data, Director of AI/ML, Legal Professional, AI Engineer, AI Product Manager

Editorial summary, takeaway, and curation by AIssential. Original article published by Pascal’s Substack.