Microsoft built supercomputer to help OpenAI infringe copyrights, NYT alleged
Summary
The New York Times (NYT) has proposed amending its copyright complaint against OpenAI and Microsoft to allege that Microsoft actively encouraged infringement by building a bespoke supercomputing system. The move follows a Supreme Court ruling that set a new standard for contributory infringement, requiring plaintiffs to prove intentional inducement. The NYT seeks to align its claim against Microsoft with this precedent, alleging that the supercomputer was tailor-made to train AI on copyrighted works and that the training disproportionately weighted NYT articles. The original 2023 lawsuit claimed that ChatGPT was illegally trained on NYT content and produced verbatim outputs, that it caused market harm by substituting for subscriptions and costing the NYT affiliate commissions, and that false attributions damaged the NYT's reputation. The NYT now alleges that deploying "Times-trained LLMs" boosted Microsoft's market capitalization by a trillion dollars. Its evidence includes ChatGPT outputs that skirt paywalls and reproduce near-verbatim excerpts, along with hallucinations that falsely cite the NYT. OpenAI maintains that its training constitutes fair use, but the NYT's focus on market harms could undercut that defense, potentially leading to model wipes and extensive damages.
Key takeaway
For Directors of AI/ML or AI Product Managers building and deploying large language models, the proposed amendment signals heightened legal scrutiny of training data provenance and model output behavior. Document your data sourcing meticulously, especially for proprietary content, and audit model outputs rigorously for verbatim reproduction or for content that could substitute for the original works. Because plaintiffs must now prove intentional inducement to establish contributory infringement, transparent development practices and robust IP compliance are critical to limiting legal and financial risk.
Key insights
A new Supreme Court precedent requires plaintiffs to prove intentional inducement for contributory copyright infringement, and the NYT is reshaping its AI training claim against Microsoft to meet that standard.
Principles
- Contributory infringement requires proving intent.
- AI training on copyrighted works faces market harm scrutiny.
- Tailor-made systems can imply intent to infringe.
In practice
- Document AI training data sourcing.
- Analyze model outputs for verbatim content.
- Assess AI's market substitution potential.
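As a rough illustration of the second item, one way to flag possible verbatim reproduction is to measure how many of a model output's word n-grams also appear in a source text. This is a minimal sketch, not a method described in the complaint or the article: the function names `word_ngrams` and `verbatim_overlap`, the 8-word window, and any threshold you apply to the score are all assumptions, and a high score is a prompt for human review rather than a legal finding.

```python
import re


def word_ngrams(text, n):
    """Return the set of lowercase word n-grams in text (punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def verbatim_overlap(output, source, n=8):
    """Fraction of the output's word n-grams that also occur in the source.

    Hypothetical helper for illustration: n=8 is an assumed window size,
    not a recognized standard for detecting copying.
    """
    out_grams = word_ngrams(output, n)
    if not out_grams:
        # Output is shorter than n words, so there is nothing to compare.
        return 0.0
    return len(out_grams & word_ngrams(source, n)) / len(out_grams)
```

A score near 1.0 means almost every n-word run in the output also appears in the source. Scores in between depend heavily on the window size, so the review threshold is best calibrated on known examples.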
Topics
- Copyright Infringement
- AI Training Data
- Large Language Models
- Contributory Infringement
- OpenAI Litigation
- Microsoft Supercomputer
Best for: CTO, Executive, VP of Engineering/Data, Legal Professional, Director of AI/ML, AI Product Manager
Editorial summary, takeaway, and curation by AIssential. Original article published by AI - Ars Technica.