Can coding agents relicense open source through a “clean room” implementation of code?

· Source: Simon Willison's Weblog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Legal & Regulatory · Depth: Advanced, medium

Summary

The `chardet` Python library, LGPL-licensed since 2006, recently released version 7.0.0 under the MIT license, describing it as a "ground-up rewrite." The move sparked a legal and ethical debate. Original author Mark Pilgrim asserts that the release violates the LGPL, arguing that maintainer Dan Blanchard's extensive exposure to the original code precludes a "clean room" implementation. Blanchard, who has maintained `chardet` since 2012, contends that the new code is structurally independent, citing JPlag results that show less than 1.3% similarity to previous versions. He describes generating the new version with Claude Code, starting in an empty repository and explicitly instructing the agent to avoid LGPL/GPL-licensed code. The case raises open questions about AI-assisted code rewrites, derivative works, and open-source relicensing.

Key takeaway

Engineering leaders evaluating AI-assisted code modernization or relicensing should have their teams document the rewrite process in detail, including AI prompts and isolation measures. Relying on an AI agent for a "clean room" implementation without strict human oversight and verifiable evidence of non-derivation, such as low JPlag similarity scores, carries significant legal and ethical risk, particularly where existing open-source licenses apply. Be prepared for potential litigation as commercial entities confront similar IP challenges.
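To make the idea of a similarity score concrete, here is a minimal sketch that compares two source files line by line using Python's standard `difflib`. It is a crude stand-in and does not reproduce JPlag's analysis; the function names `normalize` and `line_similarity` are hypothetical, and a low score from a check like this is not evidence of legal non-derivation.

```python
import difflib


def normalize(source: str) -> list[str]:
    """Strip surrounding whitespace; drop blank lines and '#' comment lines."""
    lines = []
    for raw in source.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def line_similarity(old_source: str, new_source: str) -> float:
    """Return a ratio between 0.0 and 1.0 of matching normalized lines.

    Assumption: line-level matching is a rough proxy only. Renamed
    identifiers or reordered statements will lower the score even when
    the structure is the same.
    """
    matcher = difflib.SequenceMatcher(
        None, normalize(old_source), normalize(new_source), autojunk=False
    )
    return matcher.ratio()
```

A team could run a check like this across every pair of old and new files as a first-pass signal, then rely on a dedicated tool and legal review for anything that matters.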

Key insights

AI-assisted code rewrites challenge traditional "clean room" definitions and open-source licensing compliance.

Method

A design document is created first. An AI agent then generates code in an isolated environment, under explicit instructions to avoid code under specific licenses, and a human reviews and iterates on the result.
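One way to document a process like this is an append-only audit log with one entry per step. The sketch below is illustrative only: `RewriteStep`, `audit_record`, and the field names are hypothetical and are not taken from the `chardet` project or from Claude Code.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RewriteStep:
    stage: str  # hypothetical labels, e.g. "design", "generation", "review"
    prompt: str  # exact text given to the agent, or the reviewer's notes
    reference_materials: tuple[str, ...]  # what the agent was allowed to see


def audit_record(step: RewriteStep) -> str:
    """Serialize one step as a JSON line with a SHA-256 digest of the prompt."""
    record = asdict(step)
    # The digest lets a later reader confirm the logged prompt was not edited.
    record["prompt_sha256"] = hashlib.sha256(
        step.prompt.encode("utf-8")
    ).hexdigest()
    return json.dumps(record, sort_keys=True)
```

Writing each returned line to a file kept under version control would give a timestamped trail of prompts and permitted inputs, which is the kind of record the takeaway above recommends keeping.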

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Software Engineer, Legal Professional

Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.