Should AI Be Open Source?

2025-08-21 · Source: Jordan Harrod · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Intermediate, long

Summary

OpenAI recently released its first open-source models in five years, the gpt-oss series, following similar moves by Meta with its Llama series (February 2023) and Google with Gemma (February 2024). Sam Altman acknowledged in January 2025 that the shift corrects a historical misstep, and it marks a departure from OpenAI's proprietary releases such as GPT-3 and GPT-4. The change was likely influenced by DeepSeek's open-source reasoning model, released in December 2024, whose app overtook ChatGPT on the iOS App Store. However, the article highlights a critical distinction: unlike in traditional open-source software, "open source" in AI often refers only to open weights, not the full code, data, or training methodology. This raises concerns about data attribution, copyright, and who truly benefits from the "democratization" of AI, especially since major AI companies and 73% of Fortune 500 companies rely on underlying open-source machine learning libraries.

Key takeaway

Directors of AI/ML evaluating "open source" AI models should recognize that the term often refers only to open weights, not the full training data or code. This distinction is crucial for understanding intellectual property risks, data provenance, and the true extent of model transparency. Rather than assuming traditional open-source freedoms, scrutinize the specific components made available to assess compliance, potential liabilities, and whether the underlying system can actually be audited or modified.

Key insights

AI's "open source" often means open weights, not full transparency, complicating traditional open-source principles.

Principles

Traditional open source requires code access for modification and redistribution.
AI "open source" typically provides only model weights, not training data or methodology.
AI models built on public data raise significant copyright and attribution challenges.

In practice

Understand the distinction between open-source code and open-source AI model weights.
Evaluate AI models for true transparency beyond just weight availability.
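
One way to make that evaluation concrete is a simple component checklist. The sketch below is illustrative only: the class, field names, and the three labels are hypothetical and not taken from the article or from any formal open-source definition. It records which parts of a release are actually published and reports what is missing.

```python
from dataclasses import dataclass


@dataclass
class ModelRelease:
    """Hypothetical record of what a model release actually publishes."""
    name: str
    weights: bool = False
    training_code: bool = False
    training_data: bool = False
    methodology: bool = False
    license_allows_modification: bool = False
    license_allows_redistribution: bool = False


def missing_components(release: ModelRelease) -> list[str]:
    """List the components a traditional open-source release would include
    but this release does not."""
    checks = {
        "model weights": release.weights,
        "training code": release.training_code,
        "training data": release.training_data,
        "training methodology": release.methodology,
        "right to modify": release.license_allows_modification,
        "right to redistribute": release.license_allows_redistribution,
    }
    return [label for label, present in checks.items() if not present]


def classify(release: ModelRelease) -> str:
    """Assign a rough label (assumed for illustration, not a standard)."""
    if not missing_components(release):
        return "open source in the traditional sense"
    if release.weights:
        return "open weights"
    return "closed"


# Example: a release that ships weights under a permissive license
# but withholds training code, data, and methodology.
example = ModelRelease(
    name="example-model",
    weights=True,
    license_allows_modification=True,
    license_allows_redistribution=True,
)
print(classify(example))
print(missing_components(example))
```

Filling in a record like this for each candidate model shows quickly whether "open source" in a vendor announcement means more than downloadable weights.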

Topics

OpenAI gpt-oss
Open-Source AI Definition
AI Model Weights
Copyright Challenges
Data Privacy

Best for: AI Ethicist, Legal Professional, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Jordan Harrod.