How Big AI Developers are Skirting a Mandate for Training Data Transparency

2026-03-04 · Source: Tech Policy Press · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, AI Governance & Ethics · Depth: Intermediate, short

Summary

The EU AI Act includes a crucial yet often neglected provision requiring developers of "general-purpose AI" models to publish a summary of their training data, a measure vital for copyright holders, privacy watchdogs, and researchers. New peer-reviewed research, supported by Mozilla, developed a framework for assessing these summaries and found that open-source developers such as Hugging Face (SmolLM) and Swiss AI (Apertus) meet or exceed the transparency standards, which shows that compliance is feasible. The most concerning finding is that leading AI developers, including OpenAI, Google, and xAI, have not published any such summaries, exploiting a legal gray area created by delayed enforcement powers. This non-compliance by the industry's largest players underscores the need for the EU AI Office to prepare for rigorous enforcement, both to ensure transparency and to keep smaller, compliant developers from being disadvantaged.

Key takeaway

Major AI developers such as OpenAI and Google are not complying with the EU AI Act's mandate to publish training data summaries, even though open-source projects have shown that compliance is feasible. New research finds that smaller projects such as Swiss AI's Apertus score highly under its assessment framework, while leading developers exploit an enforcement gap. This non-compliance hinders critical oversight by copyright holders, privacy watchdogs, and researchers, and calls for urgent enforcement by the EU AI Office to ensure accountability and fair competition.

Topics

AI Training Data Transparency
EU AI Act
General-Purpose AI Models
Regulatory Compliance
Open-Source AI

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Researcher, AI Engineer, Policy Maker

Editorial summary, takeaway, and curation by AIssential. Original article published by Tech Policy Press.