Stability AI launches Stable Audio 3.0 with up to six-minute tracks and open weights

2026-05-20 · Source: The Decoder · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Intermediate, short

Summary

Stability AI has launched Stable Audio 3.0, a new generation of audio models, with three of its four variants available as open weights. The models generate music tracks up to roughly six minutes long and were trained entirely on licensed data. The open-weight variants are Stable Audio 3.0 Small SFX and Small (459 million parameters, tracks up to 2 minutes, 0.44 seconds of inference on an H200 GPU) and Medium (1.4 billion parameters, tracks up to 6 minutes 20 seconds, 1.31 seconds of inference). The largest model, Stable Audio 3.0 Large (2.7 billion parameters), is available only to API users and enterprise customers. The new architecture uses a semantic-acoustic autoencoder that supports variable-length output and inpainting. Stability AI emphasizes its licensed training data and offers legal indemnification to enterprise users, setting itself apart from companies facing copyright lawsuits, such as OpenAI, Suno, and Udio. Commercial use is free for organizations with under \$1 million in annual revenue.
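To put the quoted latencies in perspective, the short sketch below converts them into a real-time factor (seconds of audio per second of compute). It assumes each quoted inference time covers a full-length track for that model, which the summary does not state explicitly; the dictionary layout and function name are illustrative only.

```python
# Real-time factor = seconds of audio produced per second of compute.
# Track lengths and latencies are the figures quoted in the summary.
# Assumption: each latency is the time to render one full-length track.
MODELS = {
    "Small": {"max_track_s": 2 * 60, "inference_s": 0.44},
    "Medium": {"max_track_s": 6 * 60 + 20, "inference_s": 1.31},
}


def real_time_factor(max_track_s: float, inference_s: float) -> float:
    """Return how many seconds of audio are generated per second of inference."""
    return max_track_s / inference_s


for name, spec in MODELS.items():
    rtf = real_time_factor(spec["max_track_s"], spec["inference_s"])
    print(f"{name}: about {rtf:.0f}x faster than real time")
```

Under that assumption, both models land in the range of a few hundred times faster than real time on an H200.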

Key takeaway

For AI engineers or music producers evaluating generative audio tools, Stable Audio 3.0 offers a compelling, legally safer option. Organizations with under \$1 million in annual revenue can use the open-weight models commercially without licensing fees. Treat the licensed training data and enterprise indemnification as the key differentiator: they reduce the copyright-infringement risk attached to other platforms. The models can also be customized on your own audio libraries through LoRA fine-tuning.

Key insights

Stability AI's new audio models prioritize licensed data and legal indemnification to mitigate copyright risks.

Principles

Licensed data reduces legal exposure.
Open weights foster broader adoption.
Enterprise indemnification adds value.

Method

The new semantic-acoustic autoencoder architecture enables variable-length audio generation with second-level control and inpainting features for editing and extending tracks.
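One way to picture inpainting with time-based control is a mask over latent frames that marks which part of a track is regenerated and which part is kept. The sketch below is a generic illustration, not Stability AI's API: the frame rate `LATENT_FPS` and the function `inpaint_mask` are hypothetical, and it reads "second-level control" as control specified in seconds.

```python
import math
from typing import List, Tuple

# Hypothetical latent frame rate; the summary gives no real value for Stable Audio 3.0.
LATENT_FPS = 25


def inpaint_mask(
    duration_s: float,
    regions: List[Tuple[float, float]],
    fps: int = LATENT_FPS,
) -> List[bool]:
    """Return one flag per latent frame: True = regenerate, False = keep."""
    n_frames = round(duration_s * fps)
    mask = [False] * n_frames
    for start_s, end_s in regions:
        if not 0 <= start_s < end_s <= duration_s:
            raise ValueError(f"region {(start_s, end_s)} is outside the track")
        first = int(start_s * fps)
        last = min(n_frames, math.ceil(end_s * fps))
        for i in range(first, last):
            mask[i] = True
    return mask


# Regenerate seconds 30 to 45 of a 2-minute track and leave the rest untouched.
mask = inpaint_mask(120, [(30, 45)])
print(sum(mask), "of", len(mask), "frames will be regenerated")
```

Extending a track fits the same picture: lengthen the total duration and mark only the new tail as a region to generate.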

In practice

Fine-tune models with LoRA by following the documentation.
Generate sound effects on mobile devices.
Edit specific segments of generated audio.
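For readers unfamiliar with LoRA, the sketch below shows the general technique: the pretrained weight matrix W stays frozen and only two small low-rank factors, A and B, are trained, so the effective weight becomes W + (alpha / rank) * B A. This is generic LoRA in plain Python, not Stability AI's fine-tuning code; the layer size in the example is hypothetical.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(M: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x: Vector, W: Matrix, A: Matrix, B: Matrix, alpha: float, rank: int) -> Vector:
    """Compute y = W x + (alpha / rank) * B (A x); W is frozen, A and B are trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical 1024 x 1024 layer adapted at rank 8.
full = 1024 * 1024
lora = lora_trainable_params(1024, 1024, rank=8)
print(f"trainable parameters: {lora} of {full} ({100 * lora / full:.2f}%)")
```

The small share of trainable parameters is what makes adapting a model to a private audio library practical on modest hardware.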

Topics

Stable Audio 3.0
Generative Audio
Open Weights
Copyright Compliance
AI Music Generation
Enterprise AI Licensing

Best for: CTO, Machine Learning Engineer, AI Product Manager, AI Engineer, Director of AI/ML, Legal Professional

Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.