Stability AI launches Stable Audio 3.0 with up to six-minute tracks and open weights

Source: The Decoder · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation) · Depth: Intermediate, short

Summary

Stability AI has launched Stable Audio 3.0, a new generation of audio models, with three of its four variants released as open weights. The family generates music tracks up to about six minutes long and was trained entirely on licensed data. It includes Stable Audio 3.0 Small SFX and Small (459 million parameters, 2-minute tracks, 0.44 seconds inference on an H200 GPU) and Medium (1.4 billion parameters, tracks of 6 minutes 20 seconds, 1.31 seconds inference). The largest model, Stable Audio 3.0 Large (2.7 billion parameters), is available only to API users and enterprise customers. The new architecture uses a semantic-acoustic autoencoder that supports variable-length output and inpainting. Stability AI emphasizes its licensed training data and offers legal indemnification to enterprise users, setting itself apart from competitors facing copyright lawsuits, such as those involving OpenAI, Suno, and Udio. Commercial use is free for organizations with under $1 million in annual revenue.
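To put the quoted inference times in perspective, the short sketch below computes a rough real-time factor (seconds of audio per second of compute) for the two open-weight sizes with published figures. It assumes, which the summary does not confirm, that each inference time refers to generating one track of the stated maximum length on an H200 GPU. The `Variant` class and its field names are illustrative only and not part of any Stability AI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Figures quoted in the summary; the structure itself is hypothetical."""
    name: str
    params_millions: int
    track_seconds: int        # stated track length, in seconds
    inference_seconds: float  # stated inference time on an H200 GPU

    def real_time_factor(self) -> float:
        # ASSUMPTION: the inference time covers one full-length track.
        return self.track_seconds / self.inference_seconds


VARIANTS = [
    Variant("Small", 459, 2 * 60, 0.44),          # 2-minute tracks
    Variant("Medium", 1400, 6 * 60 + 20, 1.31),   # 6:20 tracks
]

for v in VARIANTS:
    print(f"{v.name}: roughly {v.real_time_factor():.0f}x faster than real time")
```

If the assumption holds, both sizes generate audio a few hundred times faster than playback speed. If the timings were measured on shorter clips, the real factor would be lower.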

Key takeaway

For AI engineers or music producers evaluating generative audio tools, Stable Audio 3.0 offers a compelling, legally safer option. You can use its open-weight models in commercial projects without licensing fees if your organization earns under $1 million in annual revenue. Treat its licensed training data and enterprise indemnification as a key differentiator that reduces the copyright infringement risk associated with other platforms. Explore the LoRA fine-tuning capabilities to customize models with your own audio libraries.

Key insights

Stability AI's new audio models prioritize licensed data and legal indemnification to mitigate copyright risks.

Method

The new semantic-acoustic autoencoder architecture enables variable-length audio generation with second-level control and inpainting features for editing and extending tracks.

Best for: CTO, Machine Learning Engineer, AI Product Manager, AI Engineer, Director of AI/ML, Legal Professional

Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.