Stability AI launches Stable Audio 3.0 with up to six-minute tracks and open weights
Summary
Stability AI has launched Stable Audio 3.0, a new generation of audio models, with three of its four variants released with open weights. The models generate music tracks more than six minutes long and were trained entirely on licensed data. The family includes Stable Audio 3.0 Small SFX and Small (459 million parameters, 2-minute tracks, 0.44 seconds of inference on an H200 GPU) and Medium (1.4 billion parameters, tracks of 6 minutes 20 seconds, 1.31 seconds of inference). The largest model, Stable Audio 3.0 Large (2.7 billion parameters), is available only to API users and enterprise customers. The new architecture uses a semantic-acoustic autoencoder that supports variable-length output and inpainting. Stability AI emphasizes its licensed training data and offers legal indemnification for enterprise users, setting itself apart from competitors facing copyright lawsuits, such as those involving OpenAI, Suno, and Udio. Commercial use is free for organizations with under $1 million in annual revenue.
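To put the quoted speeds in perspective, the short sketch below computes how many seconds of audio each open-weight model would produce per second of compute. The parameter counts, track lengths, and inference times come from the summary; pairing each inference time with the full track length is an assumption, since the summary does not say what length was generated in the benchmark.

```python
# Figures are taken from the summary. Treating each inference time as the time
# to generate a full-length track is an ASSUMPTION made for illustration.
MODELS = {
    "Small": {"params_millions": 459, "track_s": 2 * 60, "inference_s": 0.44},
    "Medium": {"params_millions": 1400, "track_s": 6 * 60 + 20, "inference_s": 1.31},
}


def realtime_factor(track_seconds: float, inference_seconds: float) -> float:
    """Return seconds of audio produced per second of compute."""
    return track_seconds / inference_seconds


for name, spec in MODELS.items():
    factor = realtime_factor(spec["track_s"], spec["inference_s"])
    print(f"{name}: roughly {factor:.0f}x faster than real time")
```

Under that assumption, both models land in the same range of a few hundred times real time on an H200, so the larger model's longer runtime is roughly proportional to its longer output.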
Key takeaway
For AI engineers and music producers evaluating generative audio tools, Stable Audio 3.0 is a legally lower-risk option. Organizations with under $1 million in annual revenue can use the open-weight models commercially without licensing fees. The licensed training data and enterprise indemnification are the main differentiators, because they reduce the copyright-infringement risk associated with other platforms. LoRA fine-tuning lets you customize the models with your own audio libraries.
Key insights
Stability AI's new audio models prioritize licensed data and legal indemnification to mitigate copyright risks.
Principles
- Licensed data reduces legal exposure.
- Open weights foster broader adoption.
- Enterprise indemnification adds value.
Method
The new semantic-acoustic autoencoder architecture enables variable-length audio generation with second-level control and inpainting features for editing and extending tracks.
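As a rough illustration of what editing and extending a track with second-level control could look like, the sketch below builds a keep/regenerate mask over a timeline of latent frames. This is not Stability AI's implementation: the function name, the mask representation, and the latent frame rate of 25 frames per second are all hypothetical, chosen only to show the general idea of mask-based inpainting along a time axis.

```python
import math

# HYPOTHETICAL latent frame rate; the real model's rate is not given in the source.
LATENT_FPS = 25


def inpaint_mask(total_s: float, regen_start_s: float, regen_end_s: float,
                 fps: int = LATENT_FPS) -> list[bool]:
    """Build a per-frame mask: True = regenerate this frame, False = keep it."""
    if not 0 <= regen_start_s < regen_end_s <= total_s:
        raise ValueError("expected 0 <= start < end <= total")
    n_frames = math.ceil(total_s * fps)
    first = math.floor(regen_start_s * fps)  # first frame to regenerate
    last = math.ceil(regen_end_s * fps)      # one past the last frame to regenerate
    return [first <= i < last for i in range(n_frames)]


# Editing: regenerate seconds 10 to 15 of a 30-second clip, keep the rest.
edit_mask = inpaint_mask(30, 10, 15)

# Extending: keep an existing 30-second clip and generate 15 new seconds after it.
extend_mask = inpaint_mask(45, 30, 45)
```

In this framing, extending a track is the same operation as editing one: the existing audio is the kept region and the new tail is the region to fill.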
In practice
- Fine-tune models with LoRA on your own audio libraries.
- Generate sound effects on mobile devices.
- Edit specific segments of generated audio.
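For the LoRA item above, a quick calculation shows why low-rank fine-tuning is practical on modest hardware. LoRA freezes a weight matrix of shape d_out x d_in and trains two small factors, B (d_out x rank) and A (rank x d_in), whose product is added to the frozen weights. The layer size and rank below are hypothetical and are not taken from Stable Audio 3.0.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape d_out x d_in."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# HYPOTHETICAL layer size and rank, chosen only for illustration.
d_in = d_out = 1024
rank = 8

dense = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
print(f"dense: {dense:,} params, LoRA: {lora:,} params ({100 * lora / dense:.2f}% of dense)")
```

With these example numbers, the trainable factors are under 2% of the size of the layer they adapt, which is why a LoRA adapter trained on a private audio library can be stored and shared separately from the base weights.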
Topics
- Stable Audio 3.0
- Generative Audio
- Open Weights
- Copyright Compliance
- AI Music Generation
- Enterprise AI Licensing
Best for: CTO, Machine Learning Engineer, AI Product Manager, AI Engineer, Director of AI/ML, Legal Professional
Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.