YuanLab AI Releases Yuan 3.0 Ultra: A Flagship Multimodal MoE Foundation Model, Built for Stronger Intelligence and Unrivaled Efficiency
Summary
YuanLab AI has released Yuan 3.0 Ultra, a trillion-parameter open-source multimodal Mixture-of-Experts (MoE) foundation model. The model cuts total parameters by 33.3%, from 1.5T to 1T, and improves pre-training efficiency by 49%. Both gains are attributed to its Layer-Adaptive Expert Pruning (LAEP) algorithm, which prunes underutilized experts during pre-training. An Expert Rearranging algorithm then minimizes device-level token variance, yielding a computational throughput of 92.6 TFLOPS per GPU. Yuan 3.0 Ultra also incorporates a Reflection Inhibition Reward Mechanism (RIRM) to promote concise reasoning, and it reports leading accuracy on enterprise benchmarks such as Docmatix (67.4%), ChatRAG (68.2%), and SummEval (62.8%).
Key takeaway
For AI architects evaluating large-scale foundation models, Yuan 3.0 Ultra is worth a look for its 33.3% parameter reduction and 49% gain in pre-training efficiency. Its expert pruning and rearranging algorithms target training cost and throughput, while the RIRM targets more concise reasoning. Weigh its reported benchmark results and open-source availability when choosing a model for your next multimodal MoE deployment.
Key insights
Yuan 3.0 Ultra is a trillion-parameter MoE model optimized for efficiency and concise reasoning.
Principles
- Prune underutilized experts for efficiency.
- Minimize token variance for throughput.
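The first principle can be illustrated with a toy sketch. This is not the LAEP algorithm itself, whose details this summary does not give. It assumes, purely for illustration, that "underutilized" means an expert receives less than a fixed fraction of the uniform token share within its layer. The function name experts_to_keep, the min_share threshold, and the routing counts are all hypothetical.

```python
import numpy as np

def experts_to_keep(token_counts, min_share=0.5):
    """Return the indices of experts to keep in a single layer.

    token_counts: tokens routed to each expert in this layer.
    min_share: hypothetical threshold, expressed as a fraction of the
               uniform share (1 / num_experts). Experts below it are pruned.
    """
    counts = np.asarray(token_counts, dtype=float)
    shares = counts / counts.sum()          # fraction of tokens per expert
    uniform_share = 1.0 / counts.size       # share under perfect balance
    return np.flatnonzero(shares >= min_share * uniform_share).tolist()

# Made-up routing statistics for two layers (illustrative only).
routing_stats = {
    "layer_0": [400, 380, 20, 10, 390, 300],   # two experts are rarely used
    "layer_1": [250, 260, 240, 250, 255, 245], # load is already balanced
}

# The decision is made per layer, so each layer can keep a different number
# of experts.
kept = {layer: experts_to_keep(counts) for layer, counts in routing_stats.items()}
for layer, indices in kept.items():
    print(layer, "keeps experts", indices)
```

With these made-up counts, the skewed layer loses its two rarely used experts while the balanced layer keeps all six, which is the sense in which a per-layer rule adapts to each layer's routing pattern.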
Method
The model uses Layer-Adaptive Expert Pruning (LAEP) and Expert Rearranging algorithms for efficiency, alongside a Reflection Inhibition Reward Mechanism (RIRM) for concise reasoning.
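To make "minimizing device-level token variance" concrete, here is a minimal sketch of one way to place experts on devices so that per-device token load is even. It uses a standard greedy heuristic (heaviest expert first, onto the least-loaded device) as a stand-in. The source does not describe how Expert Rearranging works, so the function rearrange_experts, the loads, and the device count are assumptions for illustration only.

```python
import heapq
from statistics import pvariance

def rearrange_experts(expert_loads, num_devices):
    """Greedy placement: heaviest expert first, onto the least-loaded device.

    expert_loads: tokens routed to each expert (hypothetical numbers).
    Returns (placement, device_loads), where placement maps each device
    to the list of expert indices assigned to it.
    """
    heap = [(0, device) for device in range(num_devices)]  # (load, device)
    heapq.heapify(heap)
    placement = {device: [] for device in range(num_devices)}
    by_load = sorted(range(len(expert_loads)), key=lambda e: -expert_loads[e])
    for expert in by_load:
        load, device = heapq.heappop(heap)       # least-loaded device so far
        placement[device].append(expert)
        heapq.heappush(heap, (load + expert_loads[expert], device))
    device_loads = [
        sum(expert_loads[e] for e in placement[d]) for d in range(num_devices)
    ]
    return placement, device_loads

# Made-up token counts for 8 experts spread over 4 devices.
loads = [900, 800, 700, 600, 400, 300, 200, 100]

# Baseline: experts stored contiguously, two per device.
contiguous = [sum(loads[i:i + 2]) for i in range(0, len(loads), 2)]

placement, balanced = rearrange_experts(loads, num_devices=4)
print("contiguous variance:", pvariance(contiguous))
print("rearranged variance:", pvariance(balanced))
```

A production system would also have to respect constraints this sketch ignores, such as a fixed number of experts per device and the communication cost of moving expert weights.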
In practice
- Achieves 92.6 TFLOPS per GPU.
- Reduces total parameters by 33.3%, from 1.5T to 1T.
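The 33.3% figure follows directly from the parameter counts given in the summary (1.5T before pruning, 1T after). A quick check, with variable names chosen only for this example:

```python
# Parameter counts as stated in the summary.
params_before = 1.5e12  # 1.5T
params_after = 1.0e12   # 1T

# Relative reduction = removed parameters / original parameters.
reduction = (params_before - params_after) / params_before
print(f"Parameter reduction: {reduction:.1%}")
```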
Topics
- Yuan 3.0 Ultra
- Mixture-of-Experts
- Model Efficiency
- Multimodal Foundation Model
- Expert Pruning
Best for: AI Architect, NLP Engineer, Computer Vision Engineer, AI Engineer, Machine Learning Engineer, AI Researcher
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning ML & Generative AI News.