MiniMax M3 Just Killed Closed-Source Models

2026-06-04 · Source: Towards AI - Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation, Software Development & Engineering · Depth: Advanced, quick

Summary

MiniMax shipped its M3 model on June 1, 2026, positioning it as a frontier-class, open-weights coding model with a 1M-token context window. M3 is priced at approximately 5–10% of rivals such as GPT-5.5 and Gemini 3.1 Pro. This cost efficiency stems from a novel attention mechanism, MiniMax Sparse Attention (MSA), which challenges the assumption that next-token prediction over long sequences must process the entire past context. The author notes that most benchmarks are MiniMax's own and that the weights are not yet available for independent verification, so the analysis focuses on the mechanism and the reported numbers.

Key takeaway

For Machine Learning Engineers evaluating long-context models, MiniMax M3's introduction signals a significant shift towards more cost-effective open-weights solutions. You should investigate sparse attention architectures for your own model development, especially when aiming for large context windows without prohibitive inference costs. Consider M3 as a benchmark for future open-source coding models, and prepare to test its performance once weights become available.

Key insights

MiniMax M3 demonstrates that sparse attention can enable cost-effective, long-context, open-weights coding models.

Principles

Long context doesn't require full past attention.
Sparse attention improves inference speed and cost.
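
The two principles above can be illustrated with a small sketch. This is not MiniMax's MSA, whose internals are not described in this summary. It is a generic top-k sparse attention toy in plain Python, and the function names (attend, softmax, dot) and the top_k selection rule are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values, top_k=None):
    """Single-query scaled dot-product attention.

    top_k=None attends to every past position (dense attention).
    top_k=k keeps only the k highest-scoring positions. This selection
    rule is hypothetical and is NOT taken from MiniMax's MSA.
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, k) * scale for k in keys]
    positions = list(range(len(keys)))
    if top_k is not None and top_k < len(keys):
        # Keep the k best-scoring positions, then restore sequence order.
        positions = sorted(positions, key=lambda i: scores[i], reverse=True)[:top_k]
        positions.sort()
    # Softmax and value mixing run over the kept positions only.
    weights = softmax([scores[i] for i in positions])
    dim = len(values[0])
    output = [0.0] * dim
    for w, i in zip(weights, positions):
        for d in range(dim):
            output[d] += w * values[i][d]
    return output, positions


# Toy data: four past positions with 2-dimensional keys and values.
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]
query = [1.0, 0.0]

dense_out, dense_pos = attend(query, keys, values)
sparse_out, sparse_pos = attend(query, keys, values, top_k=2)
```

In this toy, every key is still scored, so only the softmax and the value mixing shrink from all positions to k. A practical sparse design would also need a cheaper way to choose which positions to keep, and how M3 does that is not covered here.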

In practice

Achieve frontier-class coding at 5–10% of rivals' price.
Utilize 1M-token context windows.
Explore open-weights alternatives.

Topics

MiniMax M3
Sparse Attention
Open-weights Models
Coding Models
Long Context
AI Inference Cost

Best for: CTO, VP of Engineering/Data, AI Engineer, AI Scientist, Machine Learning Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI - Medium.