New MiniMax M2.7 proprietary AI model is 'self-evolving' and can perform 30-50% of reinforcement learning research workflow

2026-03-18 · Source: VentureBeat · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Software Development & Engineering · Depth: Advanced, medium

Summary

MiniMax, a Chinese AI startup, has released its new proprietary large language model (LLM), M2.7, which features "self-evolving" capabilities. This model can autonomously manage 30-50% of its own reinforcement learning research workflow, including data pipelines, training environments, and evaluation infrastructure, by analyzing failure trajectories and planning code modifications over iterative loops. M2.7 is designed for AI agents and third-party tools, offering intelligence comparable to Google's Gemini 3.1 and Anthropic's Claude Opus 4.6 in autonomous research tasks, achieving a 66.6% medal rate on MLE Bench Lite. It also demonstrates significant performance gains over its predecessor, M2.5, in software engineering (56.22% on SWE-Pro), professional office tasks (Elo score 1495 on GDPval-AA), and hallucination reduction (34% rate). The model is available via API and third-party providers like OpenRouter, priced at $0.30 per 1 million input tokens and $1.20 per 1 million output tokens, making it one of the most affordable frontier AI models globally.

Key takeaway

For CTOs and VP of Engineering evaluating AI investments, MiniMax M2.7 signals a shift towards self-evolving models that can significantly reduce operational costs and accelerate development cycles. Your teams should consider integrating M2.7 for its demonstrated ability to autonomously manage research workflows and its cost-efficiency for high-level reasoning, especially in software engineering and professional office tasks. However, be mindful of its proprietary nature and Chinese origin, which may pose compliance considerations for highly regulated or government-facing industries in the U.S. and the West.

Key insights

MiniMax M2.7 introduces self-evolving AI capabilities, autonomously managing significant portions of its own development workflow.

Principles

AI models can be architects of their own progress.
Recursive self-improvement drives faster iteration curves.

Method

Earlier M2.7 versions built a research agent harness to manage data, training, and evaluation, autonomously triggering log-reading, debugging, and metric analysis to optimize programming performance.

In practice

Integrate M2.7 via API for cost-efficient, high-level reasoning.
Utilize M2.7 for professional document workflows and financial modeling.
Explore M2.7's agentic capabilities for SRE and DevOps automation.

Topics

Self-Evolving AI
Large Language Models
AI Agents
Reinforcement Learning
Model Benchmarking

Code references

openai/mle-bench

Best for: CTO, VP of Engineering/Data, Machine Learning Engineer, AI Engineer, MLOps Engineer, Director of AI/ML

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.