Mind DeepResearch Technical Report

2026-04-21 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

Mind DeepResearch (MindDR) is an efficient multi-agent deep research framework that achieves leading performance with models of approximately 30B parameters. It uses a collaborative three-agent architecture (a Planning Agent, a DeepSearch Agent, and a Report Agent) together with a four-stage, agent-specialized training pipeline: SFT cold-start, Search-RL, Report-RL, and preference alignment. MindDR reports 45.7% on BrowseComp-ZH, 42.8% on BrowseComp, 46.5% on WideSearch, 75.0% on xbench-DS, and 52.5% on DeepResearch Bench. The report also introduces MindDR Bench, a new benchmark of 500 real-world Chinese queries, on which MindDR scores 51.8. The system has been deployed as an online product at Li Auto.

Key takeaway

For NLP Engineers and Research Scientists developing deep research agents, MindDR suggests that strong performance is achievable with smaller models by combining a multi-agent architecture with a phased training curriculum. Consider decomposing your agent's capabilities into specialized modules and applying targeted reinforcement learning stages, rather than monolithic end-to-end training, to optimize for both efficiency and accuracy.

Key insights

MindDR reaches leading deep research performance with roughly 30B-parameter models through a specialized multi-agent architecture and staged training.

Principles

Decompose complex problems into specialized subtasks.
Tailor training phases to specific agent capabilities.
Combine synthetic and real-world data for robustness.

Method

MindDR employs a three-agent inference pipeline (Planning, DeepSearch, Report) and a four-stage training pipeline (SFT cold-start, Search-RL, Report-RL, and preference alignment). Training relies on step-level credit assignment and rubric-based rewards.
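
To make the three-agent flow concrete, here is a minimal Python sketch of a Planning, DeepSearch, and Report hand-off. It is an illustration under stated assumptions, not MindDR's implementation: the `ResearchState` structure, the function names, the split-on-"and" planner, and the `fake_search` tool are all hypothetical stand-ins for components that would be model calls and real search tools in practice.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResearchState:
    """Shared state passed between the three agents (hypothetical structure)."""
    query: str
    sub_questions: List[str] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)
    report: str = ""


def planning_agent(state: ResearchState) -> ResearchState:
    # Assumption: planning decomposes the query into sub-questions.
    # A real planner would call a language model; this stub splits on " and ".
    state.sub_questions = [p.strip() for p in state.query.split(" and ") if p.strip()]
    return state


def deepsearch_agent(state: ResearchState, search: Callable[[str], str]) -> ResearchState:
    # Assumption: one search call per sub-question; `search` is a caller-supplied tool.
    state.evidence = [search(q) for q in state.sub_questions]
    return state


def report_agent(state: ResearchState) -> ResearchState:
    # Assumption: the report lists each sub-question next to its evidence.
    lines = [f"- {q}: {e}" for q, e in zip(state.sub_questions, state.evidence)]
    state.report = "\n".join(lines)
    return state


def run_pipeline(query: str, search: Callable[[str], str]) -> ResearchState:
    """Run Planning, then DeepSearch, then Report, in that order."""
    state = ResearchState(query=query)
    state = planning_agent(state)
    state = deepsearch_agent(state, search)
    return report_agent(state)


def fake_search(question: str) -> str:
    # Hypothetical search tool that returns a canned string.
    return f"stub result for '{question}'"
```

Calling `run_pipeline("battery range and charging time", fake_search)` yields two sub-questions, one evidence string per sub-question, and a two-line report. Keeping each agent as a separate function with a shared state object is one simple way to let each stage be trained or swapped independently.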

In practice

Use multi-agent systems for complex research tasks.
Implement staged RL training for diverse capabilities.
Develop custom benchmarks with real user queries.
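
For the staged RL item above, one ingredient named under Method is a rubric-based reward. The sketch below shows one plausible form: a weighted checklist whose satisfied weight, divided by the total weight, becomes a scalar reward in [0, 1]. The summary does not describe MindDR's actual rubric, so the criteria, weights, and the `rubric_reward` function here are assumptions chosen only for illustration.

```python
from typing import Callable, List, Tuple

# Each criterion: (description, weight, check function over the report text).
Criterion = Tuple[str, float, Callable[[str], bool]]


def rubric_reward(report: str, rubric: List[Criterion]) -> float:
    """Return the weighted fraction of rubric criteria the report satisfies."""
    total = sum(weight for _, weight, _ in rubric)
    if total <= 0:
        raise ValueError("rubric weights must sum to a positive value")
    earned = sum(weight for _, weight, check in rubric if check(report))
    return earned / total


# Hypothetical rubric; a real one would likely use model-based graders.
EXAMPLE_RUBRIC: List[Criterion] = [
    ("cites at least one source", 2.0, lambda r: "http" in r),
    ("has a conclusion", 1.0, lambda r: "Conclusion" in r),
    ("is at least 20 words long", 1.0, lambda r: len(r.split()) >= 20),
]
```

With this rubric, a short report such as `"Conclusion: see http://example.com"` satisfies the first two criteria (weight 3 of 4) and receives a reward of 0.75. Separating the checks from the scoring function makes it straightforward to use different rubrics at different training stages.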

Topics

Mind DeepResearch
Multi-Agent Framework
Multi-stage Training Pipeline
Deep Research Agents
Search Reinforcement Learning

Best for: NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.