Tongyi DeepResearch Technical Report

2024-08-06 · Source: cs.MA updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, long

Summary

Tongyi DeepResearch is an agentic large language model from Alibaba Group's Tongyi Lab, designed for long-horizon, deep information-seeking research tasks. The model has 30.5 billion parameters in total but activates only 3.3 billion per token. It achieves state-of-the-art performance across agentic deep research benchmarks, including Humanity’s Last Exam (32.9), BrowseComp (43.4), and xbench-DeepSearch (75.0). It is trained with an end-to-end framework that combines agentic mid-training and post-training, supported by a scalable, fully automatic data synthesis pipeline that removes the need for costly human annotation. Customized environments keep interactions stable and consistent throughout training. The model, framework, and complete solutions are open-sourced to help the community advance AI research capabilities.
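As a quick check on the sparsity figures in the summary, the share of parameters active per token follows directly from the two reported counts. The function and variable names below are illustrative only.

```python
# Parameter counts as reported in the summary (in billions).
TOTAL_PARAMS_B = 30.5
ACTIVE_PARAMS_B = 3.3


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the fraction of parameters activated per token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"{fraction:.1%} of parameters are active per token")  # prints 10.8% ...
```

Roughly one parameter in nine is used for any given token, which is why inference cost tracks the 3.3 billion figure rather than the 30.5 billion total.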

Key takeaway

For research scientists and NLP engineers building advanced AI agents, Tongyi DeepResearch offers a robust, open-source framework. Its combination of agentic mid-training and synthetic data demonstrates a path to scalable, high-performing deep research agents. Explore the open-sourced model and framework to accelerate your own agentic system development, particularly for complex, multi-step information-seeking tasks.

Key insights

Tongyi DeepResearch is an open-source agentic LLM for deep research, trained with synthetic data and specialized environments.

Principles

Integrate mid-training to give the model an agentic inductive bias.
Prioritize synthetic data for scalable agent training.
Design environments that are coupled with the training process.

Method

Tongyi DeepResearch uses an end-to-end training framework with agentic mid-training and post-training. It employs a fully automated data synthesis pipeline and stage-specific environments (Prior World, Simulated, Real-world) to cultivate autonomous research capabilities.
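One way to picture stage-specific environments is a shared interface that each training stage implements differently. The sketch below is an illustration under loud assumptions, not the report's code: the `Environment` protocol, the `SimulatedEnvironment` class, and its in-memory corpus are all hypothetical names. It shows only why an offline, simulated environment can give stable and repeatable observations: the same query always returns the same result.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Environment(Protocol):
    """Hypothetical interface: the agent sends a tool call, gets an observation."""

    def step(self, tool: str, argument: str) -> str: ...


@dataclass
class SimulatedEnvironment:
    """Hypothetical offline environment backed by a fixed in-memory corpus."""

    corpus: dict[str, str] = field(default_factory=dict)

    def step(self, tool: str, argument: str) -> str:
        if tool != "search":
            return f"unsupported tool: {tool}"
        # Deterministic lookup: no network access, so repeated calls are identical.
        hits = [
            text
            for title, text in sorted(self.corpus.items())
            if argument.lower() in title.lower()
        ]
        return "\n".join(hits) if hits else "no results"


env: Environment = SimulatedEnvironment(
    corpus={"Mixture of experts": "Sparse models route each token to a few experts."}
)
first = env.step("search", "mixture")
second = env.step("search", "mixture")
```

A real-world environment would implement the same `step` method against live services, trading this repeatability for realism.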

In practice

Use the Search, Visit, and Python Interpreter tools.
Integrate Google Scholar for academic retrieval.
Parse various file types with the File Parser tool.
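The tool list above implies a dispatch loop in which the model picks a tool by name and receives an observation back. What follows is a minimal sketch with stubbed tools, under stated assumptions: the function names, the `TOOLS` registry, and the scripted `plan` standing in for model output are all hypothetical, and none of this reproduces the Alibaba-NLP/DeepResearch API.

```python
import contextlib
import io
from typing import Callable


def search(query: str) -> str:
    """Stub: a real tool would call a search backend."""
    return f"[stub results for '{query}']"


def visit(url: str) -> str:
    """Stub: a real tool would fetch and summarize the page."""
    return f"[stub page content from {url}]"


def python_interpreter(code: str) -> str:
    """Run code and capture stdout. Unsandboxed: for illustration only."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue().strip()


# Hypothetical registry mapping tool names to callables.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": search,
    "visit": visit,
    "python": python_interpreter,
}


def run_agent(plan: list[tuple[str, str]]) -> list[str]:
    """Execute scripted (tool, argument) steps and collect observations."""
    observations = []
    for tool_name, argument in plan:
        tool = TOOLS.get(tool_name)
        if tool is None:
            observations.append(f"unknown tool: {tool_name}")
            continue
        observations.append(tool(argument))
    return observations


# The scripted plan stands in for tool calls a model would emit.
plan = [
    ("search", "sparse mixture of experts"),
    ("visit", "https://example.com/paper"),
    ("python", "print(round(3.3 / 30.5, 3))"),
]
observations = run_agent(plan)
```

In an actual agent, each observation would be appended to the model's context before it chooses the next tool, and the interpreter would run in a sandbox rather than through a bare `exec`.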

Topics

Agentic Large Language Models
End-to-End Agent Training
Synthetic Data Generation
Deep Research Benchmarks
Autonomous Information Seeking

Code references

Alibaba-NLP/DeepResearch

Best for: Research Scientist, NLP Engineer, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.