Embodied-R1.5: Evolving Physical Intelligence via Embodied Foundation Models

Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

Embodied-R1.5 is a unified Embodied Foundation Model (EFM) that integrates embodied reasoning (cognition, task planning, correction, and pointing) within a single architecture aimed at general physical intelligence. Three automated data construction pipelines produce a large-scale data system of over 15B tokens, and a multi-task balanced RL recipe manages conflicts between heterogeneous tasks. A Planner-Grounder-Corrector (PGC) closed-loop framework enables autonomous execution and self-correction on long-horizon tasks. With only 8B parameters, Embodied-R1.5 achieves state-of-the-art performance on 16 of 24 embodied VLM benchmarks, outperforming models such as Gemini-Robotics-ER-1.5 and GPT-5.4. It can also be fine-tuned into a Vision-Language-Action (VLA) model with minimal data, surpassing $\pi_{0.5}$ across four popular manipulation benchmark suites. Extensive zero-shot real-robot experiments confirm strong generalization in instruction following and complex manipulation. The project open-sources the model weights, datasets, training code, and EmbodiedEvalKit.

Key takeaway

For robotics engineers building embodied AI systems, Embodied-R1.5 is a candidate foundation model for general physical intelligence. At 8B parameters it reports state-of-the-art results on 16 of 24 embodied VLM benchmarks and strong zero-shot generalization on real robots. The open-sourced model weights, datasets, training code, and EmbodiedEvalKit can be used to develop and evaluate long-horizon, self-correcting robotic tasks.

Key insights

Embodied-R1.5 unifies cognition, task planning, correction, and pointing in a single 8B-parameter EFM, reaching state-of-the-art results on 16 of 24 embodied VLM benchmarks and generalizing zero-shot to real robots.

Principles

Method

The Planner-Grounder-Corrector (PGC) framework enables autonomous execution and self-correction by integrating planning, grounding, and corrective actions in a closed loop.
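The closed loop described above can be illustrated with a minimal Python sketch. The summary does not specify the model's interfaces, so everything here is an assumption made for illustration: the function `run_pgc`, the `StepResult` record, the `max_retries` budget, and the toy `plan`/`ground`/`execute`/`correct` callables are all hypothetical stand-ins, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[int, int]  # hypothetical 2D pointing target (pixel coordinates)


@dataclass
class StepResult:
    subtask: str      # final (possibly corrected) subtask text
    point: Point      # last grounded target
    attempts: int     # number of execution attempts
    succeeded: bool


def run_pgc(
    instruction: str,
    plan: Callable[[str], List[str]],
    ground: Callable[[str], Point],
    execute: Callable[[str, Point], bool],
    correct: Callable[[str, Point], str],
    max_retries: int = 2,
) -> List[StepResult]:
    """Plan subtasks, ground each to a point, execute, and correct on failure."""
    results: List[StepResult] = []
    for subtask in plan(instruction):
        current, attempts = subtask, 0
        while True:
            attempts += 1
            point = ground(current)          # Grounder: subtask -> target point
            ok = execute(current, point)     # environment feedback
            if ok or attempts > max_retries:
                break
            current = correct(current, point)  # Corrector: revise and retry
        results.append(StepResult(current, point, attempts, ok))
        if not ok:
            break  # stop the long-horizon task when a step cannot be recovered
    return results


# Toy stand-ins (assumptions): a real system would call the model and a robot.
def toy_plan(instruction: str) -> List[str]:
    return [s.strip() for s in instruction.split(" then ")]


def toy_ground(subtask: str) -> Point:
    return (len(subtask), 0)


def toy_execute(subtask: str, point: Point) -> bool:
    # Simulate a first-attempt failure on any subtask that mentions the cup.
    return "cup" not in subtask or subtask.startswith("retry: ")


def toy_correct(subtask: str, point: Point) -> str:
    return "retry: " + subtask


results = run_pgc(
    "pick up the cup then place it on the shelf",
    toy_plan, toy_ground, toy_execute, toy_correct,
)
for r in results:
    print(r)
```

In this toy run the first subtask fails once, is revised by the corrector, and succeeds on the second attempt, while the second subtask succeeds immediately. The retry budget and the choice to abort on an unrecoverable step are design assumptions of the sketch.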

Best for: Research Scientist, AI Scientist, Robotics Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.