LLMs help robots understand vague instructions and focus on key details
Summary
MIT CSAIL has developed "Masked Inverse Reinforcement Learning" (Masked IRL), a novel approach published on June 26, 2026, that uses two large language models to enhance robot understanding of vague instructions and improve focus on critical details. The first LLM clarifies ambiguous user prompts, such as expanding "stay close" to "stay close to the surface of the table," based on kinesthetic demonstration data. A second LLM then evaluates environmental details, masking irrelevant elements (scoring them "0") and incorporating crucial ones ("1") into the robot's motion plan. This method significantly reduces the need for extensive training, requiring nearly five times less demonstration data. Masked IRL demonstrated up to a 15 percent improvement over comparable baselines in correctly identifying implicit user preferences, enabling robots to safely maneuver objects around obstacles in both simulated and real-world tasks, even with previously unseen prompts.
Key takeaway
If you are a robotics engineer grappling with ambiguous user instructions or heavy data requirements, consider a dual-LLM architecture like Masked IRL. This approach lets your autonomous systems interpret vague commands and filter out environmental noise, cutting the kinesthetic demonstration data they need by nearly a factor of five. Your robots can learn tasks more efficiently and safely maneuver objects around obstacles the instruction never mentions, with up to a 15 percent improvement over comparable baselines in identifying implicit user preferences.
Key insights
Masked IRL uses dual LLMs to clarify vague robot instructions and filter environmental noise, reducing demonstration data needs.
Principles
- Ambiguous instructions require LLM-guided elaboration.
- Irrelevant environmental details can be masked for focus.
- Kinesthetic demonstrations are effective for robot training.
Method
Masked IRL uses one LLM to expand a vague prompt into an explicit instruction, drawing on the kinesthetic demonstration trajectories. A second LLM then scores each environmental detail as 1 (important) or 0 (irrelevant), so that irrelevant details are masked out and only the important ones feed into motion planning.
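The two stages can be illustrated with a minimal sketch. It is not the authors' implementation: both LLM calls are replaced by hard-coded stubs, and every name here (`clarify_instruction`, `score_relevance`, `apply_mask`, `FEATURE_KEYWORDS`, and the feature names) is a hypothetical chosen for illustration.

```python
from typing import Dict

# Hypothetical environment features and the object each one refers to.
FEATURE_KEYWORDS: Dict[str, str] = {
    "table_distance": "table",
    "laptop_distance": "laptop",
    "human_distance": "human",
}

def clarify_instruction(prompt: str, demo_summary: str) -> str:
    """Stage 1 (stub for the first LLM): expand a vague prompt using the demonstration."""
    if prompt == "stay close" and "table" in demo_summary:
        return "stay close to the surface of the table"
    return prompt  # nothing to expand in this toy stub

def score_relevance(instruction: str) -> Dict[str, int]:
    """Stage 2 (stub for the second LLM): 1 if a feature matters to the instruction, else 0."""
    return {
        name: int(keyword in instruction)
        for name, keyword in FEATURE_KEYWORDS.items()
    }

def apply_mask(state: Dict[str, float], mask: Dict[str, int]) -> Dict[str, float]:
    """Zero out every feature the mask marks as irrelevant."""
    return {name: value * mask[name] for name, value in state.items()}

# Assumed text summary of a kinesthetic demonstration.
demo_summary = "the arm moved the cup low over the table"
instruction = clarify_instruction("stay close", demo_summary)
mask = score_relevance(instruction)

state = {"table_distance": 0.05, "laptop_distance": 0.4, "human_distance": 1.2}
masked_state = apply_mask(state, mask)
```

In this toy run only `table_distance` survives the mask, so a downstream planner would see the laptop and the human as zeros rather than as signals to fit.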
In practice
- Train robots with kinesthetic demonstrations.
- Use LLMs to clarify implicit user preferences.
- Filter environmental data for task-relevant information.
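As a rough illustration of the last point, the sketch below scores candidate trajectories with a masked reward and picks the best one. The linear reward form, the weights, and all names (`masked_reward`, `trajectory_return`, the feature names) are assumptions made for this example and are not taken from the original article.

```python
from typing import Dict, List

Step = Dict[str, float]

def masked_reward(features: Step, weights: Dict[str, float], mask: Dict[str, int]) -> float:
    """Assumed linear reward: features with mask 0 contribute nothing."""
    return sum(weights[name] * mask[name] * value for name, value in features.items())

def trajectory_return(trajectory: List[Step], weights: Dict[str, float], mask: Dict[str, int]) -> float:
    """Sum the masked reward over every step of a trajectory."""
    return sum(masked_reward(step, weights, mask) for step in trajectory)

# Hypothetical weights: being far from the table is penalized.
weights = {"table_distance": -1.0, "laptop_distance": 0.5}
# Mask of the kind a relevance-scoring LLM might return for "stay close to the table".
mask = {"table_distance": 1, "laptop_distance": 0}

low_path = [
    {"table_distance": 0.1, "laptop_distance": 0.2},
    {"table_distance": 0.1, "laptop_distance": 0.9},
]
high_path = [
    {"table_distance": 0.5, "laptop_distance": 0.2},
    {"table_distance": 0.6, "laptop_distance": 0.9},
]

# Pick the candidate with the highest masked return.
best = max([low_path, high_path], key=lambda t: trajectory_return(t, weights, mask))

# Changing a masked feature leaves the return unchanged.
shifted_path = [dict(step, laptop_distance=5.0) for step in low_path]
```

Because the laptop feature is masked, `shifted_path` scores the same as `low_path`. That invariance is one plausible reason a masked model would need fewer demonstrations: it has no irrelevant variation to explain.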
Topics
- Masked Inverse Reinforcement Learning
- Large Language Models
- Robot Instruction
- Kinesthetic Demonstration
- Motion Planning
- Human-Robot Interaction
Best for: Research Scientist, AI Scientist, Robotics Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by MIT News - Artificial intelligence.