Boston Dynamics’ robot dog now reads gauges and thermometers with Google's AI
Summary
Boston Dynamics' Spot robot can now read analog instruments such as gauges and thermometers in industrial facilities using Google DeepMind's Gemini Robotics-ER 1.6 model, announced April 14. The model targets "embodied reasoning" for robots that interact with physical environments and serves as the high-level reasoning layer for task planning and execution. It raises instrument-reading accuracy from 23% (Gemini Robotics-ER 1.5) to 98% by incorporating "agentic vision," a feature from Google's Gemini 3.0 Flash model. Gemini Robotics-ER 1.6 also improves multi-view reasoning, reduces "hallucination" in object identification, and is described as Google's "safest robotics model yet," with stronger adherence to physical safety constraints and better perception of human injury risk.
Key takeaway
For computer vision engineers developing autonomous inspection systems, Gemini Robotics-ER 1.6's reported 98% accuracy on analog instrument reading and its improved safety features suggest that robotic inspection is becoming more practical to deploy. Consider evaluating agentic vision models to strengthen visual reasoning and reduce errors in complex industrial environments, particularly where human safety is a critical concern.
Key insights
Google DeepMind's Gemini Robotics-ER 1.6 model improves robot perception (instrument reading, multi-view reasoning, and object identification) and safety for complex industrial inspections.
Principles
- Agentic vision improves visual reasoning.
- Multi-view reasoning enhances environmental understanding.
- Safety constraints are critical for robot deployment.
Method
Gemini Robotics-ER 1.6 uses "agentic vision," which combines visual reasoning with code execution. The model works in a "visual scratchpad" where it can inspect and manipulate images, and it uses a pointing process for complex visual tasks.
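To make the idea of pairing pointing with code execution more concrete, here is a minimal sketch of the kind of calculation a code-execution step could run once key points on a dial gauge have been located. It is an illustration only, not Google's implementation: the function names `angle_deg` and `read_gauge`, the four pointed locations (dial center, needle tip, minimum tick, maximum tick), and the assumption of a linear, clockwise scale are all hypothetical.

```python
import math


def angle_deg(center, point):
    """Angle of `point` around `center` in degrees, in [0, 360).

    Uses image coordinates (x right, y down), so the angle grows
    clockwise as seen on screen.
    """
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def read_gauge(center, needle_tip, min_tick, max_tick, min_value, max_value):
    """Estimate a dial reading from four pointed pixel locations.

    Assumes (hypothetically) a linear scale that runs clockwise from
    `min_tick` to `max_tick`.
    """
    start = angle_deg(center, min_tick)
    sweep = (angle_deg(center, max_tick) - start) % 360.0
    needle = (angle_deg(center, needle_tip) - start) % 360.0
    if sweep == 0.0:
        raise ValueError("min and max ticks are at the same angle")
    if needle > sweep + 1e-6:
        raise ValueError("needle lies outside the marked scale")
    fraction = min(needle / sweep, 1.0)
    return min_value + fraction * (max_value - min_value)


if __name__ == "__main__":
    # Hypothetical pixel coordinates for a 0-10 gauge with a 270-degree sweep.
    reading = read_gauge(
        center=(100, 100),
        needle_tip=(100, 30),   # needle pointing straight up
        min_tick=(30, 170),     # lower left
        max_tick=(170, 170),    # lower right
        min_value=0.0,
        max_value=10.0,
    )
    print(reading)  # about 5.0: the needle sits at mid-scale
```

Rejecting a needle that falls outside the marked sweep, rather than clamping it, is a deliberate choice in this sketch: in an inspection setting a bad point estimate is safer to flag than to report as a plausible reading.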
In practice
- Deploy robots for analog instrument reading.
- Utilize multi-camera systems for environmental context.
- Prioritize AI models with advanced safety features.
Topics
- Gemini Robotics-ER 1.6
- Boston Dynamics Spot
- Industrial Inspection
- Agentic Vision
- Embodied Reasoning
Best for: Computer Vision Engineer, Research Scientist, Robotics Engineer, AI Engineer, AI Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by AI - Ars Technica.