RobotValues: Evaluating Household Robots When Human Values Conflict

Source: Artificial Intelligence · Field: Technology & Digital (Robotics & Autonomous Systems, Artificial Intelligence & Machine Learning) · Depth: Expert, quick

Summary

RobotValues is a new benchmark that evaluates household robot planners on 10,000 value-conflict scenarios. Each scenario pairs a realistic household image with multiple plausible robot actions, each prioritizing a different human value such as autonomy, efficiency, or social appropriateness, so that the evaluation goes beyond mere task success. The benchmark was constructed using LLM-assisted scenario generation, stakeholder-grounded value extraction, image generation, and automatic quality control. Evaluations on RobotValues show that Vision-Language Models (VLMs) commonly used in robotics exhibit default value preferences, favoring safety and accommodation while consistently underselecting privacy-prioritizing actions. Furthermore, when instructed to prioritize a specific conflicting value, these models frequently fail to override their default actions, choosing incorrect actions 80% of the time. These findings underscore the need for household robot evaluation to cover value-based decision-making, not just task completion or safety compliance.
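The two findings above correspond to two simple measurements: how often a model picks each value's action when given no value instruction, and how often it picks the wrong action when a value is instructed. The Python sketch below shows one way such rates could be computed. The `Trial` record, its field names, and the toy data are assumptions made for illustration; they are not the benchmark's actual data format or scoring code.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    """One model response on one scenario (hypothetical record layout)."""
    chosen_value: str                # value prioritized by the action the model picked
    instructed_value: Optional[str]  # value the prompt asked for, or None if uninstructed


def default_selection_rates(trials):
    """Share of uninstructed trials in which each value's action was chosen."""
    uninstructed = [t for t in trials if t.instructed_value is None]
    total = len(uninstructed)
    if total == 0:
        return {}
    counts = Counter(t.chosen_value for t in uninstructed)
    return {value: n / total for value, n in counts.items()}


def override_failure_rate(trials):
    """Share of instructed trials where the chosen action ignored the instruction."""
    instructed = [t for t in trials if t.instructed_value is not None]
    if not instructed:
        return 0.0
    failures = sum(t.chosen_value != t.instructed_value for t in instructed)
    return failures / len(instructed)


# Toy data, made up purely for illustration.
trials = [
    Trial("safety", None),
    Trial("safety", None),
    Trial("accommodation", None),
    Trial("privacy", None),
    Trial("safety", "privacy"),   # instructed privacy, model kept its default
    Trial("privacy", "privacy"),  # instruction followed
]

rates = default_selection_rates(trials)
failure = override_failure_rate(trials)
```

Separating uninstructed from instructed trials keeps the two questions apart: the first function describes a model's default preferences, while the second measures whether an explicit instruction actually changes its choice.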

Key takeaway

If you are an AI scientist or robotics engineer developing household robots, integrate value-conflict resolution into your planning and evaluation frameworks. Your current Vision-Language Models likely carry default value biases, such as prioritizing safety over privacy, and struggle to follow explicit value instructions. Focus on building robust mechanisms that let robots reliably prioritize instructed human values, even when those values conflict with the model's inherent preferences, so that robot behavior stays socially appropriate and aligned with the user.

Key insights

Household robots must navigate value conflicts, but current VLMs struggle to prioritize instructed values over their defaults.

Method

RobotValues was constructed via LLM-assisted scenario generation, stakeholder-grounded value extraction, image generation, and automatic quality control.
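To make the shape of a benchmark item concrete, the sketch below models a scenario as an image plus candidate actions, each tagged with the value it prioritizes, and applies one basic structural check. All class names, fields, and the check itself are hypothetical; the source does not describe the dataset schema or the rules used in its automatic quality control.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Action:
    """A candidate robot action (hypothetical schema)."""
    description: str
    prioritized_value: str  # e.g. "autonomy", "efficiency", "privacy"


@dataclass
class Scenario:
    """A value-conflict scenario (hypothetical schema)."""
    image_path: str
    situation: str
    actions: List[Action] = field(default_factory=list)


def has_value_conflict(scenario):
    """True if the candidate actions prioritize at least two different values."""
    values = {a.prioritized_value for a in scenario.actions}
    return len(values) >= 2


# Illustrative example, not taken from the benchmark.
conflict = Scenario(
    image_path="kitchen.png",
    situation="A guest left a phone unlocked on the counter.",
    actions=[
        Action("Move the phone away from the sink", "safety"),
        Action("Leave the phone untouched and notify the owner", "privacy"),
    ],
)
no_conflict = Scenario(
    image_path="hallway.png",
    situation="A box blocks the hallway.",
    actions=[Action("Move the box aside", "safety")],
)
```

A filter like `has_value_conflict` only guards against degenerate items; judging whether each action is plausible or whether the image matches the situation would need separate checks.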

Best for: Research Scientist, Robotics Engineer, AI Scientist, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.