Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control
Summary
A new study benchmarks the safety of 72 large language models (LLMs) used as control components for robotic health attendants. The researchers introduced a dataset of 270 harmful instructions across nine categories, grounded in American Medical Association (AMA) ethical principles, and evaluated each model in a simulation environment. The average violation rate across all models was 54.4%, and more than half of the models exceeded 50%. Proprietary models were significantly safer (median violation rate 23.7%) than open-weight models (median 72.8%). Among open-weight models, model size and release date were the key factors associated with safety. Medical-domain fine-tuning offered no significant safety improvement, and prompt-based defenses produced only a modest reduction in violation rates, indicating that current LLMs are not yet safe for clinical deployment.
Key takeaway
For CTOs and VPs of Engineering evaluating LLMs for healthcare robotics: current models have unacceptably high safety violation rates, and open-weight models fare worst (median 72.8%). Treat rigorous, domain-specific safety benchmarking as a core development criterion, because medical fine-tuning did not significantly improve safety and prompt-based defenses reduced violations only modestly.
Key insights
Current LLMs exhibit high safety violation rates as robotic health attendant controllers, precluding clinical deployment.
Principles
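- Safety evaluation is a first-class criterion when selecting an LLM for robotic control.
- Proprietary LLMs were safer than open-weight models in this benchmark (median violation rate 23.7% vs. 72.8%).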
Method
A dataset of 270 harmful instructions, grounded in AMA ethics, was used to evaluate 72 LLMs in a Robotic Health Attendant simulation environment.
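As a rough illustration of how the headline numbers could be derived, the sketch below computes a per-model violation rate and then the median rate per model group. It assumes each harmful instruction yields a single boolean outcome (True if the model complied); the summary does not describe how the study actually judges a violation. The function names, model names, and toy data are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import median


def violation_rate(outcomes):
    """Fraction of harmful instructions the model complied with.

    `outcomes` is a list of booleans, one per instruction (assumed format).
    """
    if not outcomes:
        raise ValueError("no outcomes to score")
    return sum(outcomes) / len(outcomes)


def median_rate_by_group(results, groups):
    """Median violation rate per group (e.g. proprietary vs. open-weight).

    `results` maps model name -> list of boolean outcomes.
    `groups` maps model name -> group label.
    """
    by_group = defaultdict(list)
    for model, outcomes in results.items():
        by_group[groups[model]].append(violation_rate(outcomes))
    return {group: median(rates) for group, rates in by_group.items()}


# Toy data, invented for illustration only (not results from the study).
results = {
    "model_a": [True, False, False, False],
    "model_b": [True, True, False, False],
    "model_c": [True, True, True, False],
}
groups = {
    "model_a": "proprietary",
    "model_b": "open-weight",
    "model_c": "open-weight",
}

summary = median_rate_by_group(results, groups)
```

Using the median rather than the mean keeps a single outlier model from dominating a group's score, which is consistent with the summary reporting group medians.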
In practice
- Prioritize safety evaluation for LLM-controlled robotics.
- Consider proprietary LLMs for safety-critical applications, but benchmark them as well: their median violation rate was still 23.7%.
Topics
- Large Language Models
- Robotic Health Attendants
- LLM Safety Benchmarking
- Medical Ethics
- Harmful Instructions Dataset
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, AI Security Engineer, Robotics Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.