Real-world safety and harms from patient-facing LLMs
Summary
An analysis of real-world harms caused by AI chatbots in health contexts identifies three primary data sources: incident reporting, clinical trials, and data from health providers. No government currently mandates incident reporting for chatbots, but platforms such as HumanLine and IncidentDatabase collect user-submitted stories, and peer-reviewed case reports add further evidence. Clinical trials often rely on simulated patients but are beginning to include real-patient studies, such as a trial of Google's AMIE system that monitored 100 patients and found minor hallucinations in 3 cases. Data from health providers is exemplified by a Danish study of 1.5 million psychiatric patients, which found 38 cases of chatbot-induced harm, primarily consolidation of delusions or reinforcement of mania, affecting less than 0.1% of users. The analysis notes that chatbot harms resemble those caused by social media misinformation and overuse, and concludes that chatbots can be engineered for safety but that better data and mandated reporting are crucial.
Key takeaway
For healthcare leaders evaluating AI chatbot integration: current real-world harm data is limited, but it already points to mental health risks such as delusion consolidation. Advocate for government-mandated incident reporting, and explore adapting existing product safety mechanisms to chatbots so that safety can be monitored more reliably. Prioritize solutions that publish transparent safety data and are actively engineered to reduce misinformation, rather than giving up the potential benefits of chatbots.
Key insights
Real-world data on AI chatbot harms in health is limited but growing, highlighting mental health risks.
Principles
- Real patient data is crucial for assessing health chatbot harms.
- Misinformation and overuse are common risks across digital health tools.
Method
Data on real-world harms from health chatbots can be gathered via incident reporting platforms, real-patient clinical trials, and analysis of health provider records, such as psychiatric clinical notes.
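As a rough illustration of the provider-records route, the sketch below screens free-text clinical notes for chatbot mentions so that a clinician can review the flagged notes by hand. It is not the method used in any study summarized here: the `Note` type, the `flag_candidate_notes` function, and the search terms are all hypothetical, and a mention of a chatbot is only a candidate for review, not evidence of harm.

```python
import re
from dataclasses import dataclass

# Hypothetical search terms (assumption); a real study would need a
# validated term list in the language of the notes.
CHATBOT_PATTERN = re.compile(
    r"\b(chatbot|chatgpt|ai companion|language model)s?\b",
    re.IGNORECASE,
)


@dataclass
class Note:
    """Hypothetical minimal record for one clinical note."""
    note_id: str
    text: str


def flag_candidate_notes(notes):
    """Return (note_id, matched_terms) for notes mentioning a chatbot term.

    Flagged notes are candidates for manual clinical review only.
    """
    flagged = []
    for note in notes:
        # Collect distinct matched terms, lowercased and sorted for stable output.
        terms = sorted(
            {m.group(1).lower() for m in CHATBOT_PATTERN.finditer(note.text)}
        )
        if terms:
            flagged.append((note.note_id, terms))
    return flagged


if __name__ == "__main__":
    sample = [
        Note("n1", "Patient reports that ChatGPT confirmed his beliefs."),
        Note("n2", "No sleep for three days."),
    ]
    for note_id, terms in flag_candidate_notes(sample):
        print(note_id, terms)
```

Keyword screening of this kind trades precision for recall: it will surface benign mentions as well, which is why the output is a review queue rather than a count of harms.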
In practice
- Explore HumanLine or IncidentDatabase for reported AI harm cases.
- Review clinical notes for patterns of chatbot-related patient harm.
Topics
- Patient-facing LLMs
- Real-world Harms
- Incident Reporting
- Clinical Trials
- Health Provider Data
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Ethicist, Policy Maker, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Ehud Reiter's Blog.