Your doctor’s AI notetaker may be making things up, Ontario audit finds

· Source: AI - Ars Technica · Field: Health & Wellbeing — Healthcare Systems & Policy, Clinical Care & Medical Practice, Medical Devices & Health Technology · Depth: Fundamental Awareness, quick

Summary

An audit by the auditor general of Ontario found that AI medical scribes recommended by the provincial government consistently generated incorrect, incomplete, and hallucinated information, which could lead to "inadequate or harmful treatment plans" for patients. A review of 20 approved vendors found that all had accuracy or completeness issues, including hallucinated referrals and incorrectly transcribed medications. The audit also identified a significant flaw in the vendor evaluation rubric: the "accuracy of medical notes" metric accounted for only 4% of a vendor's overall score, so highly inaccurate systems could still be approved. The auditor general recommended more rigorous testing of AI scribe systems and a requirement that doctors confirm they have reviewed generated notes before committing them to patient logs.
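
To make the rubric problem concrete, here is a minimal sketch of how a weighted score behaves when accuracy carries only 4% of the total. Only the 4% figure comes from the audit. The single "everything_else" bucket, the function name, and the two example vendors are hypothetical placeholders, not the actual Ontario rubric or real vendor results.

```python
def weighted_score(scores, weights):
    """Combine per-criterion scores (0.0 to 1.0) with percentage weights into a 0-100 total."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    return sum(scores[name] * weights[name] for name in weights)


# Only the 4% accuracy weight is taken from the audit; lumping the remaining
# 96% into one "everything_else" criterion is an assumption for illustration.
WEIGHTS = {"accuracy_of_medical_notes": 4, "everything_else": 96}

# Hypothetical vendor A: fails accuracy entirely, perfect on everything else.
vendor_a = {"accuracy_of_medical_notes": 0.0, "everything_else": 1.0}

# Hypothetical vendor B: perfectly accurate, slightly weaker elsewhere.
vendor_b = {"accuracy_of_medical_notes": 1.0, "everything_else": 0.9}

print(round(weighted_score(vendor_a, WEIGHTS), 1))  # 96.0
print(round(weighted_score(vendor_b, WEIGHTS), 1))  # 90.4
```

Under these assumed numbers, the vendor with completely inaccurate notes still outscores the fully accurate one, which is the kind of outcome a 4% accuracy weight allows.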

Key takeaway

An Ontario audit found that government-approved AI medical scribes frequently generate incorrect, incomplete, and hallucinated patient information. All 20 audited vendors showed accuracy or completeness issues: 9 hallucinated information and 12 recorded information incorrectly. These systems were approved anyway, because accuracy counted for only 4% of the evaluation score. The result is a serious risk of inadequate or harmful treatment plans, which underscores the need for rigorous testing and mandatory doctor review of AI-generated notes.

Best for: CTO, AI Product Manager, VP of Engineering/Data, Domain Expert, Policy Maker, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by AI - Ars Technica.