Deepfake X-rays are so real even doctors can’t tell the difference
Summary
A study published on March 24 in *Radiology* found that both human radiologists and multimodal large language models (LLMs) struggle to distinguish real X-rays from AI-generated "deepfake" images. The research involved 17 radiologists from 12 institutions across six countries and tested 264 X-ray images, split between real scans and images generated by ChatGPT or RoentGen. When unaware that synthetic images were present, radiologists identified only 41% of deepfakes; once informed, their accuracy rose to 75%. Four LLMs (GPT-4o, GPT-5, Gemini 2.5 Pro, Llama 4 Maverick) achieved accuracies ranging from 52% to 89%. The study found no correlation between a radiologist's experience and detection ability, though musculoskeletal radiologists performed better. The researchers identified visual cues in deepfakes, such as overly smooth bones and unnaturally straight spines. They warned of risks including fraudulent litigation and cybersecurity threats, and recommended safeguards such as invisible watermarks and cryptographic signatures.
Key takeaway
For healthcare executives and IT security leaders evaluating digital imaging infrastructure, this study highlights a critical vulnerability: deepfake X-rays can fool experts. Your organization should prioritize authentication methods such as invisible watermarks and cryptographic signatures to guard against fraudulent litigation and cyberattacks that could manipulate patient diagnoses or corrupt medical records.
Key insights
Deepfake X-rays generated by AI can deceive both expert radiologists and advanced multimodal LLMs, posing significant medical and cybersecurity risks.
Principles
- AI-generated medical images can appear radiographically plausible.
- Awareness improves human deepfake detection accuracy.
- Experience does not guarantee deepfake detection.
Method
Radiologists and LLMs evaluated real X-rays alongside AI-generated ones (produced by ChatGPT or RoentGen) in blinded and unblinded settings, so that the researchers could measure detection accuracy and identify the visual cues that give deepfakes away.
In practice
- Implement invisible watermarks for image authenticity.
- Use cryptographic signatures for image capture verification.
- Train professionals to recognize deepfake visual patterns.
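
As a rough illustration of the signature recommendation above, the sketch below tags image bytes at capture and checks them later. It makes several assumptions: the summary does not describe the scheme the researchers have in mind, so HMAC-SHA256 from Python's standard library stands in for a true public-key signature, and the names `sign_image`, `verify_image`, and `DEVICE_KEY` are hypothetical.

```python
import hashlib
import hmac


def sign_image(image_bytes: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag computed over the raw image bytes."""
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()


def verify_image(image_bytes: bytes, key: bytes, tag: str) -> bool:
    """Recompute the tag and compare it to the stored one in constant time."""
    expected = sign_image(image_bytes, key)
    return hmac.compare_digest(expected, tag)


# Hypothetical values for illustration only; a real device key would be
# provisioned and stored securely, not hard-coded.
DEVICE_KEY = b"example-device-key"
original = b"\x00\x01placeholder pixel data"
tag = sign_image(original, DEVICE_KEY)

# Any change to the bytes, even a single appended byte, invalidates the tag.
tampered = original + b"\xff"

print(verify_image(original, DEVICE_KEY, tag))   # True
print(verify_image(tampered, DEVICE_KEY, tag))   # False
```

HMAC requires the verifier to hold the same secret as the signer. A production system would more plausibly use an asymmetric signature, so that anyone can verify an image without being able to forge one.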
Topics
- Deepfake Medical Imaging
- Radiologist Detection
- Multimodal LLMs
- Medical Imaging Security
- Generative AI Models
Best for: CTO, VP of Engineering/Data, Director of AI/ML, Research Scientist, AI Security Engineer, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Robotics Research News -- ScienceDaily.