Deepfake X-rays are so real even doctors can’t tell the difference

2026-03-26 · Source: Robotics Research News -- ScienceDaily · Field: Science & Research — Health & Medical Research, Artificial Intelligence & Machine Learning · Depth: Advanced, short

Summary

A study published on March 24 in *Radiology* reports that both radiologists and multimodal large language models (LLMs) struggle to distinguish real X-rays from AI-generated "deepfake" images. The research involved 17 radiologists from 12 institutions across six countries and tested 264 X-ray images, split between real scans and images generated by ChatGPT or RoentGen. When not told that synthetic images were included, radiologists identified only 41% of deepfakes; once made aware, their accuracy rose to 75%. Four LLMs (GPT-4o, GPT-5, Gemini 2.5 Pro, Llama 4 Maverick) achieved 52% to 89% accuracy. The study found no correlation between a radiologist's experience and detection ability, though musculoskeletal radiologists performed better. Researchers identified visual cues in deepfakes, such as overly smooth bones or unnaturally straight spines. They highlighted risks including fraudulent litigation and cybersecurity threats, and recommended safeguards such as invisible watermarks and cryptographic signatures.

Key takeaway

For healthcare executives and IT security leaders evaluating digital imaging infrastructure, this study exposes a critical vulnerability: deepfake X-rays can fool expert readers. Your organization should prioritize authentication methods such as invisible watermarks and cryptographic signatures to guard against fraudulent litigation and cyberattacks that could manipulate patient diagnoses or corrupt medical records.

Key insights

Deepfake X-rays generated by AI can deceive both expert radiologists and advanced multimodal LLMs, posing significant medical and cybersecurity risks.

Principles

AI-generated medical images can appear radiographically plausible.
Awareness improves human deepfake detection accuracy.
Experience does not guarantee deepfake detection.

Method

The study involved radiologists and LLMs evaluating real and AI-generated X-rays (ChatGPT, RoentGen) in blinded and unblinded settings to assess detection accuracy and identify visual cues.
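
The two figures quoted in the summary measure different things: the share of deepfakes that readers caught, and overall accuracy across real and synthetic images. A minimal sketch of how each could be computed from a reader's calls is below. The labels are made-up toy values, not the study's data, and the function names are hypothetical.

```python
def detection_rate(truth, calls):
    """Share of synthetic images (truth == 1) that the reader flagged as synthetic."""
    flagged = [c for t, c in zip(truth, calls) if t == 1]
    return sum(flagged) / len(flagged)


def accuracy(truth, calls):
    """Share of all images, real or synthetic, that the reader labeled correctly."""
    return sum(t == c for t, c in zip(truth, calls)) / len(truth)


# Toy example only: 1 = synthetic, 0 = real. These are NOT values from the study.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
calls = [1, 0, 1, 0, 0, 0, 1, 0]

print(detection_rate(truth, calls))  # 0.5
print(accuracy(truth, calls))        # 0.625
```

A reader can score differently on the two metrics, which is why it matters which one a reported percentage refers to.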

In practice

Implement invisible watermarks for image authenticity.
Use cryptographic signatures for image capture verification.
Train professionals to recognize deepfake visual patterns.
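
As an illustration of the capture-verification idea, the sketch below tags image bytes at acquisition and rejects any copy that was altered afterwards. It is an assumption-laden stand-in, not the scheme proposed in the study: it uses HMAC-SHA256 from the Python standard library (a shared-key message authentication code) in place of a true asymmetric digital signature, and the key, byte payload, and function names are hypothetical.

```python
import hashlib
import hmac


def sign_image(pixel_bytes: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag computed over the raw image bytes."""
    return hmac.new(key, pixel_bytes, hashlib.sha256).hexdigest()


def verify_image(pixel_bytes: bytes, key: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign_image(pixel_bytes, key)
    return hmac.compare_digest(expected, tag)


# Hypothetical stand-ins: a key held by the imaging device and a fake pixel buffer.
device_key = b"demo-key-held-by-the-scanner"
original = bytes(range(256)) * 4
tag = sign_image(original, device_key)

# Flip a single bit to simulate tampering after capture.
tampered = bytearray(original)
tampered[10] ^= 0x01

print(verify_image(original, device_key, tag))         # True
print(verify_image(bytes(tampered), device_key, tag))  # False
```

A shared-key tag only works when the verifier can be trusted with the key. A production system would more likely use an asymmetric signature so that anyone can verify an image without being able to forge one.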

Topics

Deepfake Medical Imaging
Radiologist Detection
Multimodal LLMs
Medical Imaging Security
Generative AI Models

Best for: CTO, VP of Engineering/Data, Director of AI/ML, Research Scientist, AI Security Engineer, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by Robotics Research News -- ScienceDaily.