Scientists invented a fake disease. AI told people it was real

· Source: Machine learning : nature.com subject feeds · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Intermediate, medium

Summary

A medical researcher, Almira Osmanovic Thunström, and her team at the University of Gothenburg, Sweden, invented a non-existent skin condition called "bixonimania" and uploaded two fake studies about it to a preprint server in early 2024. This experiment aimed to test whether large language models (LLMs) would disseminate misinformation. Within weeks, major AI systems like Microsoft Copilot, Google Gemini, Perplexity AI, and OpenAI's ChatGPT began repeating the invented condition as real, advising users on symptoms and treatments. More concerningly, these fake papers were subsequently cited in peer-reviewed literature, including a study in *Cureus* (later retracted), suggesting some researchers rely on AI-generated references without verification. The experiment highlighted LLMs' susceptibility to professional-looking misinformation and the broader challenge of maintaining scientific integrity in the age of AI.

Key takeaway

For CTOs and VPs of Engineering overseeing AI development, this experiment underscores the critical need for advanced misinformation detection and content validation within LLMs. Your teams should prioritize developing and integrating robust fact-checking mechanisms, especially for sensitive domains like health, to prevent the propagation of fabricated information and safeguard user trust. Implement rigorous testing protocols that include adversarial examples to stress-test model reliability.

Key insights

LLMs readily propagate fabricated medical conditions presented in professional-looking academic formats, impacting both public advice and scientific literature.

Principles

Method

Researchers fabricated a medical condition, "bixonimania," created fake preprints with deliberate red flags, and uploaded them to a preprint server to observe LLM and human citation behavior.

In practice

Topics

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, AI Ethicist, Research Scientist

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.