Scientists invented a fake disease. AI told people it was real
Summary
A medical researcher, Almira Osmanovic Thunström, and her team at the University of Gothenburg, Sweden, invented a skin condition called "bixonimania" and uploaded two fake studies about it to a preprint server in early 2024. The experiment tested whether large language models (LLMs) would spread misinformation. Within weeks, major AI systems, including Microsoft Copilot, Google Gemini, Perplexity AI, and OpenAI's ChatGPT, began describing the invented condition as real and advising users on symptoms and treatments. More worryingly, the fake papers were later cited in peer-reviewed literature, including a study in *Cureus* that has since been retracted, which suggests that some researchers rely on AI-generated references without verifying them. The experiment highlighted how susceptible LLMs are to professional-looking misinformation, and the broader challenge of maintaining scientific integrity in the age of AI.
Key takeaway
For CTOs and VPs of Engineering overseeing AI development, this experiment underscores the need for misinformation detection and content validation in LLM-based products. Teams should prioritize building and integrating fact-checking mechanisms, especially for sensitive domains like health, to prevent the spread of fabricated information and protect user trust. Test suites should include adversarial examples, such as fabricated conditions presented in realistic academic formats, to stress-test model reliability.
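One way to make adversarial testing concrete is a "canary" check: ask the model about a term known to be fabricated and fail the test if the answer does not flag it as unverified. The sketch below is illustrative only. The function names, the `ask` callable, and the marker phrases are all assumptions, and substring matching is a crude stand-in for a proper evaluation.

```python
from typing import Callable, Iterable, List

# Assumed phrases that suggest the model questioned the term.
# A real evaluation would need something stronger than substring matching.
DOUBT_MARKERS = (
    "no evidence",
    "could not find",
    "not a recognized",
    "not a real",
    "fabricated",
    "fictional",
)


def flags_as_unverified(answer: str) -> bool:
    """Return True if the answer contains at least one doubt marker."""
    text = answer.lower()
    return any(marker in text for marker in DOUBT_MARKERS)


def run_canary_suite(ask: Callable[[str], str], terms: Iterable[str]) -> List[str]:
    """Query the model about each fabricated term; return the terms it accepted.

    `ask` is a hypothetical wrapper around whichever model is under test.
    """
    failures = []
    for term in terms:
        answer = ask(f"What are the symptoms and treatment of {term}?")
        if not flags_as_unverified(answer):
            failures.append(term)
    return failures


# Stub "models" for demonstration; replace with a call to the real system.
def credulous_model(question: str) -> str:
    return "It is a common condition; see a doctor."


def skeptical_model(question: str) -> str:
    return "I could not find evidence that this is a recognized condition."


print(run_canary_suite(credulous_model, ["bixonimania"]))  # -> ['bixonimania']
print(run_canary_suite(skeptical_model, ["bixonimania"]))  # -> []
```

A non-empty result means the model treated at least one invented term as real. Keeping the list of canary terms private helps prevent them from leaking into training data and invalidating the test.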
Key insights
LLMs readily propagate fabricated medical conditions presented in professional-looking academic formats, impacting both public advice and scientific literature.
Principles
- LLMs struggle to filter misinformation.
- Professional-looking formatting makes fabricated content more likely to be repeated as fact.
- Trust in scientific literature is vulnerable.
Method
Researchers fabricated a medical condition, "bixonimania," created fake preprints with deliberate red flags, and uploaded them to a preprint server to observe LLM and human citation behavior.
In practice
- Verify AI-generated medical information.
- Scrutinize references from LLM outputs.
- Implement robust content validation for LLMs.
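The checks above could begin with a simple screening pass over references extracted from an LLM answer. This is a minimal sketch under stated assumptions: the `Reference` record, its field names, and the source-type labels are hypothetical, and in practice the metadata would have to come from a trusted bibliographic source rather than from the model itself.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Reference:
    """Hypothetical record for one reference cited in an LLM answer."""
    title: str
    source_type: str          # assumed labels: "journal", "preprint", "unknown"
    retracted: bool = False
    doi: str = ""


def screen_reference(ref: Reference) -> List[str]:
    """Return the reasons a reference should be checked by a human."""
    reasons = []
    if not ref.doi:
        reasons.append("no DOI to resolve")
    if ref.source_type == "preprint":
        reasons.append("preprint, not peer reviewed")
    elif ref.source_type != "journal":
        reasons.append("unrecognized source type")
    if ref.retracted:
        reasons.append("marked as retracted")
    return reasons


def needs_review(refs: Iterable[Reference]) -> Dict[str, List[str]]:
    """Map each flagged reference title to its list of reasons."""
    flagged = {}
    for ref in refs:
        reasons = screen_reference(ref)
        if reasons:
            flagged[ref.title] = reasons
    return flagged


# Placeholder records, invented purely for illustration.
example_refs = [
    Reference("Example journal article", "journal", doi="10.0000/example-1"),
    Reference("Example preprint", "preprint", doi="10.0000/example-2"),
    Reference("Example retracted article", "journal", retracted=True),
]

for title, reasons in needs_review(example_refs).items():
    print(f"{title}: {', '.join(reasons)}")
```

A pass like this only routes suspicious references to a reviewer. It cannot confirm that a cited paper exists or that its claims are sound, so a clean result is not a substitute for reading the source.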
Topics
- Bixonimania
- Large Language Models
- AI Misinformation
- Research Integrity
- AI-generated References
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, AI Ethicist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.