Mistral's Le Chat spreads Iran war disinformation in 60 percent of leading prompts

· Source: The Decoder · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Fundamental Awareness, quick

Summary

Mistral's Le Chat chatbot exhibited significant error rates when prompted with state-sponsored Iran war disinformation, according to an April 2026 NewsGuard audit. The audit found a 50 percent error rate in English and 56.6 percent in French when tested with ten false claims from Russian, Iranian, and Chinese sources. These claims included fabricated events like a typhus outbreak on the Charles de Gaulle carrier and a supposed Emirati drone attack on Oman. Error rates escalated dramatically based on prompt type: 10 percent for neutral queries, 60 percent for leading queries that presented false claims as fact, and 80 percent for malicious prompts designed to repackage disinformation for social media. Mistral did not provide a comment on these findings, despite a customized, offline version of Le Chat being used by the French Ministry of Defense.

Key takeaway

For AI/ML teams deploying public-facing chatbots, you must implement robust prompt engineering and content moderation strategies. Your models, even those from reputable vendors like Mistral, can easily be manipulated to spread disinformation, particularly when faced with leading or malicious queries. Prioritize rigorous testing across diverse adversarial prompt types to mitigate risks and ensure responsible AI deployment.

Key insights

AI chatbots like Le Chat are highly susceptible to disinformation, especially with leading or malicious prompts.

Principles

In practice

Topics

Best for: CTO, VP of Engineering/Data, Director of AI/ML, Tech Journalist, AI Ethicist, Policy Maker

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.