The Range Shrinks, the Threat Remains: Re-evaluating LLM Package Hallucinations on the 2026 Frontier-Model Cohort
Summary
A recent study re-evaluated package-name hallucinations in code-generating large language models, replicating Spracklen et al.'s 2025 methodology on five frontier LLMs released between October 2025 and March 2026: Claude Sonnet 4.6, Claude Haiku 4.5, GPT-5.4-mini, Gemini 2.5 Pro, and DeepSeek V3.2. Across 199,845 Python and JavaScript prompts validated against PyPI and npm, hallucination rates ranged from 4.62% (Claude Haiku 4.5) to 6.10% (GPT-5.4-mini). This is an order-of-magnitude compression in inter-model spread compared with previous findings, but the "slopsquatting" threat remains: an attacker can register a package name that models hallucinate and wait for developers to install it. Crucially, all five models hallucinated the same 127 package names; after coordinated disclosure, 53 of these (41 on PyPI, 12 on npm) are still registrable, leaving a model-agnostic supply-chain attack surface. The study also reports that hallucinations are more frequent for Python than for JavaScript, and that the overlap between models' hallucinated names peaks for DeepSeek V3.2 and GPT-5.4-mini (Jaccard similarity J = 0.343).
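The Jaccard similarity quoted above measures how much two models' sets of hallucinated names overlap: the size of the intersection divided by the size of the union. The following is a minimal sketch of that calculation; the package names and the two model sets are invented for illustration and are not taken from the study.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A & B| / |A | B|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical hallucinated-name sets, invented for illustration only.
model_a = {"fastjsonx", "pyauthkit", "torchlite-utils", "reqflow"}
model_b = {"fastjsonx", "pyauthkit", "node-csvstream"}

similarity = jaccard(model_a, model_b)
print(f"J = {similarity:.3f}")  # 2 shared names out of 5 distinct names
```

On this scale, J = 0.343 means that roughly a third of the names hallucinated by either of the two models were hallucinated by both.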
Key takeaway
AI Security Engineers evaluating supply-chain risk from code-generating LLMs should recognize that the slopsquatting threat persists even though hallucination rates now vary far less between models. Extend your focus beyond individual model vulnerabilities to the package names that diverse frontier models hallucinate in common. To mitigate the model-agnostic attack surface revealed by this research, proactively register these shared, non-existent package names or enforce internal package validation.
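One possible form of internal package validation is a pre-install gate that flags any dependency not on an approved list. The sketch below is an assumption about how such a gate might look, not something described in the study: the function name, the allowlist, and the sample requirements text are all hypothetical, and the parser handles only simple `name==version` style lines.

```python
import re

# Leading project name of a requirement line, e.g. "requests" in "requests==2.31.0".
_NAME = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)")


def unapproved_requirements(requirements_text: str, allowlist: set[str]) -> list[str]:
    """Return normalized names from requirements-style text that are not in the allowlist."""
    flagged = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as -r or -e
        match = _NAME.match(line)
        if match is None:
            continue
        # PEP 503 normalization: lowercase, runs of "-", "_", "." collapse to "-".
        name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
        if name not in allowlist:
            flagged.append(name)
    return flagged


# Hypothetical allowlist and requirements text, for illustration only.
approved = {"requests", "scikit-learn"}
reqs = """
# web stack
requests==2.31.0
Scikit_Learn>=1.4
fastjsonx  # suggested by a coding assistant
-r other.txt
"""

flagged = unapproved_requirements(reqs, approved)
print(flagged)
```

A gate like this fails closed: a name is rejected unless someone has reviewed and approved it, so a hallucinated name that an attacker later registers is still blocked.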
Key insights
LLM package hallucination rates have converged but still pose a significant, model-agnostic supply-chain security risk.
Principles
- LLM hallucination rates vary by model.
- Shared training data can lead to common hallucinations.
- Supply chain attacks exploit non-existent package names.
Method
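The study replicated Spracklen et al.'s 2025 methodology on 199,845 paired Python/JavaScript prompts. Package names in the generated code were validated against PyPI and npm master lists; names absent from the registries were counted as hallucinations, and those still unclaimed were treated as registrable attack surface.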
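The validation step can be sketched as a set-membership check against a registry snapshot. This is an illustrative sketch under stated assumptions, not the study's pipeline: the snapshot and generated names are invented, the rate is computed per generated name (the study may define it differently), and the normalization shown is the PEP 503 rule for PyPI only, so npm names would need their own handling.

```python
import re


def normalize_pypi(name: str) -> str:
    """Normalize a Python project name per PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def hallucination_rate(generated: list[str], registry: set[str]) -> tuple[float, set[str]]:
    """Return (fraction of generated names absent from the registry, the absent names)."""
    normalized = [normalize_pypi(n) for n in generated]
    missing = [n for n in normalized if n not in registry]
    rate = len(missing) / len(normalized) if normalized else 0.0
    return rate, set(missing)


# Hypothetical registry snapshot; a real master list is far larger.
pypi_snapshot = {"requests", "numpy", "scikit-learn", "python-dateutil"}

# Hypothetical package names extracted from model output.
generated_names = ["requests", "Scikit_Learn", "numpy", "fastjsonx", "python.dateutil"]

rate, unregistered = hallucination_rate(generated_names, pypi_snapshot)
print(f"{rate:.2%} unregistered: {sorted(unregistered)}")
```

Normalizing before the lookup matters: without it, spelling variants such as `Scikit_Learn` would be miscounted as hallucinations even though they resolve to a real project.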
In practice
- Identify common hallucinated package names.
- Coordinate disclosure with package registries.
- Monitor for slopsquatting attempts.
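The first and last steps above can be sketched as set operations: intersect each model's hallucinated names to find the shared ones, then subtract whatever has since been registered to get the names worth watching. All model labels and package names below are hypothetical placeholders, not data from the study.

```python
def shared_hallucinations(per_model: dict[str, set[str]]) -> set[str]:
    """Names hallucinated by every model in the mapping (empty if there are no models)."""
    if not per_model:
        return set()
    return set.intersection(*per_model.values())


# Hypothetical per-model hallucinated names, for illustration only.
per_model = {
    "model_a": {"fastjsonx", "pyauthkit", "reqflow"},
    "model_b": {"fastjsonx", "pyauthkit", "node-csvstream"},
    "model_c": {"fastjsonx", "pyauthkit", "torchlite-utils"},
}

# Hypothetical names that have been claimed on the registry since the scan.
registered_since_scan = {"pyauthkit"}

shared = shared_hallucinations(per_model)
still_registrable = shared - registered_since_scan

print("shared:", sorted(shared))
print("still registrable:", sorted(still_registrable))
```

Re-running the subtraction against a fresh registry snapshot gives a simple monitoring signal: a shared name that moves from registrable to registered without your involvement deserves a closer look.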
Topics
- LLM Hallucinations
- Software Supply Chain Security
- Slopsquatting
- Package Registries
- Code Generation Models
- PyPI and npm
Best for: CTO, Research Scientist, VP of Engineering/Data, AI Scientist, AI Security Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.