One Polluted Page Is Enough: Evaluating Web Content Pollution in Generative Recommenders

2026-06-12 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Cybersecurity & Data Privacy · Depth: Expert, extended

Summary

The FORGE (Fake Online Recommendations in Generative Environments) benchmark measures how often search-augmented large language models (LLMs) recommend fake products when the web content they retrieve has been polluted. It simulates pollution by locally rewriting real products in retrieved web pages into fake ones, covering 225 products, 15 categories, and 5 consumer scenarios. All 12 commercial and open-weights LLMs evaluated are vulnerable: a single polluted page yields fooled rates of up to 27%, and replacing the top-3 results raises this to 73.8%. Vulnerability correlates with a model's lack of prior product knowledge, and reasoning often makes matters worse by generating spurious social proof. The evaluated defenses carry costs of their own: skepticism prompting can backfire, increasing fooled rates by up to 44 percentage points (pp) on Gemini 3.1 Pro, while consensus filtering risks suppressing 52% to 79% of legitimate recommendations.

Key takeaway

Machine learning engineers deploying search-augmented LLM recommenders should treat current models as highly susceptible to web content pollution, even when only a single retrieved page is polluted. Skepticism prompts and post-hoc consensus filtering are insufficient on their own and can increase vulnerability. Prioritize retrieval-time defenses instead: source-credibility weighting, evidence diversification, and strong cross-document corroboration.

Key insights

Generative recommenders are highly vulnerable to web content pollution, even from a single polluted page, and the defenses evaluated are largely ineffective.

Principles

LLM vulnerability increases when the model's prior knowledge of a product is thin.
Reasoning can amplify, not mitigate, fake content promotion.
Skepticism prompts can backfire on closed-source models.

Method

FORGE locally rewrites real product mentions in retrieved web pages into fake ones, then measures how often LLMs recommend the fake product.
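The measurement loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's actual code: the `Trial` record, the `pollute_page`, `pollute_top_k`, and `fooled_rate` helpers, and the plain string substitution used as the "rewrite" are all assumptions made for the example, and the real FORGE rewriting step may be considerably more sophisticated.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trial:
    """One evaluation case (hypothetical structure, not from the paper)."""
    query: str
    pages: List[str]   # retrieved pages, in rank order
    real_name: str     # real product mentioned in the pages
    fake_name: str     # fabricated product that replaces it


def pollute_page(page_text: str, real_name: str, fake_name: str) -> str:
    """Replace every mention of the real product with the fake one.

    Assumption: a case-insensitive literal substitution stands in for
    the benchmark's local rewrite.
    """
    pattern = re.compile(re.escape(real_name), flags=re.IGNORECASE)
    # A function replacement avoids backslash interpretation in fake_name.
    return pattern.sub(lambda _match: fake_name, page_text)


def pollute_top_k(pages: List[str], real_name: str, fake_name: str, k: int) -> List[str]:
    """Pollute only the first k ranked pages; leave the rest untouched."""
    return [
        pollute_page(page, real_name, fake_name) if rank < k else page
        for rank, page in enumerate(pages)
    ]


def fooled_rate(
    recommender: Callable[[str, List[str]], str],
    trials: List[Trial],
    k: int,
) -> float:
    """Fraction of trials in which the recommender's answer names the fake product."""
    if not trials:
        return 0.0
    fooled = 0
    for trial in trials:
        pages = pollute_top_k(trial.pages, trial.real_name, trial.fake_name, k)
        answer = recommender(trial.query, pages)
        if trial.fake_name.lower() in answer.lower():
            fooled += 1
    return fooled / len(trials)
```

Here `recommender` is any callable that takes a query and the retrieved pages and returns the model's answer text; in practice it would wrap a call to the LLM under test. Setting `k=1` corresponds to the single-polluted-page condition and `k=3` to the top-3 condition.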

In practice

Prioritize retrieval-time source-credibility weighting.
Diversify evidence sources for recommenders.
Employ cross-document corroboration for validation.
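One way the three practices above could fit together at retrieval time is sketched below. It is a rough illustration under stated assumptions: the `Doc` record, the per-domain `credibility` score in [0, 1], the `diversify` and `corroborated_products` helpers, and the threshold values are all hypothetical, and product mentions are assumed to have been extracted from each page already.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Doc:
    """A retrieved page (hypothetical structure for illustration)."""
    domain: str
    credibility: float   # assumed source-credibility score in [0, 1]
    products: Set[str]   # product names mentioned on the page


def diversify(docs: List[Doc], max_per_domain: int = 1) -> List[Doc]:
    """Evidence diversification: cap how many pages any one domain contributes."""
    kept: List[Doc] = []
    seen: Dict[str, int] = defaultdict(int)
    for doc in docs:
        if seen[doc.domain] < max_per_domain:
            kept.append(doc)
            seen[doc.domain] += 1
    return kept


def corroborated_products(
    docs: List[Doc],
    min_domains: int = 2,
    min_weight: float = 1.0,
) -> Dict[str, float]:
    """Keep products backed by enough independent, credible sources.

    A product survives only if it is mentioned by at least `min_domains`
    distinct domains and its summed credibility reaches `min_weight`.
    """
    weight: Dict[str, float] = defaultdict(float)
    domains: Dict[str, Set[str]] = defaultdict(set)
    for doc in diversify(docs):
        for product in doc.products:
            weight[product] += doc.credibility       # source-credibility weighting
            domains[product].add(doc.domain)         # cross-document corroboration
    return {
        product: total
        for product, total in weight.items()
        if len(domains[product]) >= min_domains and total >= min_weight
    }
```

The thresholds are a trade-off rather than a fix: the summary above notes that consensus filtering risks suppressing 52% to 79% of legitimate recommendations, so strict corroboration requirements like `min_domains` should be tuned against how many genuine products they filter out.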

Topics

Generative Recommenders
Web Content Pollution
LLM Vulnerability
FORGE Benchmark
Retrieval-Augmented Generation
Prompt Engineering
Fake Product Promotion

Code references

leoluolol/forge-benchmark

Best for: AI Architect, Research Scientist, CTO, AI Scientist, AI Security Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.