No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions
Summary
A new study introduces "adversarial repackaging," a closed-loop attack showing that AI peer review systems can be manipulated through presentation-only revisions that leave the scientific evidence, methods, figures, and numerical results untouched. The attack rewrites only elements such as the abstract, contribution framing, related work, and discussion. Across three mainstream AI reviewers, adversarial repackaging achieved a 75.1% attack success rate and a mean score gain of +1.21/10. The study found that strategies such as related-work repositioning and analytical discussion expansion were significantly more effective than superficial edits. The analysis identified two structural failure modes: AI reviewers are more easily impressed than convinced, and they often mistake the appearance of addressing a limitation for its actual resolution. Paper presentation itself therefore becomes an optimization surface, a deployment risk that goes beyond malicious hidden instructions. The authors release a contamination-free rolling benchmark and the attack framework.
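To make the two headline metrics concrete, here is a minimal sketch of how an attack success rate and a mean score gain could be computed from paired review scores. The summary does not define "success," so this sketch assumes a paper counts as a success when its score after repackaging is strictly higher than before. The function name `attack_metrics` and the scores are made up for illustration and are not taken from the study.

```python
from statistics import mean


def attack_metrics(before, after):
    """Return (attack success rate, mean score gain) for paired review scores.

    Assumption: an attack "succeeds" on a paper when the reviewer's score
    after repackaging is strictly higher than the score before it.
    """
    if len(before) != len(after) or not before:
        raise ValueError("need two non-empty score lists of equal length")
    gains = [a - b for b, a in zip(before, after)]
    success_rate = sum(g > 0 for g in gains) / len(gains)
    return success_rate, mean(gains)


# Hypothetical scores on a 10-point scale, for illustration only.
original_scores = [5.0, 6.0, 4.5, 7.0]
repackaged_scores = [6.5, 6.0, 6.0, 8.0]

asr, gain = attack_metrics(original_scores, repackaged_scores)
print(f"attack success rate: {asr:.1%}, mean gain: {gain:+.2f}/10")
```

Under this definition, a paper whose score stays flat counts as a failed attack but still contributes a zero to the mean gain, so the two numbers can move independently.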
Key takeaway
If you submit papers to AI peer review systems, be aware that presentation-level revisions can significantly change review scores even when the scientific content stays the same. Strategic contribution framing and expanded analytical discussion move scores more than simple prose polishing does. If you develop AI review systems, prioritize anchoring models to scientific evidence so that presentation does not become an exploitable optimization surface.
Key insights
AI peer review is vulnerable to presentation-only manipulation because reviewers confuse the appearance of merit with scientific merit itself.
Principles
- AI reviewers prioritize impression over conviction.
- Highlighting strengths boosts perceived merit.
- Presentation is an AI review optimization surface.
Method
Adversarial repackaging is a closed-loop attack using AI-reviewer feedback to search for presentation-level revisions while keeping scientific evidence fixed.
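The loop described above can be sketched as a simple greedy search. This is an illustration under stated assumptions, not the study's implementation: `review` and `revise` are hypothetical callables standing in for an AI reviewer and an LLM rewriter, and "scientific evidence fixed" is approximated crudely by requiring that the set of numeric tokens in the text does not change.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def evidence_fingerprint(text):
    """Crude stand-in for 'scientific evidence': the sorted numeric tokens."""
    return tuple(sorted(NUMBER.findall(text)))


def repackage(paper, review, revise, rounds=3):
    """Greedy closed loop: revise using reviewer feedback, keep improvements.

    review(text) -> (score, feedback) and revise(text, feedback) -> text
    are hypothetical interfaces assumed for this sketch.
    """
    fingerprint = evidence_fingerprint(paper)
    best = paper
    best_score, feedback = review(paper)
    for _ in range(rounds):
        candidate = revise(best, feedback)
        # Reject any revision that touches the reported numbers.
        if evidence_fingerprint(candidate) != fingerprint:
            continue
        score, new_feedback = review(candidate)
        if score > best_score:
            best, best_score, feedback = candidate, score, new_feedback
    return best, best_score


def toy_review(text):
    """Toy reviewer that rewards framing and discussion cues, not evidence."""
    score, feedback = 5.0, []
    if "Unlike prior work" in text:
        score += 1.0
    else:
        feedback.append("positioning")
    if "We analyze" in text:
        score += 1.0
    else:
        feedback.append("analysis")
    return score, feedback


def toy_revise(text, feedback):
    """Toy rewriter that addresses the first complaint with added prose."""
    if "positioning" in feedback:
        return text + " Unlike prior work, we target this setting directly."
    if "analysis" in feedback:
        return text + " We analyze why the gain holds."
    return text


paper = "Our method reaches 71.3 accuracy on 2 datasets."
final_paper, final_score = repackage(paper, toy_review, toy_revise)
print(final_score)
print(final_paper)
```

The toy reviewer is deliberately naive: its score rises by two points although no reported number changes, which mirrors the failure mode the study describes. A real evidence check would need to cover methods, figures, and tables, not just numeric tokens.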
In practice
- Reposition related work for impact.
- Expand analytical discussion sections.
- Do not rely on surface-level prose polishing alone.
Topics
- AI Peer Review
- Adversarial Repackaging
- Presentation Bias
- Research Integrity
- AI Robustness
- Attack Framework
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, Research Scientist, AI Security Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.