Prompt Repetition: Google's Surprisingly Strong Baseline for Non-Reasoning LLMs
Summary
A Google Research paper reports that simply repeating a prompt twice can substantially improve Large Language Model (LLM) performance when explicit reasoning is not used; on one task, accuracy jumped from 21.33% to 97.33%. Tested across 7 models (including Gemini, GPT-4o, Claude, and DeepSeek) and 7 benchmarks, the technique won 47 of 70 comparisons with zero statistically significant losses. The improvement stems from LLMs being causal language models that process tokens left-to-right, so early tokens cannot "see" later context. With the prompt repeated, the model processes it once and then re-reads it with the whole first copy already in context, effectively getting a "clean second read." Unlike explicit reasoning chains, the method does not increase output length or meaningfully increase latency. It is particularly effective for tasks requiring structured information or when important details appear late in the prompt.
Key takeaway
NLP engineers building structured extraction or multiple-choice systems should experiment with prompt repetition. Simply appending a copy of the original prompt to itself can yield substantial accuracy gains, especially when important details appear late in the prompt or explicit reasoning is not used. This low-cost technique does not increase output length or meaningfully increase latency, and it offers a practical way to improve LLM performance without complex prompt engineering.
Key insights
Repeating prompts can dramatically improve LLM accuracy by providing a "clean second read" with full context.
Principles
- LLMs process tokens left-to-right.
- Early tokens lack full prompt context.
- Repeating the prompt lets every token in the second copy attend to the full prompt.
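The principles above can be illustrated with a small sketch of causal visibility. It is only a toy model of the idea, not the paper's analysis: whitespace-split words stand in for tokens, and the helper name visible_prompt_tokens and the sample prompt are made up for this illustration.

```python
def visible_prompt_tokens(prompt_len: int, position: int) -> set:
    """Return the prompt-token indices visible from `position` under a causal mask.

    Assumes the sequence is the prompt repeated back to back, so the token at
    sequence position p is prompt token p % prompt_len. A causal model lets
    position p attend only to positions 0..p.
    """
    if prompt_len < 1 or position < 0:
        raise ValueError("prompt_len must be >= 1 and position must be >= 0")
    return {p % prompt_len for p in range(position + 1)}


# Toy prompt with the answer options placed before the question.
prompt = "Options: A) Paris B) Rome. Question: capital of Italy?".split()
n = len(prompt)

# In a single pass, the first token sees only itself.
first_pass_view = visible_prompt_tokens(n, 0)

# The same token in the second copy (position n) sees the whole prompt.
second_pass_view = visible_prompt_tokens(n, n)

print(len(first_pass_view), "of", n, "prompt tokens visible on the first read")
print(len(second_pass_view), "of", n, "prompt tokens visible on the second read")
```

In this toy view, the options at the start of the prompt are first read without the question in context; in the second copy, every token is read with the full prompt already available.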
Method
Append a copy of the entire prompt to the original, separated by a newline, so the LLM re-reads the instructions with complete context before generating a response.
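A minimal sketch of this method in Python follows. The function name repeat_prompt and its separator and copies parameters are assumptions for illustration; the summary only specifies two copies joined by a newline.

```python
def repeat_prompt(prompt: str, separator: str = "\n", copies: int = 2) -> str:
    """Return the prompt repeated `copies` times, joined by `separator`.

    The defaults (two copies, newline separator) follow the method as
    summarized here; other values are untested assumptions.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return separator.join([prompt] * copies)


# Hypothetical extraction prompt, used only to show the transformation.
original = "Extract every date mentioned in the text below.\nText: Invoice sent 3 May, paid 9 May."
repeated = repeat_prompt(original)
```

The repeated string is sent to the model in place of the original prompt; nothing else about the request changes.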
In practice
- Use for structured extraction or list retrieval.
- Apply when answer choices precede questions.
- Beneficial when chain-of-thought is disabled.
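Whether repetition helps on a given task is an empirical question, so a quick A/B check on your own examples is worth running. The harness below is a sketch under stated assumptions: the model argument is any callable that maps a prompt string to an answer string (a placeholder for whatever client you use), scoring is exact match, and the function names are hypothetical.

```python
from typing import Callable, Iterable, Tuple


def accuracy(
    model: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
    transform: Callable[[str], str],
) -> float:
    """Exact-match accuracy of `model` on (prompt, expected) pairs after `transform`."""
    pairs = list(examples)
    if not pairs:
        raise ValueError("examples must not be empty")
    correct = sum(
        model(transform(prompt)).strip() == expected for prompt, expected in pairs
    )
    return correct / len(pairs)


def compare_repetition(
    model: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> Tuple[float, float]:
    """Return (baseline accuracy, accuracy with the prompt repeated twice)."""
    pairs = list(examples)
    baseline = accuracy(model, pairs, lambda p: p)
    repeated = accuracy(model, pairs, lambda p: p + "\n" + p)
    return baseline, repeated
```

With a real model, keep chain-of-thought disabled in both arms so the comparison matches the setting described above, and use enough examples that a difference is not just noise.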
Topics
- Prompt Engineering
- Large Language Models
- Causal Language Models
- Model Performance
- Structured Data Extraction
Best for: NLP Engineer, AI Scientist, Research Scientist, AI Engineer, Machine Learning Engineer, Prompt Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by To Data & Beyond.