Prompt Repetition: Google's Surprisingly Strong Baseline for Non-Reasoning LLMs

2024-06-18 · Source: To Data & Beyond · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Intermediate, short

Summary

A Google Research paper reports that repeating a prompt twice can significantly improve Large Language Model (LLM) performance when the model is not using explicit reasoning; on one task, accuracy jumped from 21.33% to 97.33%. Tested across 7 models (including Gemini, GPT-4o, Claude, and DeepSeek) and 7 benchmarks, the technique won 47 of 70 comparisons with zero statistically significant losses. The improvement stems from LLMs being causal language models that process tokens left to right, so early tokens cannot "see" later context. With the prompt repeated, every token in the second copy can attend to the entire first copy, which effectively gives the model a "clean second read." Unlike explicit reasoning chains, the method does not increase output length or meaningfully increase latency, and it is particularly effective for tasks that require structured information or where important details appear late in the prompt.

Key takeaway

NLP engineers building structured extraction or multiple-choice systems should experiment with prompt repetition. Simply appending the original prompt to itself can yield substantial accuracy gains, especially when important details come late in the prompt or explicit reasoning is not used. Because it adds no output length and no meaningful latency, this low-cost technique offers a practical way to improve LLM performance without complex prompt engineering.

Key insights

Repeating prompts can dramatically improve LLM accuracy by providing a "clean second read" with full context.

Principles

LLMs process tokens left-to-right.
Early tokens lack full prompt context.
Repeating the prompt lets every token in the second copy draw on the full first copy.
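
The principles above can be illustrated with a toy causal mask. This is a minimal sketch, not code from the paper: the helper visible_positions and the three-token prompt are hypothetical, and real models apply the same left-to-right rule over thousands of tokens.

```python
def visible_positions(num_tokens: int) -> list[list[int]]:
    """For each position i, list the positions it may attend to under a causal mask."""
    return [list(range(i + 1)) for i in range(num_tokens)]


# Toy tokens standing in for a real prompt (hypothetical example data).
prompt = ["choices", "question", "instruction"]
repeated = prompt + prompt  # the prompt followed by a second copy of itself

single = visible_positions(len(prompt))
double = visible_positions(len(repeated))

# In the single prompt, the first token can attend only to itself.
first_token_view = single[0]

# In the repeated prompt, the first token of the second copy sits at
# index len(prompt) and can attend to every token of the first copy.
second_copy_view = double[len(prompt)]
sees_whole_prompt = set(range(len(prompt))) <= set(second_copy_view)
```

Here first_token_view contains a single index, while sees_whole_prompt is true: the second copy is read with the whole prompt already in view.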

Method

Concatenate the entire prompt with a copy of itself, separated by a newline, so the LLM re-reads the instructions with complete context before generating a response.
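
As a minimal sketch of this method, the helper below joins copies of a prompt with a newline. The function name repeat_prompt and its times and separator parameters are assumptions made for illustration; the summary only describes duplicating the prompt once.

```python
def repeat_prompt(prompt: str, times: int = 2, separator: str = "\n") -> str:
    """Return the prompt concatenated with itself, copies joined by `separator`.

    `times` and `separator` are illustrative knobs; the described method
    corresponds to the defaults (two copies, newline between them).
    """
    if times < 1:
        raise ValueError("times must be at least 1")
    return separator.join([prompt] * times)


original = "Extract every email address from the text below.\nText: ..."
doubled = repeat_prompt(original)
```

The resulting doubled string is sent to the model in place of original; nothing else about the request changes.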

In practice

Use for structured extraction or list retrieval.
Apply when answer choices precede the question.
Beneficial when chain-of-thought is disabled.
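
One way to try this on your own data is a small A/B comparison on options-first multiple-choice prompts. The sketch below is an assumption-laden illustration, not the paper's evaluation code: ask is a hypothetical callable that wraps whatever model client you use and returns the model's answer as a string, and the prompt layout is just one plausible format.

```python
from typing import Callable

LABELS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def options_first_prompt(question: str, choices: list[str]) -> str:
    """Lay out a multiple-choice prompt with the answer choices before the question."""
    lines = [f"{LABELS[i]}. {choice}" for i, choice in enumerate(choices)]
    lines.append(f"Question: {question}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def accuracy(ask: Callable[[str], str], prompts: list[str], answers: list[str]) -> float:
    """Fraction of prompts for which `ask` returns exactly the expected letter."""
    if not prompts:
        raise ValueError("need at least one prompt")
    correct = sum(ask(p).strip() == a for p, a in zip(prompts, answers))
    return correct / len(prompts)


def compare(
    ask: Callable[[str], str],
    items: list[tuple[str, list[str], str]],
) -> tuple[float, float]:
    """Return (baseline accuracy, repeated-prompt accuracy) over the same items.

    Each item is (question, choices, expected letter). `ask` is a
    hypothetical stand-in for a call to your model.
    """
    base = [options_first_prompt(q, c) for q, c, _ in items]
    doubled = [p + "\n" + p for p in base]  # prompt repeated once, newline-separated
    gold = [a for _, _, a in items]
    return accuracy(ask, base, gold), accuracy(ask, doubled, gold)
```

Running compare with the same ask function and the same items keeps everything fixed except the repetition, so any accuracy difference can be attributed to it, subject to the usual noise on small samples.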

Topics

Prompt Engineering
Large Language Models
Causal Language Models
Model Performance
Structured Data Extraction

Best for: NLP Engineer, AI Scientist, Research Scientist, AI Engineer, Machine Learning Engineer, Prompt Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by To Data & Beyond.