Recent LLMs can use filler tokens or problem repeats to improve (no-CoT) math performance
Summary
Recent large language models (LLMs) such as Opus 4.5 perform better on math problems when given "filler tokens" or repeated copies of the problem statement before answering, a capability that models released before 2024 did not show without specialized training. For instance, filler tokens raised Opus 4.5's no-Chain-of-Thought (no-CoT) accuracy on a dataset of competition math problems from 45% to 51% (p=4e-7). Repeating the problem statement generally yields similar or slightly better results, especially for weaker models like Qwen3 235B A22B. The boost first appears with Opus 3 and scales smoothly with the number of repeats or filler tokens up to a point, after which performance may degrade. The effect is particularly pronounced for arithmetic-heavy problems and suggests a basic form of meta-cognition without explicit CoT, which may indicate sophisticated internal states in frontier LLMs.
Key takeaway
For AI engineers optimizing LLM performance on mathematical tasks: consider adding filler tokens or repeating the problem statement in your prompts, especially for models like Opus 4.5. This simple technique can improve no-CoT accuracy (by about 6 percentage points for Opus 4.5 in the reported experiment) without generating a Chain-of-Thought, which may reduce inference latency for some problem types. Experiment to find the best number of repeats or filler tokens, since too many can degrade performance.
Key insights
Recent LLMs can use filler tokens or problem repeats to boost math performance without explicit Chain-of-Thought.
Principles
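- Performance scales smoothly with the number of filler tokens or repeats, up to a point; beyond that it may degrade.
- Repeats often outperform filler tokens, especially for weaker models.
- Only sufficiently capable models benefit: Opus 3 is the first model with a measurable uplift.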
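One way to check the scaling behavior described above on your own model is a simple sweep over the number of repeats or filler tokens. The sketch below is illustrative only: `answer_fn` is a hypothetical callable that you would implement around your own model client (it takes a problem and a count N and returns the model's immediate answer as a string), and exact string matching is an assumed grading rule, not the original study's.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# (problem statement, gold answer) pairs
Dataset = List[Tuple[str, str]]
# Hypothetical signature: (problem, n) -> model's immediate answer as text
AnswerFn = Callable[[str, int], str]


def accuracy(answer_fn: AnswerFn, dataset: Dataset, n: int) -> float:
    """Fraction of problems answered correctly with n repeats/filler tokens."""
    correct = sum(
        1 for problem, gold in dataset
        if answer_fn(problem, n).strip() == gold
    )
    return correct / len(dataset)


def sweep(answer_fn: AnswerFn, dataset: Dataset, counts: Iterable[int]) -> Dict[int, float]:
    """Measure accuracy at each candidate count."""
    return {n: accuracy(answer_fn, dataset, n) for n in counts}


def best_count(results: Dict[int, float]) -> int:
    """Smallest count that reaches the highest observed accuracy."""
    top = max(results.values())
    return min(n for n, acc in results.items() if acc == top)
```

Picking the smallest count that reaches the top accuracy keeps prompts short, which matters because very large counts can hurt performance. On a small dataset the differences between counts may be noise, so treat the result as a starting point.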
Method
Few-shot prompting with 10 shots. Each prompt either repeats the problem statement N times or inserts N filler tokens (e.g., "1 2 ... N") before the model's answer; the model must then give its numerical answer immediately, without CoT.
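The prompt construction described above could be sketched roughly as follows. This is a minimal illustration, not the original study's code: the function names and the "Problem:", "Filler:", and "Answer:" labels are assumptions chosen for readability, and the same repeat/filler treatment is applied to the few-shot examples as to the target problem.

```python
from typing import List, Tuple


def counting_filler(n: int) -> str:
    """Return the filler string "1 2 ... n" (empty when n <= 0)."""
    return " ".join(str(i) for i in range(1, n + 1))


def format_problem(problem: str, repeats: int = 1) -> str:
    """Repeat the problem statement `repeats` times, one copy per line."""
    return "\n".join([problem] * max(1, repeats))


def build_prompt(
    shots: List[Tuple[str, str]],
    problem: str,
    repeats: int = 1,
    filler: int = 0,
) -> str:
    """Assemble a few-shot, no-CoT prompt that ends right before the answer."""

    def block(question: str, answer: str) -> str:
        lines = ["Problem: " + format_problem(question, repeats)]
        if filler > 0:
            lines.append("Filler: " + counting_filler(filler))
        lines.append("Answer: " + answer)
        return "\n".join(lines)

    # Solved examples first, then the target problem with an empty answer slot
    parts = [block(q, a) for q, a in shots]
    parts.append(block(problem, "").rstrip())
    return "\n\n".join(parts)
```

For example, `build_prompt(shots, "What is 17 * 23?", filler=100)` would produce a prompt whose last line is `Answer:`, so the model's next tokens are the answer itself. In a real setup you would pass 10 solved examples as `shots` and cap the output length so the model cannot reason before answering.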
In practice
- Insert filler tokens for math problems.
- Repeat problem statements for performance gains.
- Test different filler types (e.g., "...", Lorem Ipsum).
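To compare the filler types listed above, it helps to generate each one at a controlled length. The helper below is a hypothetical sketch: the `kind` names and the short Lorem Ipsum word list are illustrative choices, and `n` counts whitespace-separated units rather than tokenizer tokens, so the true token count depends on the model's tokenizer.

```python
# Short illustrative word list; cycled when more words are requested
LOREM = (
    "lorem ipsum dolor sit amet consectetur adipiscing elit "
    "sed do eiusmod tempor incididunt ut labore"
).split()


def make_filler(kind: str, n: int) -> str:
    """Return n whitespace-separated filler units of the requested kind."""
    if n <= 0:
        return ""
    if kind == "count":
        # Counting filler: "1 2 ... n"
        return " ".join(str(i) for i in range(1, n + 1))
    if kind == "dots":
        # Ellipsis filler: "... ... ..."
        return " ".join(["..."] * n)
    if kind == "lorem":
        # Lorem Ipsum filler, cycling through the word list
        return " ".join(LOREM[i % len(LOREM)] for i in range(n))
    raise ValueError(f"unknown filler kind: {kind!r}")
```

Holding `n` fixed while varying `kind` isolates the effect of the filler's content from the effect of its length.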
Topics
- LLM Math Performance
- Meta-cognition
- Prompt Engineering
- Mathematical Reasoning
- Anthropic Models
Best for: AI Engineer, NLP Engineer, Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Redwood Research blog.