Recent LLMs can use filler tokens or problem repeats to improve (no-CoT) math performance

2024-06-17 · Source: Redwood Research blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, long

Summary

Recent large language models (LLMs), such as Opus 4.5, perform better on math problems when given "filler tokens" or repeated problem statements before answering, a capability that models released before 2024 did not show without specialized training. For instance, filler tokens raised Opus 4.5's no-Chain-of-Thought (no-CoT) accuracy on a dataset of competition math problems from 45% to 51% (p=4e-7). Repeating the problem statement generally yields similar or slightly better results, and the advantage is larger for weaker models such as Qwen3 235B A22B. The boost, first observed with Opus 3, grows smoothly with the number of repeats or filler tokens up to a point, after which performance may degrade. The effect is particularly pronounced for arithmetic-heavy problems. The results suggest a basic form of meta-cognition without explicit CoT and point to potentially sophisticated internal states in frontier LLMs.
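The p-value quoted in the summary compares the same problems with and without filler tokens. This summary does not say which statistical test produced it. As an illustration only, one common choice for paired right/wrong outcomes is an exact McNemar (sign) test, sketched below; the function name and the choice of test are assumptions, not the original article's method.

```python
from math import comb
from typing import Sequence


def exact_mcnemar(baseline: Sequence[bool], treated: Sequence[bool]) -> float:
    """Two-sided exact McNemar p-value for paired correct/incorrect outcomes.

    baseline[i] and treated[i] say whether problem i was answered correctly
    without and with filler (or repeats). Only problems whose outcome changed
    carry information, so the test is a binomial test on those pairs.
    """
    if len(baseline) != len(treated):
        raise ValueError("both conditions must cover the same problems")
    gained = sum(1 for b, t in zip(baseline, treated) if t and not b)
    lost = sum(1 for b, t in zip(baseline, treated) if b and not t)
    n = gained + lost
    if n == 0:
        return 1.0  # no problem changed outcome, so there is no evidence either way
    k = min(gained, lost)
    # Probability of a split at least this lopsided under a fair coin, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Pairing matters here: a six-point accuracy gain measured on the same problems can be far more significant than the same gain measured on two independent samples, because most problems are answered identically in both conditions.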

Key takeaway

AI engineers optimizing LLM performance on mathematical tasks should consider adding filler tokens or repeating the problem statement in their prompts, especially for models like Opus 4.5. This simple technique can measurably improve accuracy when the model must answer immediately, without Chain-of-Thought reasoning, which may reduce inference latency relative to generating a full CoT for certain problem types. Tune the number of repeats or filler tokens empirically, since excessive amounts can degrade performance.

Key insights

Recent LLMs can use filler tokens or problem repeats to boost math performance without explicit Chain-of-Thought.

Principles

Performance scales with the number of filler or repeat tokens, up to a point.
Repeats often outperform filler for weaker models.
Only sufficiently capable models benefit: the effect is absent in pre-2024 models and first appears with Opus 3.
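Because accuracy rises and then may fall as the count grows, the count is best chosen by a sweep rather than by picking the largest value. The sketch below assumes a hypothetical `evaluate(n)` callable that runs your own benchmark with `n` repeats or filler tokens and returns accuracy; neither the names nor the tie-breaking rule come from the original article.

```python
from typing import Callable, Dict, Iterable, Tuple


def sweep(evaluate: Callable[[int], float], counts: Iterable[int]) -> Dict[int, float]:
    """Run the (hypothetical) benchmark once per candidate count."""
    return {n: evaluate(n) for n in counts}


def best_count(results: Dict[int, float]) -> Tuple[int, float]:
    """Return (count, accuracy) with the highest accuracy.

    Ties resolve to the smallest count, since fewer tokens cost less.
    """
    if not results:
        raise ValueError("results must not be empty")
    return min(results.items(), key=lambda item: (-item[1], item[0]))
```

Including 0 among the candidate counts keeps the no-filler baseline in the same table, which makes it easy to see whether the best setting beats it at all.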

Method

The evaluation uses few-shot prompting with 10 shots. Each problem statement is either repeated N times or followed by N filler tokens (e.g., "1 2 ... N"), after which the model must give its numerical answer immediately.
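A minimal sketch of how such a prompt could be assembled follows. The "Problem:", "Filler:", and "Answer:" labels, the "---" separator, and the decision to apply the same repeat or filler treatment to every few-shot example are all assumptions made for illustration; the exact template is in the rgreenblatt/no_cot_math_public repository and may differ.

```python
from typing import List, Tuple


def filler_sequence(n: int) -> str:
    """Counting filler of the form "1 2 ... n"; empty when n <= 0."""
    return " ".join(str(i) for i in range(1, n + 1))


def format_item(problem: str, mode: str, n: int, answer: str = "") -> str:
    """Format one problem so the answer follows with no room for reasoning.

    mode="repeat": the problem statement appears n times in total (at least once).
    mode="filler": the problem appears once, followed by n filler tokens.
    """
    if mode == "repeat":
        body = "\n\n".join(f"Problem: {problem}" for _ in range(max(n, 1)))
    elif mode == "filler":
        body = f"Problem: {problem}"
        if n > 0:
            body += f"\nFiller: {filler_sequence(n)}"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # The final item has no answer; strip so the prompt ends at "Answer:".
    return f"{body}\nAnswer: {answer}".rstrip()


def build_prompt(shots: List[Tuple[str, str]], problem: str, mode: str, n: int) -> str:
    """Join solved examples and the target problem into one few-shot prompt."""
    blocks = [format_item(q, mode, n, a) for q, a in shots]
    blocks.append(format_item(problem, mode, n))
    return "\n\n---\n\n".join(blocks)
```

For example, `build_prompt(shots, "What is 17 * 23?", "filler", 50)` with ten `(question, answer)` pairs in `shots` yields a prompt that ends in `Answer:`, so the model's first generated tokens are the answer itself.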

In practice

Insert filler tokens for math problems.
Repeat problem statements for performance gains.
Test different filler types (e.g., "...", Lorem Ipsum).
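To compare filler types under otherwise identical conditions, it helps to generate each variant at the same length. The helper below is a hypothetical sketch: the `kind` names are made up, and "length" here counts whitespace-separated units, which will not match tokenizer tokens exactly.

```python
from itertools import cycle, islice

# Opening words of the standard Lorem Ipsum placeholder text.
LOREM_WORDS = "lorem ipsum dolor sit amet consectetur adipiscing elit".split()


def make_filler(kind: str, n: int) -> str:
    """Return n whitespace-separated filler units of the requested kind.

    kind="count": "1 2 ... n"
    kind="dots":  "..." repeated n times
    kind="lorem": Lorem Ipsum words, cycled as needed
    """
    if n <= 0:
        return ""
    if kind == "count":
        units = [str(i) for i in range(1, n + 1)]
    elif kind == "dots":
        units = ["..."] * n
    elif kind == "lorem":
        units = list(islice(cycle(LOREM_WORDS), n))
    else:
        raise ValueError(f"unknown filler kind: {kind!r}")
    return " ".join(units)
```

Holding the unit count fixed while varying only `kind` isolates the effect of filler content from the effect of filler length.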

Topics

LLM Math Performance
Meta-cognition
Prompt Engineering
Mathematical Reasoning
Anthropic Models

Code references

rgreenblatt/no_cot_math_public

Best for: AI Engineer, NLP Engineer, Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Redwood Research blog.