When to Think Deeply: Inhibitory Deliberation for LLM Reasoning
Summary
The IDPR (Inhibitory Deliberative Problem Reasoning) framework aims to make Large Language Model (LLM) reasoning cheaper by invoking computationally expensive "slow reasoning" only when it is needed. Unlike traditional routers that see only the input, IDPR first generates a concise "fast answer" and then uses a response-conditioned inhibition controller to decide whether to release that answer or suppress it in favor of deeper deliberation. The controller bases its decision on the fast answer itself and on "fast-side evidence" such as confidence, logit margin, parseability, and generation cost. On a 5,000-example mathematical reasoning test set, IDPR invoked slow reasoning on only 8.20% of examples and raised accuracy from 47.90% to 48.92%. Under the same slow-call budget, random routing reached 46.76% accuracy and a confidence-based baseline reached 48.22%, so IDPR led them by 2.16 and 0.70 percentage points respectively. This suggests the controller is better at identifying the fast answers that benefit most from slow reasoning.
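To put the reported percentages in concrete terms, the short calculation below converts them into example counts. It is only a back-of-the-envelope sketch: the counts are derived here, not reported in the source, and they assume every percentage is measured over the same 5,000-example test set.

```python
# Figures reported in the summary; the count conversions below are derived,
# not reported, and assume all rates refer to the same 5,000 examples.
N_EXAMPLES = 5000
SLOW_CALL_RATE = 0.0820   # share of examples sent to slow reasoning
FAST_ONLY_ACC = 0.4790    # accuracy before selective slow reasoning
IDPR_ACC = 0.4892         # accuracy with IDPR routing

slow_calls = round(N_EXAMPLES * SLOW_CALL_RATE)
fast_correct = round(N_EXAMPLES * FAST_ONLY_ACC)
idpr_correct = round(N_EXAMPLES * IDPR_ACC)

# Net additional correct answers, and how many of them each slow call "buys".
net_gain = idpr_correct - fast_correct
gain_per_slow_call = net_gain / slow_calls

print(slow_calls, net_gain, round(gain_per_slow_call, 3))  # prints: 410 51 0.124
```

Under these assumptions, roughly 410 slow calls yield a net 51 additional correct answers, or about one extra correct answer per eight slow calls.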
Key takeaway
AI architects designing cost-aware LLM systems should consider implementing response-conditioned inhibitory deliberation. This approach can raise accuracy on complex reasoning tasks, such as mathematical problems, by invoking expensive slow reasoning only when a fast answer is predicted to be unreliable. Calibrate the inhibition threshold to balance accuracy gains against increased token costs, especially for harder problem types.
Key insights
LLMs can selectively invoke costly slow reasoning by inhibiting fast answers based on response-conditioned evidence.
Principles
- Routing decisions should be response-conditioned.
- Control should be recruited selectively.
- Routers should estimate the quality gain of slow over fast reasoning.
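The third principle can be illustrated with a small sketch of what a "slow-over-fast quality gain" target might look like. The summary does not say how the paper defines or estimates this gain, so `PairedOutcome`, `quality_gain`, and the correctness-difference definition are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PairedOutcome:
    """Hypothetical record: was each path correct on the same problem?"""
    fast_correct: bool
    slow_correct: bool


def quality_gain(outcome: PairedOutcome) -> int:
    """Assumed definition: +1 if slow fixes a fast error, -1 if slow breaks
    a correct fast answer, 0 if both paths agree in correctness."""
    return int(outcome.slow_correct) - int(outcome.fast_correct)


outcomes = [
    PairedOutcome(fast_correct=False, slow_correct=True),   # slow helps
    PairedOutcome(fast_correct=True, slow_correct=True),    # slow is wasted
    PairedOutcome(fast_correct=True, slow_correct=False),   # slow hurts
    PairedOutcome(fast_correct=False, slow_correct=False),  # slow is wasted
]
gains = [quality_gain(o) for o in outcomes]
print(gains)  # prints: [1, 0, -1, 0]
```

The last case shows why gain differs from fast-answer error: a wrong fast answer is not worth a slow call if slow reasoning would also fail.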
Method
IDPR first generates a fast answer. An inhibition controller then uses "fast-side evidence" (such as confidence, logit margin, parseability, and generation cost) to compute a switch score. If the score exceeds a threshold, the fast answer is suppressed and the problem is passed to slow reasoning; otherwise the fast answer is released.
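The release-or-suppress decision described above can be sketched as a thresholded score over fast-side evidence. This is a minimal illustration, not the paper's controller: the logistic form, the weights, the feature scaling, and the names `FastEvidence`, `switch_score`, and `route` are all assumptions, and the sketch omits conditioning on the fast answer text itself.

```python
import math
from dataclasses import dataclass


@dataclass
class FastEvidence:
    """Hypothetical container for fast-side evidence."""
    confidence: float    # model confidence in the fast answer, in [0, 1]
    logit_margin: float  # gap between the top two logits
    parseable: bool      # whether the fast answer could be parsed
    cost_tokens: int     # tokens spent generating the fast answer


# Illustrative hand-picked weights; a real controller would learn these.
BIAS = -2.0
W_UNCONFIDENT = 3.0
W_MARGIN = -0.5
W_UNPARSEABLE = 2.0
W_COST = 0.002


def switch_score(ev: FastEvidence) -> float:
    """Logistic score in (0, 1); higher means 'inhibit the fast answer'."""
    z = (
        BIAS
        + W_UNCONFIDENT * (1.0 - ev.confidence)
        + W_MARGIN * ev.logit_margin
        + W_UNPARSEABLE * (0.0 if ev.parseable else 1.0)
        + W_COST * ev.cost_tokens
    )
    return 1.0 / (1.0 + math.exp(-z))


def route(ev: FastEvidence, threshold: float = 0.5) -> str:
    """Suppress the fast answer only when the score exceeds the threshold."""
    return "slow" if switch_score(ev) > threshold else "fast"


confident = FastEvidence(confidence=0.95, logit_margin=4.0, parseable=True, cost_tokens=40)
shaky = FastEvidence(confidence=0.40, logit_margin=0.2, parseable=False, cost_tokens=300)
print(route(confident), route(shaky))  # prints: fast slow
```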
In practice
- Use fast-side evidence for routing decisions.
- Calibrate inhibition threshold for accuracy-cost trade-off.
- Prioritize slow reasoning for harder problem subsets.
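One simple way to calibrate the threshold against a cost target is to pick it from held-out switch scores so that a fixed share of examples is routed to slow reasoning. The sketch below assumes that approach; the source does not describe its calibration procedure, and `threshold_for_budget` is a hypothetical helper.

```python
import math


def threshold_for_budget(scores: list[float], budget: float) -> float:
    """Return a threshold such that at most `budget` (a fraction in [0, 1])
    of the given scores are strictly greater than it."""
    if not scores:
        raise ValueError("scores must be non-empty")
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be in [0, 1]")
    # Maximum number of slow calls allowed; the epsilon guards against
    # floating-point products such as 409.99999999999994.
    max_slow = math.floor(len(scores) * budget + 1e-9)
    ranked = sorted(scores, reverse=True)
    if max_slow >= len(ranked):
        return float("-inf")  # budget covers everything: always go slow
    # Only scores strictly above the (max_slow + 1)-th largest go slow.
    return ranked[max_slow]


held_out_scores = [0.1, 0.9, 0.4, 0.8, 0.3, 0.2, 0.7, 0.05, 0.6, 0.5]
threshold = threshold_for_budget(held_out_scores, budget=0.2)
slow_count = sum(score > threshold for score in held_out_scores)
print(threshold, slow_count)  # prints: 0.7 2
```

Sweeping `budget` and measuring accuracy at each value on held-out data would trace out the accuracy-cost trade-off; running the same sweep per difficulty subset is one way to see whether harder problems deserve a larger share of slow calls.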
Topics
- LLM Reasoning
- Cost-Aware AI
- Inhibitory Deliberation
- Response-Conditioned Routing
- Mathematical Reasoning
- Cognitive Control
Best for: Research Scientist, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.