Critic-R: Improving Agentic Search using Instruction-tuned Retrievers with Natural Language Introspective Feedback

2026-05-30 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

Critic-R is a framework for agentic search that explicitly closes the feedback loop between the reasoning agent and the retrieval model, at both inference and training time. It targets a known difficulty: optimizing a retriever for an agent usually requires extensive co-training or gold-standard annotations. Critic-R adds a critic model that reads the agent's introspective reasoning trace after the agent has consumed the retrieved evidence and judges whether that context supports the next reasoning step. The framework has two mechanisms. Critic-R-Zero is an inference-time loop that iteratively refines queries and retrieval instructions. Critic-Embed is an optimization method for retrieval models that derives supervision automatically from successful and failed refinement trajectories, removing the need for manual relevance annotation. Evaluated on datasets including HotpotQA, 2WikiMultihopQA, MuSiQue, and Bamboogle, Critic-R showed significant improvements in both retrieval quality and downstream answer accuracy.

Key takeaway

Machine learning engineers building agentic search systems should consider integrating an introspective feedback mechanism like Critic-R. The approach lets the system self-correct by refining its own queries and retrieval instructions, which improves both retrieval quality and final answer accuracy without extensive manual annotation. The successful and failed refinement trajectories it produces can also serve as automatic supervision for continued optimization of the retrieval model, which can reduce development overhead.

Key insights

Critic-R improves agentic search by using a critic model to provide introspective feedback for query refinement and retrieval model optimization.

Principles

Explicitly close feedback loops in agentic systems.
Use introspective reasoning traces for context evaluation.
Generate automatic supervision from refinement trajectories.

Method

Critic-R employs a critic model to evaluate reasoning traces, then uses Critic-R-Zero for iterative query refinement and Critic-Embed to optimize retrieval models with self-generated supervision from refinement trajectories.
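
The inference-time loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (refine_loop, toy_retrieve, toy_reason, toy_critic, toy_rewrite), the three-sentence corpus, the word-overlap retriever, and the hard-coded critic feedback are all hypothetical stand-ins for the learned components the summary mentions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    """One round of the loop: what was asked, what came back, and the verdict."""
    query: str
    instruction: str
    docs: List[str]
    trace: str
    supported: bool


def refine_loop(
    query: str,
    instruction: str,
    retrieve: Callable[[str, str], List[str]],
    reason: Callable[[str, List[str]], str],
    critic: Callable[[str, List[str]], Tuple[bool, str]],
    rewrite: Callable[[str, str, str], Tuple[str, str]],
    max_rounds: int = 3,
) -> List[Attempt]:
    """Retrieve, let the agent reason, ask the critic, and refine until supported."""
    attempts: List[Attempt] = []
    for _ in range(max_rounds):
        docs = retrieve(query, instruction)
        trace = reason(query, docs)                 # agent's introspective trace
        supported, feedback = critic(trace, docs)   # natural-language feedback
        attempts.append(Attempt(query, instruction, docs, trace, supported))
        if supported:
            break
        query, instruction = rewrite(query, instruction, feedback)
    return attempts


# --- Toy stand-ins (assumptions for illustration only) ---
CORPUS = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
    "Paris is the capital of France.",
]


def _words(text: str) -> set:
    return set(text.lower().replace(".", "").replace("?", "").split())


def toy_retrieve(query: str, instruction: str, k: int = 1) -> List[str]:
    # Rank passages by word overlap with the query plus the instruction.
    terms = _words(query + " " + instruction)
    ranked = sorted(CORPUS, key=lambda d: len(terms & _words(d)), reverse=True)
    return ranked[:k]


def toy_reason(query: str, docs: List[str]) -> str:
    return f"Reasoning over {len(docs)} passage(s) for: {query}"


def toy_critic(trace: str, docs: List[str]) -> Tuple[bool, str]:
    # Hard-coded check and hint; a real critic would be a model reading the trace.
    if any("poland" in d.lower() for d in docs):
        return True, "evidence supports the next step"
    return False, "Warsaw is the capital of which country"


def toy_rewrite(query: str, instruction: str, feedback: str) -> Tuple[str, str]:
    # Use the critic's feedback as the new query and switch the instruction.
    return feedback, "retrieve geographic facts"


attempts = refine_loop(
    "Marie Curie birthplace country capital",
    "retrieve biographical facts",
    toy_retrieve, toy_reason, toy_critic, toy_rewrite,
)
for a in attempts:
    print(a.supported, "|", a.query, "|", a.docs)
```

In this toy run the first retrieval is judged unsupported, the query and instruction are rewritten from the critic's feedback, and the second retrieval is accepted. The list of attempts is kept in full because both the failed and the successful rounds are what a trajectory-based training step would consume.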

In practice

Apply introspective feedback to refine search queries.
Optimize retrievers using the agent's self-correction data.
Reduce reliance on manual relevance annotations.
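
One way to picture how refinement trajectories could replace manual relevance labels is the sketch below. It rests on an assumption the summary does not confirm: that passages from a critic-approved round are treated as positives and passages from rejected rounds as negatives, both anchored on the original query and instruction. The Attempt and Triplet types and the mine_triplets function are hypothetical names for illustration.

```python
from typing import List, NamedTuple


class Attempt(NamedTuple):
    """One refinement round, as logged by an inference-time loop."""
    query: str
    instruction: str
    docs: List[str]
    supported: bool  # critic verdict for this round


class Triplet(NamedTuple):
    """A contrastive training example for a retriever."""
    query: str
    instruction: str
    positive: str
    negative: str


def mine_triplets(trajectory: List[Attempt]) -> List[Triplet]:
    """Turn one trajectory into (query, instruction, positive, negative) examples.

    Assumption: supported rounds yield positives, unsupported rounds yield
    negatives, and the anchor is the first (unrefined) query and instruction.
    """
    if not trajectory:
        return []
    positives = [d for a in trajectory if a.supported for d in a.docs]
    negatives = [
        d for a in trajectory if not a.supported for d in a.docs
        if d not in positives
    ]
    anchor = trajectory[0]
    return [
        Triplet(anchor.query, anchor.instruction, p, n)
        for p in positives
        for n in negatives
    ]


trajectory = [
    Attempt("capital country Warsaw", "retrieve facts",
            ["Paris is the capital of France."], supported=False),
    Attempt("Warsaw is the capital of which country", "retrieve geographic facts",
            ["Warsaw is the capital of Poland."], supported=True),
]
triplets = mine_triplets(trajectory)
print(triplets)
```

A trajectory with no approved round, or with no rejected round, produces no triplets under this scheme. Triplets of this shape could then feed a standard contrastive objective for an embedding model; which objective the original work uses is not stated in this summary.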

Topics

Agentic Search
Retrieval Models
Instruction Tuning
Feedback Loops
Query Refinement
Self-Supervision

Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.