Does RL Expand the Capability Boundary of LLM Agents? A PASS@(k,T) Analysis
Summary
A study by Xin Wang, Zhiyuan Zhai, Wenjing Yan, and Xiaodan Shao, published on April 16, 2026, introduces PASS@(k,T), a two-dimensional metric for evaluating LLM agents. The metric jointly varies the sampling budget "k" and the interaction depth "T" to distinguish capability expansion from efficiency improvement when agents are trained with reinforcement learning (RL). The research finds that for agentic tool use involving compositional strategies and multiple interaction rounds, RL genuinely expands the capability boundary of LLM agents. This contrasts with static reasoning tasks, where RL primarily improves reliability. The RL agent's pass curve sits significantly above the base model's, and the gap widens at larger "k" values. The expansion is attributed to self-directed exploration, since supervised fine-tuning on similar tasks regresses the boundary.
Key takeaway
For NLP engineers developing LLM agents for complex, multi-step tool-use applications, this research indicates that reinforcement learning can genuinely expand agent capabilities rather than only improving reliability. Focus RL efforts on tasks that require compositional, sequential information gathering, since this is where the largest performance gains and capability expansion are observed. Consider adopting the PASS@(k,T) metric to assess whether RL is expanding your agent's capabilities or only improving its efficiency.
Key insights
RL expands LLM agent capabilities for compositional tool use, whereas in static reasoning it mainly improves reliability.
Principles
- Capability expansion differs from efficiency improvement.
- Self-directed exploration is key for RL agent improvement.
Method
PASS@(k,T) is a two-dimensional metric that jointly varies sampling budget "k" and interaction depth "T" to separate capability expansion from efficiency improvement in LLM agents.
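One way to make the two axes concrete is the sketch below. It is not the authors' estimator, which this summary does not specify. It assumes the standard unbiased pass@k estimate (at least one success among k samples drawn from n recorded rollouts) and assumes that a rollout counts as a success at depth T if it succeeded within T interaction rounds. The function names and the `rounds_to_success` data layout are hypothetical.

```python
from math import comb
from typing import Optional, Sequence


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: chance that at least one of k rollouts, drawn
    without replacement from n recorded rollouts (c successful), succeeds."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k-subset contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_at_k_t(
    rounds_to_success: Sequence[Sequence[Optional[int]]], k: int, t: int
) -> float:
    """Hypothetical PASS@(k,T) sketch, averaged over tasks.

    rounds_to_success[i][j] is the number of interaction rounds rollout j
    of task i needed to succeed, or None if that rollout failed.
    Assumption: a rollout that succeeded in r rounds would also succeed
    under any depth cap t >= r.
    """
    if not rounds_to_success:
        raise ValueError("need at least one task")
    per_task = []
    for rollouts in rounds_to_success:
        n = len(rollouts)
        c = sum(1 for r in rollouts if r is not None and r <= t)
        per_task.append(pass_at_k(n, c, k))
    return sum(per_task) / len(per_task)
```

Sweeping `k` and `t` over a grid yields a surface rather than a single number. If rollouts behave differently under an enforced depth cap, re-running the agent with that cap would be more faithful than truncating recorded rollouts as this sketch does.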
In practice
- Prioritize RL for compositional tool-use tasks.
- Evaluate agents using PASS@(k,T) for nuanced insights.
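The summary's distinction between expansion and efficiency can be turned into a rough check on two pass curves measured at a fixed interaction depth. The sketch below is a hypothetical heuristic, not a procedure from the paper: it compares the RL-minus-base gap at the smallest and largest shared "k". The function name, the labels, and the `tol` threshold are all assumptions chosen for illustration.

```python
from typing import Dict


def classify_gap(
    base: Dict[int, float], rl: Dict[int, float], tol: float = 0.02
) -> str:
    """Label the RL-vs-base gap across sampling budgets k at a fixed depth T.

    base and rl map k -> pass rate. tol is an arbitrary, assumed threshold
    below which a gap is treated as noise.
    """
    ks = sorted(set(base) & set(rl))
    if len(ks) < 2:
        raise ValueError("need at least two shared k values")
    gap_small_k = rl[ks[0]] - base[ks[0]]
    gap_large_k = rl[ks[-1]] - base[ks[-1]]
    if gap_large_k > tol and gap_large_k >= gap_small_k:
        # The gap persists or widens as k grows: the base model does not
        # catch up with more samples, which suggests capability expansion.
        return "expansion"
    if gap_small_k > tol and gap_large_k <= tol:
        # The base model catches up at large k: RL mostly improved
        # reliability or efficiency.
        return "efficiency"
    return "inconclusive"


# Made-up numbers for illustration only; these are not results from the paper.
widening = classify_gap({1: 0.10, 8: 0.20, 64: 0.25}, {1: 0.20, 8: 0.45, 64: 0.70})
closing = classify_gap({1: 0.20, 8: 0.50, 64: 0.90}, {1: 0.60, 8: 0.85, 64: 0.91})
```

Running the same check at several depths "T" would show whether any expansion appears only when the agent is allowed more interaction rounds.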
Topics
- Reinforcement Learning
- LLM Agents
- Agentic Tool Use
- PASS@(k,T) Metric
- Capability Expansion
Best for: NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.