Does RL Expand the Capability Boundary of LLM Agents? A PASS@(k,T) Analysis

2026-04-16 · Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A study by Xin Wang, Zhiyuan Zhai, Wenjing Yan, and Xiaodan Shao, published on April 16, 2026, introduces PASS@(k,T), a two-dimensional metric for evaluating LLM agents. The metric jointly varies the sampling budget "k" and the interaction depth "T" to distinguish capability expansion from efficiency improvement in reinforcement learning (RL) for LLM agents. The study finds that for agentic tool use involving compositional strategies and multiple interaction rounds, RL genuinely expands the capability boundary of LLM agents. This contrasts with static reasoning tasks, where RL primarily improves reliability. The RL agent's pass-curve outperforms the base model's, and the gap widens at larger "k" values. The study attributes the expansion to self-directed exploration, because supervised fine-tuning on similar tasks regresses the boundary.

Key takeaway

For NLP engineers developing LLM agents for complex, multi-step tool-use applications, this research indicates that reinforcement learning can expand what an agent is able to do, not only make it more reliable. Focus RL efforts on tasks that require compositional, sequential information gathering, since this is where the largest performance gains and the capability expansion are observed. Consider adopting the PASS@(k,T) metric to check whether RL is expanding your agent's capabilities or only improving its efficiency.

Key insights

RL expands LLM agent capabilities in compositional tool use, whereas in static reasoning it primarily improves reliability.

Principles

Capability expansion differs from efficiency improvement.
Self-directed exploration drives the expansion; supervised fine-tuning on similar tasks regresses the boundary.

Method

PASS@(k,T) is a two-dimensional metric that jointly varies sampling budget "k" and interaction depth "T" to separate capability expansion from efficiency improvement in LLM agents.
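
A minimal sketch of how such a metric could be computed. This summary does not give the paper's estimator, so the sketch assumes PASS@(k,T) is obtained by applying the standard unbiased pass@k estimator to rollouts capped at T interaction rounds. The function names and the rounds-to-success input format are hypothetical, and so is the assumption that a rollout solved in r rounds counts as solved under any budget T >= r.

```python
from math import comb
from typing import Dict, List, Optional, Sequence, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Standard unbiased pass@k estimate from n samples, c of them correct."""
    if k > n:
        raise ValueError("k must not exceed the number of samples n")
    if n - c < k:
        # Every size-k subset must contain at least one success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_at_k_t(
    rollouts: Sequence[List[Optional[int]]],
    ks: Sequence[int],
    ts: Sequence[int],
) -> Dict[Tuple[int, int], float]:
    """Estimate a PASS@(k,T) grid (assumed definition, see lead-in).

    rollouts[i] holds one entry per sampled rollout of task i: the number
    of interaction rounds used to solve the task, or None if unsolved.
    """
    grid: Dict[Tuple[int, int], float] = {}
    for t in ts:
        for k in ks:
            per_task = []
            for task in rollouts:
                n = len(task)
                # Assumption: solved in r rounds implies solved within T >= r.
                c = sum(1 for r in task if r is not None and r <= t)
                per_task.append(pass_at_k(n, c, k))
            grid[(k, t)] = sum(per_task) / len(per_task)
    return grid


# Hypothetical data: two tasks, four rollouts each.
example = [[1, 3, None, None], [None, None, None, 2]]
grid = pass_at_k_t(example, ks=[1, 2], ts=[1, 3])
```

Reading the grid along k at fixed T shows what re-sampling alone recovers; reading it along T at fixed k shows what deeper interaction adds.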

In practice

Prioritize RL for compositional tool-use tasks.
Evaluate agents with PASS@(k,T) to tell capability expansion apart from efficiency improvement.
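
A hedged sketch of how an evaluation might read the base-versus-RL gap at a fixed interaction depth. The labels, the tolerance, and the decision rule are illustrative assumptions, not the paper's procedure: a gap that persists or grows at the largest k is treated as expansion-like, and a gap that closes is treated as efficiency-like.

```python
from typing import Dict


def gap_by_k(base: Dict[int, float], rl: Dict[int, float]) -> Dict[int, float]:
    """RL pass rate minus base pass rate at every k present in both curves."""
    shared = sorted(set(base) & set(rl))
    return {k: rl[k] - base[k] for k in shared}


def classify_gap(
    base: Dict[int, float], rl: Dict[int, float], tol: float = 0.01
) -> str:
    """Label the trend of the gap between smallest and largest k.

    The tolerance and the three labels are hypothetical choices made
    for illustration only.
    """
    gaps = gap_by_k(base, rl)
    ks = sorted(gaps)
    if len(ks) < 2:
        raise ValueError("need at least two shared k values")
    first, last = gaps[ks[0]], gaps[ks[-1]]
    if last > tol and last >= first - tol:
        return "expansion-like"   # gap persists or widens at large k
    if abs(last) <= tol:
        return "efficiency-like"  # curves meet at large k
    return "inconclusive"


# Hypothetical pass curves at one fixed interaction depth T.
base_curve = {1: 0.20, 8: 0.50, 64: 0.60}
rl_curve = {1: 0.40, 8: 0.80, 64: 0.95}
label = classify_gap(base_curve, rl_curve)
```

Running the same check at several values of T indicates whether the verdict depends on interaction depth.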

Topics

Reinforcement Learning
LLM Agents
Agentic Tool Use
PASS@(k,T) Metric
Capability Expansion

Best for: NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.