SkillPager: Query-Adaptive Intra-Skill Navigation via Semantic Node Retrieval
Summary
SkillPager is a two-stage framework that optimizes context retrieval for skill-based LLM agents working with long procedural documents. It addresses the inefficiency of full-document prompting by parsing Markdown skill documents into typed semantic nodes offline. Online, SkillPager uses Maximal Marginal Relevance (MMR) for query-conditioned node selection, aiming to provide a minimal, execution-sufficient context. On a benchmark of 395 skills and 1,975 queries, SkillPager achieved 78.89% LLM-judged context sufficiency, close to the 82.23% of a full-document baseline, while reducing prompt tokens by 47.04%. Further analysis attributes the efficiency gains to typed semantic granularity rather than to the retrieval algorithm alone; SkillPager outperforms graph-based baselines by 12.16%.
Key takeaway
If you are a Machine Learning Engineer building LLM agents that rely on extensive procedural documents, consider implementing query-adaptive intra-skill retrieval. SkillPager's approach of parsing documents into typed semantic nodes and selecting context with MMR reduced prompt tokens by 47.04% on its benchmark while keeping context sufficiency within about 3.3 percentage points of full-document prompting (78.89% versus 82.23%). That efficiency gain can translate into cheaper agents with shorter prompts.
Key insights
Typed semantic node retrieval significantly reduces LLM prompt tokens while maintaining high context sufficiency for skill-based agents.
Principles
- Semantic granularity drives retrieval efficiency.
- Adaptive content selection outperforms static heuristics.
- Intra-document retrieval is a distinct access problem.
Method
SkillPager parses Markdown skills into typed semantic nodes offline, then uses Maximal Marginal Relevance (MMR) for global, query-conditioned node selection online to create minimal, execution-sufficient context.
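The online selection step can be illustrated with a minimal MMR sketch. This is not SkillPager's implementation: the bag-of-words `embed` function stands in for whatever encoder the paper uses, and the names `mmr_select`, `k`, and `lam` are assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    # Toy bag-of-words vector (assumption); a real system would likely use a learned encoder.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def mmr_select(query, nodes, k=3, lam=0.7):
    # Greedy MMR: each round picks the node maximizing
    # lam * relevance_to_query - (1 - lam) * max_similarity_to_already_selected.
    q = embed(query)
    vecs = [embed(n) for n in nodes]
    selected = []
    candidates = list(range(len(nodes)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(q, vecs[i])
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [nodes[i] for i in selected]
```

With a lower `lam`, the selection penalizes near-duplicate nodes more heavily, which is the property that lets MMR keep the returned context small without repeating itself.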
In practice
- Parse documents into typed semantic nodes.
- Employ MMR for context selection.
- Retain supporting content for adaptive selection.
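The first step in the list above, offline parsing into typed nodes, might look like the following sketch. The node types used here (`heading`, `text`, `step`, `code`) are hypothetical; the summary does not say which types SkillPager defines, and `parse_skill` is an illustrative name, not the paper's API.

```python
from dataclasses import dataclass

FENCE = "`" * 3  # Markdown code-fence marker


@dataclass
class Node:
    kind: str  # hypothetical type label: "heading", "text", "step", or "code"
    text: str


def parse_skill(markdown):
    # Split a Markdown skill document into a flat list of typed nodes.
    nodes, buf, in_code = [], [], False

    def flush(kind):
        # Emit the buffered lines as a single node of the given kind.
        if buf:
            nodes.append(Node(kind, "\n".join(buf)))
            buf.clear()

    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            flush("code" if in_code else "text")
            in_code = not in_code
        elif in_code:
            buf.append(line)
        elif not stripped:
            flush("text")
        elif stripped.startswith("#"):
            flush("text")
            nodes.append(Node("heading", stripped.lstrip("#").strip()))
        elif stripped[:2] in ("- ", "* ") or stripped.split(". ", 1)[0].isdigit():
            flush("text")
            nodes.append(Node("step", stripped))
        else:
            buf.append(stripped)
    flush("code" if in_code else "text")
    return nodes
```

Keeping every node, including plain supporting text, leaves the decision about what to include to the query-time selector rather than to a static filter at parse time.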
Topics
- SkillPager
- LLM Agents
- Intra-Skill Retrieval
- Semantic Node Parsing
- Maximal Marginal Relevance
- Information Retrieval
Best for: Research Scientist, AI Architect, AI Engineer, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.