Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning
Summary
LaMR (Latent Multi-Rubric) is a new structured pruning framework designed to optimize context for LLM-powered coding agents. It addresses the inefficiency of existing pruners that use a single-objective sequence labeler, which struggles with the varied relevance patterns in code. LaMR decomposes code relevance into two distinct quality dimensions: semantic evidence and dependency support. Each dimension is modeled by a dedicated Conditional Random Field (CRF) with specific transition dynamics. A mixture-of-experts gating network dynamically weights these per-rubric emissions based on the query, and a final CRF layer makes the aggregate keep-or-prune decision. The framework derives multi-rubric labels from existing training corpora using AST-based program analysis, simultaneously denoising binary labels. Experiments across four benchmarks (SWE-Bench Verified, SWE-QA, LCC, LongCodeQA) demonstrate that LaMR wins 12 of 16 multi-turn comparisons, saving up to 31% more tokens and improving Exact Match by up to +3.5 on single-turn tasks.
Key takeaway
Research scientists developing LLM-powered coding agents should consider adopting multi-rubric context pruning frameworks like LaMR. In the reported experiments, the approach saved up to 31% more tokens and improved Exact Match by up to +3.5 on single-turn tasks by filtering irrelevant code more effectively and denoising the input context. Dimension-specific relevance models can overcome the limitations of single-objective pruners.
Key insights
Decomposing code relevance into semantic and dependency dimensions improves context pruning for coding agents.
Principles
- Heterogeneous retention patterns require multi-faceted modeling.
- AST-based analysis can derive multi-rubric labels.
- Denoising context can enhance agent performance.
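To make the AST-based principle concrete, here is a minimal sketch using Python's standard `ast` module. It is not the paper's labeling pipeline. The function name `rubric_labels`, the label strings, and the rule itself (lines of a target function count as semantic evidence, and functions it calls directly count as dependency support) are illustrative assumptions.

```python
import ast

def rubric_labels(source, target):
    """Assign one hypothetical rubric label per source line.

    Assumed rule (not from the paper): lines of `target` are "semantic",
    lines of same-module functions that `target` calls directly are
    "dependency", and everything else is "none".
    """
    tree = ast.parse(source)
    # Top-level function definitions, keyed by name.
    funcs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    labels = ["none"] * len(source.splitlines())
    if target not in funcs:
        return labels

    # Names of plain function calls made anywhere inside the target.
    called = {
        c.func.id
        for c in ast.walk(funcs[target])
        if isinstance(c, ast.Call) and isinstance(c.func, ast.Name)
    }
    for name in called:
        if name in funcs and name != target:
            node = funcs[name]
            for i in range(node.lineno - 1, node.end_lineno):
                labels[i] = "dependency"

    node = funcs[target]
    for i in range(node.lineno - 1, node.end_lineno):
        labels[i] = "semantic"
    return labels

SOURCE = """\
def parse(text):
    return text.split(",")

def unused(x):
    return x * 2

def total(text):
    parts = parse(text)
    return sum(int(p) for p in parts)
"""

for line, label in zip(SOURCE.splitlines(), rubric_labels(SOURCE, "total")):
    print(f"{label:10} | {line}")
```

In this toy file, `total` is treated as the query-relevant function, `parse` is picked up as its dependency, and `unused` stays unlabeled, so a pruner trained on such labels would have separate signals for the two dimensions.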
Method
LaMR uses a dedicated CRF for each rubric (semantic evidence and dependency support), a mixture-of-experts gating network that weights the per-rubric emissions based on the query, and a final CRF that makes the keep-or-prune decision.
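The gating and final decoding steps can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the per-rubric CRFs are replaced by precomputed per-line keep scores, the gate is a softmax over two hand-set logits, and the names `mix_emissions`, `viterbi_keep_prune`, and the transition values are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mix_emissions(rubric_scores, gate_logits):
    """Blend per-rubric keep scores line by line using softmax gate weights."""
    weights = softmax(gate_logits)
    n_lines = len(rubric_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, rubric_scores))
        for i in range(n_lines)
    ]

def viterbi_keep_prune(keep_scores, transition):
    """Decode the best prune(0)/keep(1) sequence of a linear-chain model.

    Assumption: a line with keep score s emits s for "keep" and -s for "prune".
    transition[prev][cur] is the score of moving between labels.
    """
    if not keep_scores:
        return []
    emit = [(-s, s) for s in keep_scores]
    best = [list(emit[0])]
    back = []
    for t in range(1, len(emit)):
        row, ptr = [], []
        for cur in (0, 1):
            cands = [best[-1][prev] + transition[prev][cur] for prev in (0, 1)]
            p = 0 if cands[0] >= cands[1] else 1
            row.append(cands[p] + emit[t][cur])
            ptr.append(p)
        best.append(row)
        back.append(ptr)
    # Backtrack from the best final label.
    state = 0 if best[-1][0] >= best[-1][1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

semantic = [2.0, -0.4, 2.0, -2.0]    # hypothetical per-line scores
dependency = [0.0, 0.0, 0.0, -2.0]   # hypothetical per-line scores
mixed = mix_emissions([semantic, dependency], gate_logits=[0.0, 0.0])
smooth = [[0.5, 0.0], [0.0, 0.5]]    # reward staying in the same label
decoded = viterbi_keep_prune(mixed, smooth)
print(decoded)  # [1, 1, 1, 0]
```

The second line has a slightly negative mixed score, yet it is kept because the transition scores favor contiguous kept spans. With an all-zero transition matrix the same input decodes to `[1, 0, 1, 0]`, which shows what the sequence model adds over per-line thresholding.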
In practice
- Apply AST-based analysis for label generation.
- Consider multi-rubric approaches for context compression.
- Evaluate pruning against full-context baselines.
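For the last point, a comparison against a full-context baseline needs at least two numbers: how many tokens the pruner removed and whether task accuracy held up. The sketch below assumes whitespace splitting as a stand-in for the agent's real tokenizer and a whitespace-trimmed string comparison for Exact Match. The helper names are hypothetical and the inputs are toy data, not results from the paper.

```python
def count_tokens(text):
    # Stand-in for a real tokenizer: whitespace-delimited tokens.
    return len(text.split())

def token_savings(full_context, pruned_context):
    """Fraction of the full context's tokens removed by pruning."""
    full = count_tokens(full_context)
    if full == 0:
        return 0.0
    return 1.0 - count_tokens(pruned_context) / full

def exact_match(predictions, references):
    """Percentage of predictions equal to their reference after trimming."""
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)

def compare_to_baseline(full_preds, pruned_preds, references):
    """Exact Match delta of the pruned run relative to the full-context run."""
    return exact_match(pruned_preds, references) - exact_match(full_preds, references)

# Toy data only.
refs = ["return x + 1", "return y"]
full_preds = ["return x + 1", "return z"]
pruned_preds = ["return x + 1", "return y "]
print(token_savings("a b c d e f g h i j", "a b c d e f g"))
print(compare_to_baseline(full_preds, pruned_preds, refs))
```

Reporting both values side by side makes it clear whether a pruner is buying token savings at the cost of accuracy or, as the summary above reports for LaMR, improving both.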
Topics
- Context Pruning
- Coding Agents
- LaMR Framework
- Conditional Random Fields
- AST-based Program Analysis
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.