Training Composer for longer horizons

2026-03-17 · Source: Cursor Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Advanced, medium

Summary

Composer, a specialized model for agentic coding, has been trained for long-horizon tasks with a reinforcement learning process called "self-summarization," which builds compaction directly into the training loop. Through this training, Composer learns to identify and preserve critical information, so it can work on challenging coding tasks that require hundreds of actions and exceed typical model context windows. Traditional compaction techniques risk losing information; self-summarization reduces compaction error by 50% on CursorBench, even compared with highly tuned prompt-based baselines, while using one-fifth of the tokens and reusing the KV cache. Composer generates its own condensed context (around 1,000 tokens) from a minimal prompt, and it has solved complex problems such as "make-doom-for-mips" by summarizing over 100,000 tokens. The work is a step toward training more capable agentic systems for even longer and more complex processes, including multi-agent coordination.
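The compaction step described above can be pictured with a minimal sketch of an agent loop that replaces its history with a model-written summary once the context grows past a limit. This is an illustration under stated assumptions, not Cursor's implementation: `run_with_self_summary`, `step`, `summarize`, and the whitespace-based `count_tokens` are hypothetical names, keeping the original task text alongside the summary is an assumption, and the sketch covers only the rollout loop, not the reinforcement learning that trains the summaries.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Toy stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())


def run_with_self_summary(
    task: str,
    step: Callable[[List[str]], str],       # hypothetical: model picks the next action
    summarize: Callable[[List[str]], str],  # hypothetical: model writes its own summary
    context_limit: int,
    max_steps: int,
) -> List[str]:
    """Run an agent loop, compacting the context whenever it exceeds the limit."""
    context = [task]
    for _ in range(max_steps):
        action = step(context)
        context.append(action)
        if action == "DONE":
            break
        if sum(count_tokens(c) for c in context) > context_limit:
            # Compaction: the history is replaced by the model's own summary.
            # Keeping the task text next to the summary is an assumption here.
            context = [task, summarize(context)]
    return context
```

In this sketch, `summarize` would be the same model that takes actions, prompted minimally to condense its own history. The source reports summaries of around 1,000 tokens.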

Key takeaway

Composer, an agentic coding model, improves on long-horizon tasks by learning "self-summarization" through reinforcement learning. This "compaction-in-the-loop" training reduces compaction error by 50% and uses one-fifth the tokens of prompt-based baselines, enabling solutions to complex problems like "make-doom-for-mips." By preserving critical context efficiently, it makes agents that need hundreds of actions and extensive reasoning practical to deploy.

Topics

Reinforcement Learning
Self-Summarization
AI Agents
Context Management
Code Generation

Best for: AI Scientist, Research Scientist, AI Researcher, Machine Learning Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Cursor Blog.