TRACE: Trajectory Risk-Aware Compression for Long-Horizon Agent Safety
Summary
TRACE, or Trajectory Risk-Aware Compression for Long-Horizon Agent Safety, addresses the challenge of detecting safety risks in long-horizon LLM agent trajectories. Existing turn-level or short-context detectors often fail to retain and aggregate sparse, delayed, and compositional risk signals over extended contexts. TRACE introduces a Compressor-Reader design where a Compressor encodes the full trajectory into a compact latent evidence state, supervised at the trajectory level. A Reader then uses this latent state as a safety reference to judge the raw trajectory. This approach effectively aggregates dispersed risk cues and minimizes premature evidence loss. TRACE achieved the best accuracy across ASSEBench, Pre-Ex-Bench, and R-Judge benchmarks, outperforming strong baselines by up to 12.6 percentage points. It also demonstrated reduced performance degradation on LongSafety with increasing context length, with code available at https://github.com/Peregrine123/TRACE_official.
Key takeaway
For AI Security Engineers developing long-horizon LLM agents, you should consider integrating a trajectory-level evidence compression system like TRACE. This approach significantly improves the detection of sparse and compositional safety risks that evade traditional short-context methods. Implementing a Compressor-Reader architecture can enhance your agent's ability to aggregate dispersed risk cues. This also reduces performance degradation as context length grows, ensuring more robust safety monitoring.
Key insights
TRACE uses a Compressor-Reader design to compress long LLM agent trajectories into a latent evidence state for improved safety detection.
Principles
- Long-horizon safety requires trajectory-level evidence aggregation.
- Dispersed risk cues benefit from compact latent state representation.
- Context length degradation can be mitigated by compression.
Method
TRACE employs a Compressor to encode full trajectories into a compact latent evidence state under trajectory-level supervision. A Reader then judges the raw trajectory using this latent state as a safety reference.
In practice
- Implement a Compressor-Reader for long-context safety.
- Use trajectory-level supervision for evidence compression.
- Focus on risk-critical segments via compressed references.
Topics
- LLM Agent Safety
- Trajectory Compression
- Risk-Aware AI
- Long-Horizon AI
- Compressor-Reader Architecture
- ASSEBench
Code references
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, Machine Learning Engineer, AI Security Engineer
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.