PHASER: Phase-Aware and Semantic Experience Replay for Vision-Language-Action Models
Summary
PHASER is a novel, architecture-agnostic continual learning framework designed to mitigate catastrophic forgetting in Vision-Language-Action (VLA) models used for robotic manipulation. Traditional experience replay (ER) methods struggle in open-ended environments because they sample stored experience uniformly, which under-samples critical sub-skills and overlooks the fact that forgetting risk varies across them. PHASER addresses this with phase-centric capacity allocation, which gives every sub-skill equal memory support, and a multi-modal interference routing strategy that prioritizes historical phases prone to forgetting. It also integrates Auto-PC, an unsupervised pipeline that combines action-signal change-point detection with semantic verification by a vision-language model (VLM) to extract temporal boundaries autonomously. Evaluated across three VLA backbones on the LIBERO continual learning suites, PHASER increased Average Success Rate (ASR) by up to 31% over matched-budget ER and reached an 87.8% final ASR in the LIBERO-Goal CL setting. This work was published on 2026-06-02.
Key takeaway
Machine Learning Engineers deploying Vision-Language-Action (VLA) models in dynamic robotic environments should re-evaluate their continual learning strategies. Uniformly sampled experience replay risks catastrophic forgetting of critical sub-skills. Consider instead phase-aware memory allocation and dynamic prioritization of high-risk historical phases, which PHASER reports improves ASR by up to 31% over matched-budget experience replay. The approach targets robust skill retention and autonomous adaptation, both crucial for lifelong learning systems.
Key insights
PHASER mitigates catastrophic forgetting in VLA models by giving every sub-skill phase equal replay memory and by prioritizing replay of the historical phases most at risk of being forgotten.
Principles
- Continual learning needs phase-aware memory allocation.
- Forgetting risk varies across historical tasks and phases.
- Autonomous temporal boundary detection is crucial.
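To make the first principle concrete, here is a minimal sketch that contrasts a length-proportional split of a replay buffer (what uniform sampling over stored frames amounts to) with an equal-per-phase split. This is not the PHASER implementation: the function names, the phase names, and the frame counts are hypothetical, and the rule for redistributing unused slots is an assumption made for illustration.

```python
from typing import Dict


def proportional_budget(phase_sizes: Dict[str, int], budget: int) -> Dict[str, int]:
    """Slots proportional to phase length, as uniform frame sampling would give."""
    total = sum(phase_sizes.values())
    return {name: budget * size // total for name, size in phase_sizes.items()}


def allocate_phase_budget(phase_sizes: Dict[str, int], budget: int) -> Dict[str, int]:
    """Split `budget` replay slots as evenly as possible across phases.

    A phase never receives more slots than it has samples; slots that a
    short phase cannot use are redistributed among the remaining phases.
    """
    alloc = {name: 0 for name in phase_sizes}
    remaining = budget
    open_phases = [name for name, size in phase_sizes.items() if size > 0]
    while remaining > 0 and open_phases:
        share = max(remaining // len(open_phases), 1)
        next_open = []
        for name in open_phases:
            if remaining == 0:
                break
            give = min(share, phase_sizes[name] - alloc[name], remaining)
            alloc[name] += give
            remaining -= give
            if alloc[name] < phase_sizes[name]:
                next_open.append(name)
        open_phases = next_open
    return alloc


# Hypothetical frame counts per phase of one manipulation task.
sizes = {"reach": 120, "grasp": 15, "transport": 200, "release": 10}

uniform = proportional_budget(sizes, 100)
# {'reach': 34, 'grasp': 4, 'transport': 57, 'release': 2}

phase_equal = allocate_phase_budget(sizes, 100)
# {'reach': 38, 'grasp': 15, 'transport': 37, 'release': 10}
```

In this toy example the short "grasp" and "release" phases keep all of their frames under the equal split, whereas the proportional split leaves them with only a handful of slots.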
Method
PHASER employs phase-centric capacity allocation and multi-modal interference routing. It integrates Auto-PC, combining unsupervised action-signal change-point detection with VLM-based semantic verification to extract temporal boundaries.
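The idea of prioritizing phases prone to forgetting can be illustrated with a small sketch of risk-weighted replay sampling. The summary does not say how PHASER scores interference, so the score used here (cosine similarity between an embedding of the new task and an embedding of each stored phase, passed through a softmax) is purely an assumption, as are the function names, the `temperature` parameter, and the toy embeddings.

```python
import math
import random
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def replay_weights(
    new_task_emb: Sequence[float],
    phase_embs: Dict[str, Sequence[float]],
    temperature: float = 0.5,
) -> Dict[str, float]:
    """Softmax over similarity scores (assumed proxy for forgetting risk).

    Phases most similar to the new task get the largest replay share.
    """
    scores = {n: cosine(new_task_emb, e) / temperature for n, e in phase_embs.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {n: math.exp(s - top) for n, s in scores.items()}
    total = sum(exps.values())
    return {n: v / total for n, v in exps.items()}


def sample_phases(weights: Dict[str, float], k: int, rng: random.Random) -> List[str]:
    """Draw k phase names with replacement, proportional to their weights."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


# Toy embeddings: "open_drawer" points the same way as the new task.
stored = {"open_drawer": [1.0, 0.0], "pick_bowl": [0.0, 1.0]}
weights = replay_weights([1.0, 0.0], stored)
batch_phases = sample_phases(weights, k=8, rng=random.Random(0))
```

A design note: sampling with a softmax rather than always replaying the top-scoring phase keeps some replay mass on every phase, so low-risk phases are still revisited occasionally.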
In practice
- Implement phase-centric memory for sub-skills.
- Prioritize replay of high-forgetting-risk phases.
- Use VLM-based verification for task segmentation.
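For the segmentation item in the list above, the following sketch shows one simple way to propose phase boundaries from an action signal before any semantic check. It is not Auto-PC itself: the windowed mean-shift detector, the `window` and `threshold` parameters, and the synthetic gripper signal are all assumptions chosen for illustration, and the VLM-based verification step is not shown.

```python
from typing import List, Sequence


def detect_change_points(
    signal: Sequence[float], window: int = 5, threshold: float = 0.5
) -> List[int]:
    """Propose boundaries where the local mean of a 1-D action signal shifts.

    The score at index t is |mean(signal[t:t+window]) - mean(signal[t-window:t])|.
    Within each run of scores above `threshold`, only the strongest index is kept.
    """
    n = len(signal)
    scores = [0.0] * n
    for t in range(window, n - window + 1):
        before = sum(signal[t - window:t]) / window
        after = sum(signal[t:t + window]) / window
        scores[t] = abs(after - before)

    boundaries: List[int] = []
    t = window
    while t < n:
        if scores[t] > threshold:
            start = t
            while t < n and scores[t] > threshold:
                t += 1
            # Keep the peak of this above-threshold run as the boundary.
            boundaries.append(max(range(start, t), key=lambda i: scores[i]))
        else:
            t += 1
    return boundaries


# Synthetic gripper command: open (0.0), closed (1.0), open again (0.0).
gripper = [0.0] * 20 + [1.0] * 20 + [0.0] * 20
candidates = detect_change_points(gripper)
# [20, 40]
```

In a pipeline like the one described, each candidate index would then be passed to a verifier, for example a VLM asked whether the frames on either side show different sub-skills, and only confirmed boundaries would be kept.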
Topics
- Vision-Language-Action Models
- Continual Learning
- Catastrophic Forgetting
- Robotic Manipulation
- Experience Replay
- Phase-Aware Learning
Best for: Research Scientist, AI Scientist, Robotics Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.