Chameleon: Control-Indexed Prospective Memory for Visuomotor Manipulation

Source: cs.CV updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

Chameleon is a bio-inspired memory architecture for long-horizon robotic manipulation under perceptual aliasing, where the observation available at decision time does not contain enough information to choose the correct action. It was developed by researchers at MARS Lab, Nanyang Technological University; the Institute for Infocomm Research, A*STAR, Singapore; and the National University of Singapore. Chameleon encodes observations as geometry-grounded multimodal tokens and uses a differentiable memory stack for goal-directed recall. It was evaluated on Camo-Dataset, a real-robot UR5e dataset covering episodic recall, spatial tracking, and sequential manipulation under perceptual aliasing. In these evaluations Chameleon consistently outperformed strong baselines such as Diffusion Policy, Flow Matching, and ACT, achieving 100.0% DSR (100.0% κ) in episodic recall, 73.5% DSR (60.3% κ) in spatial tracking, and 72.2% DSR (71.2% κ) in sequential tasks. The system converts observations into view-consistent patch tokens, maintains a hierarchical memory with episodic and working states, and adds a HoloHead for latent imagination. Inference latency is 82 ms, which supports real-time control.

Key takeaway

For robotics engineers building long-horizon manipulation systems, Chameleon's results suggest that bio-inspired episodic memory can overcome perceptual aliasing. Consider geometry-grounded encoding and goal-directed recall when the environment is non-Markovian, that is, when the current observation alone does not determine the right action. In the reported experiments this approach improved control stability and task completion, particularly when task-relevant states were occluded or transient, and it reduced failures caused by ambiguous observations.

Key insights

Bio-inspired episodic memory with geometry-grounded encoding and goal-directed recall improves robotic manipulation under perceptual aliasing.
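To make the aliasing problem concrete, here is a minimal toy sketch in Python with NumPy. It is not Chameleon's memory stack: the one-hot observations, the `recall` function, the hand-written goal query, and the two-cup scenario are all assumptions made for illustration. It shows two episodes whose decision-time observation is identical, and a soft attention read over stored observations that recovers the earlier cue.

```python
import numpy as np

# Hypothetical one-hot observations: a cue on the left, a cue on the right,
# and the covered scene that looks the same in both episodes.
CUE_LEFT, CUE_RIGHT, COVERED = np.eye(3)


def recall(query, keys, values, temperature=1.0):
    """Soft attention read: weight stored values by query-key similarity."""
    scores = keys @ query / temperature
    scores = scores - scores.max()          # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ values, weights


def run_episode(cue):
    """Observe a cue, then two aliased frames, then decide from memory."""
    observations = [cue, COVERED, COVERED]
    keys = np.stack(observations)           # one memory slot per timestep
    values = np.stack(observations)
    # Assumed goal query: attend to any slot that looks like a cue.
    query = 5.0 * (CUE_LEFT + CUE_RIGHT)
    read, _ = recall(query, keys, values)
    return "left" if read[0] > read[1] else "right"


# The decision-time observation is COVERED in both episodes, so a policy that
# sees only the last frame cannot tell them apart; the memory read can.
decision_left = run_episode(CUE_LEFT)
decision_right = run_episode(CUE_RIGHT)
```

In a learned system the query and the key and value projections would be trained rather than hand-written; the sketch only illustrates why recall conditioned on the goal resolves an ambiguity that the current frame cannot.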

Method

Chameleon uses a Perception→Memory→Policy pipeline. The perception stage encodes observations into geometry-grounded tokens. The memory stage maintains hierarchical episodic and working states and uses a HoloHead for predictive imagination. The policy stage generates end-effector trajectories via conditional flow matching.
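The policy stage can be illustrated with a generic conditional flow matching sketch in Python with NumPy. This is not the paper's implementation: the linear interpolation path, the Euler sampler, the function names, the trajectory shape, and the oracle velocity field used in place of a trained network are all assumptions. The `cond` argument stands in for whatever context the memory stage would supply.

```python
import numpy as np


def cfm_pair(x0, x1, t):
    """Linear path between noise x0 and data x1, with its target velocity."""
    x_t = (1.0 - t) * x0 + t * x1
    u_t = x1 - x0
    return x_t, u_t


def cfm_loss(velocity_fn, x0, x1, t, cond):
    """Mean squared error between predicted and target velocity at time t."""
    x_t, u_t = cfm_pair(x0, x1, t)
    pred = velocity_fn(x_t, t, cond)
    return float(np.mean((pred - u_t) ** 2))


def euler_sample(velocity_fn, x0, cond, steps=10):
    """Integrate dx/dt = v(x, t, cond) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t, cond)
    return x


def oracle_velocity(x, t, cond):
    """Stand-in for a trained network: exact field for a single target."""
    return (cond - x) / (1.0 - t)


rng = np.random.default_rng(0)
target = rng.normal(size=(8, 3))   # hypothetical 8-step, 3-D trajectory
noise = rng.normal(size=(8, 3))

loss = cfm_loss(oracle_velocity, noise, target, t=0.3, cond=target)
sample = euler_sample(oracle_velocity, noise, cond=target, steps=10)
```

With the oracle field the loss is zero up to rounding and the sampler lands on the target trajectory. In practice `velocity_fn` would be a neural network trained by minimizing `cfm_loss` over random `t`, noise samples, and demonstration trajectories.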

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.