TerraTransfer: Learning End-to-End Driving Policies Without Expert Demonstrations
Summary
TerraTransfer is an approach to end-to-end autonomous driving that targets the cost of conventional training, which relies either on collecting and labeling millions of driving frames or on expensive closed-loop reinforcement learning on images. The method decouples learning to drive from learning to see: a driving policy is first pretrained through self-play in vectorized simulators, which allow millions of rollout steps per second and generate a rich distribution of challenging scenarios. The pretrained policy's latent space is then aligned with a vision backbone using an action KL divergence and a batch-relational low-rank structural loss. This process needs no curated expert demonstrations, only paired (image, scene-state) datasets. The resulting end-to-end policy matches or exceeds prior methods on photorealistic 3D Gaussian splatting closed-loop scenarios.
Key takeaway
For autonomous driving engineers building end-to-end systems, TerraTransfer reduces reliance on costly expert demonstrations and large-scale data labeling. Self-play in vectorized simulators generates diverse training scenarios efficiently, which can speed up policy development. The decoupled approach is worth considering if you want to simplify your training pipeline and stay competitive with less real-world data.
Key insights
TerraTransfer decouples driving policy learning from vision, using self-play in simulators to eliminate expert demonstrations.
Principles
- Decouple driving policy from vision.
- Exploit simulator self-play for data.
- Align latent spaces for integration.
Method
Pretrain a policy via self-play in vectorized simulators. Align its latent space with a pretrained vision backbone using action KL divergence and a batch-relational low-rank structural loss, requiring only (image, scene-state) pairs.
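To make the two alignment terms concrete, here is a minimal NumPy sketch of how such a loss could be put together. It is an illustration under stated assumptions, not the paper's implementation: the summary does not give the exact formulas, so the discrete action space, the KL direction (simulator policy as teacher, vision policy as student), the rank-truncated cosine Gram matrix used for the structural term, and all names (`action_kl`, `low_rank_gram`, `structural_loss`, `alignment_loss`, `rank`, `weight`) are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def action_kl(teacher_logits, student_logits):
    """Mean KL(teacher || student) over a batch of discrete action distributions.

    Assumption: actions are discrete and the simulator-trained policy is the teacher.
    """
    p = softmax(teacher_logits)
    log_p = np.log(p + 1e-12)
    log_q = np.log(softmax(student_logits) + 1e-12)
    return float(np.mean(np.sum(p * (log_p - log_q), axis=-1)))


def low_rank_gram(z, rank):
    """Rank-truncated cosine-similarity matrix (B, B) for a batch of latents (B, D).

    Assumption: "batch-relational" is read as pairwise similarity within the batch,
    and "low-rank" as keeping only the top singular directions of the latents.
    """
    z = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    k = min(rank, s.shape[0])
    us = u[:, :k] * s[:k]  # scale the top-k left singular vectors
    return us @ us.T       # equals U_k S_k^2 U_k^T


def structural_loss(teacher_z, student_z, rank=4):
    """Mean squared difference between the two truncated similarity matrices."""
    g_t = low_rank_gram(teacher_z, rank)
    g_s = low_rank_gram(student_z, rank)
    return float(np.mean((g_t - g_s) ** 2))


def alignment_loss(teacher_logits, student_logits, teacher_z, student_z,
                   rank=4, weight=1.0):
    """Hypothetical combined objective: action KL plus weighted structural term."""
    kl = action_kl(teacher_logits, student_logits)
    return kl + weight * structural_loss(teacher_z, student_z, rank)
```

Because the structural term compares batch-by-batch similarity matrices rather than raw latents, the two latent spaces in this sketch may have different dimensionality, and the term is unchanged by an orthogonal rotation of either space. That property is one plausible reason to prefer a relational loss when aligning a vision backbone to a separately trained policy.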
In practice
- Reduce expert demonstration costs.
- Generate diverse training scenarios.
- Integrate simulator-trained policies.
Topics
- Autonomous Driving
- End-to-End Driving
- Reinforcement Learning
- Self-Play
- Simulator Training
- Latent Space Alignment
- Gaussian Splatting
Best for: Research Scientist, Computer Vision Engineer, AI Scientist, Machine Learning Engineer, Robotics Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.