PLAN-S: Bridging Planning with Latent Style Dynamics for Autonomous Driving World Models
Summary
PLAN-S is a planner-facing bridge for latent world models (LWMs) in autonomous driving. It decodes a style-conditioned, four-channel semantic cost map from the latent representation, covering dynamic obstacles, off-road regions, static obstacles, and drivability. The map is conditioned on ego state and driving style through a dual AdaFiLM mechanism. PLAN-S plugs into regression planners through attention-level fusion and into anchor-score planners through reward-level fusion, keeping the host backbones frozen in both cases. Evaluated with ResWorld on nuScenes and with WoTE on NAVSIM, it lowers L2 error to 0.55 m and cuts the 3 s collision rate by 42% on nuScenes. On NAVSIM, the rule-cost variant reaches a Predictive Driver Model Score (PDMS) of 89.4, and the learned-cost variant adds complementary gains on challenging scenes. The module adds only 0.25 million parameters and runs at 17.0 frames per second.
Key takeaway
For Machine Learning Engineers building autonomous driving systems, PLAN-S shows how to add explicit, style-conditioned spatial costs to a latent world model without retraining the host backbone. Consider implementing a similar planner-facing bridge when you need safer and more interpretable trajectories, especially to reduce collisions. The decoded cost map makes risk inspectable and lets the planner reflect different driving styles, improving results on challenging scenes at a small cost: 0.25 million extra parameters at 17.0 frames per second.
Key insights
PLAN-S enhances autonomous driving LWMs by explicitly modeling style-conditioned spatial costs for improved controllability and safety.
Principles
- Explicitly model risk, drivability, and style preferences.
- Organize latent representations as spatial costs.
- Ensure portability across planner families.
Method
PLAN-S decodes a four-channel semantic cost map from BEV latent features, conditioned on ego state and driving style via dual AdaFiLM. The cost map reaches regression planners through attention-level fusion and anchor-score planners through reward-level fusion.
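To make the conditioning step concrete, here is a minimal sketch in plain Python. The summary does not give the exact AdaFiLM formulation, so this assumes that "dual" means two sequential FiLM steps (one driven by ego state, one by a style code), each predicting a per-channel scale and shift, followed by a per-cell linear head with a sigmoid. All function names, parameter shapes, and the toy weights are hypothetical and chosen only for illustration.

```python
import math

# Channel names follow the four semantic costs listed in the summary.
CHANNELS = ("dynamic_obstacle", "off_road", "static_obstacle", "drivability")


def linear(vec, weights, bias):
    """Dense layer on plain lists; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def film(features, cond, w_gamma, b_gamma, w_beta, b_beta):
    """FiLM step: scale and shift each channel of a [C][H][W] feature map.

    Assumption: the scale is parameterised as (1 + gamma), so all-zero
    weights leave the features unchanged.
    """
    gamma = linear(cond, w_gamma, b_gamma)
    beta = linear(cond, w_beta, b_beta)
    return [[[(1.0 + g) * x + b for x in row] for row in channel]
            for channel, g, b in zip(features, gamma, beta)]


def decode_cost_map(bev, ego, style, params):
    """Modulate BEV features by ego state, then by style, then project each
    cell to four sigmoid-bounded cost channels (a 1x1-convolution-style head)."""
    h = film(bev, ego, *params["ego_film"])
    h = film(h, style, *params["style_film"])
    height, width = len(h[0]), len(h[0][0])
    cost = {}
    for name, w_row, b in zip(CHANNELS, params["head_w"], params["head_b"]):
        cost[name] = [
            [1.0 / (1.0 + math.exp(-(sum(w * h[c][i][j]
                                         for c, w in enumerate(w_row)) + b)))
             for j in range(width)]
            for i in range(height)
        ]
    return cost


# Toy example: 2 latent channels on a 1x2 grid, 2-d ego state, 1-d style code.
bev = [[[1.0, 2.0]], [[0.0, -1.0]]]
zeros2x2, zeros2x1, zeros2 = [[0.0, 0.0], [0.0, 0.0]], [[0.0], [0.0]], [0.0, 0.0]
params = {
    "ego_film": (zeros2x2, zeros2, zeros2x2, zeros2),          # identity modulation
    "style_film": ([[1.0], [0.0]], zeros2, zeros2x1, zeros2),  # style scales channel 0
    "head_w": [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [-1.0, 0.0]],
    "head_b": [0.0, 0.0, 0.0, 0.0],
}
neutral = decode_cost_map(bev, ego=[5.0, 0.0], style=[0.0], params=params)
cautious = decode_cost_map(bev, ego=[5.0, 0.0], style=[1.0], params=params)
```

In this toy setup the same BEV features yield a higher dynamic-obstacle cost under the hypothetical "cautious" style code than under the neutral one, which is the kind of style-dependent behaviour the conditioning is meant to expose. A real implementation would use a tensor library and learned weights.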
In practice
- Use a 4-channel cost map for explicit risk/drivability.
- Apply dual AdaFiLM for style-conditioned modulation.
- Integrate cost maps before final trajectory selection.
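The last point, applied to an anchor-score planner, could look like the sketch below. It assumes reward-level fusion means subtracting a weighted cost, sampled from the map along each candidate trajectory, from the frozen planner's score before taking the argmax. The grid convention, the channel weights, the fusion weight `lam`, and all function names are assumptions for illustration; the summary does not specify them.

```python
def world_to_cell(x, y, resolution, origin):
    """Map an ego-frame point (x forward, y left, metres) to a (row, col) cell.

    Assumption: the ego sits at `origin` and rows decrease as x grows.
    """
    row = origin[0] - round(x / resolution)
    col = origin[1] - round(y / resolution)
    return row, col


def trajectory_cost(traj, cost_map, weights, resolution, origin, oob_cost=1.0):
    """Average weighted cost over waypoints; out-of-grid points get `oob_cost`."""
    first = next(iter(cost_map.values()))
    height, width = len(first), len(first[0])
    total = 0.0
    for x, y in traj:
        row, col = world_to_cell(x, y, resolution, origin)
        if 0 <= row < height and 0 <= col < width:
            total += sum(w * cost_map[name][row][col]
                         for name, w in weights.items())
        else:
            total += oob_cost
    return total / len(traj)


def fuse_and_select(anchors, scores, cost_map, weights, lam,
                    resolution=1.0, origin=(0, 0)):
    """Subtract the scaled map cost from each planner score and pick the best."""
    fused = [s - lam * trajectory_cost(a, cost_map, weights, resolution, origin)
             for a, s in zip(anchors, scores)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


# Toy 3x3 grid with the ego at the bottom-centre cell and an obstacle ahead.
cost_map = {"dynamic_obstacle": [[0.0, 0.9, 0.0],
                                 [0.0, 0.1, 0.0],
                                 [0.0, 0.0, 0.0]]}
weights = {"dynamic_obstacle": 1.0}
anchors = [[(1.0, 0.0), (2.0, 0.0)],   # straight ahead, through the obstacle
           [(1.0, 1.0), (2.0, 1.0)]]   # offset one cell to the left
scores = [1.0, 0.8]                    # frozen planner prefers going straight

best_plain, _ = fuse_and_select(anchors, scores, cost_map, weights,
                                lam=0.0, origin=(2, 1))
best_fused, fused = fuse_and_select(anchors, scores, cost_map, weights,
                                    lam=1.0, origin=(2, 1))
```

With `lam=0.0` the planner's original preference is kept. With `lam=1.0` the obstacle cost along the straight anchor outweighs its score advantage, so the offset anchor is selected. Because only the scores change, the host planner itself is left untouched.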
Topics
- Autonomous Driving
- Latent World Models
- Semantic Cost Maps
- Driving Style Personalization
- Trajectory Planning
- Deep Learning Architectures
Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.