PLAN-S: Bridging Planning with Latent Style Dynamics for Autonomous Driving World Models

· Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

PLAN-S is a planner-facing bridge for latent world models (LWMs) in autonomous driving. It decodes a style-conditioned, four-channel semantic cost map directly from the latent representation, covering dynamic obstacles, off-road regions, static obstacles, and drivability. The map is conditioned on ego state and driving style through a dual AdaFiLM mechanism. PLAN-S plugs into regression planners through attention-level fusion and into anchor-score planners through reward-level fusion, and in both cases the host backbone stays frozen. Validated on ResWorld with nuScenes and on WoTE with NAVSIM, it reduced L2 error to 0.55 m and cut the 3 s collision rate by 42% on nuScenes. On NAVSIM, the rule-cost variant reached a Predictive Driver Model Score (PDMS) of 89.4, and the learned-cost variant provided complementary gains on challenging scenes. The system adds only 0.25 million parameters and runs at 17.0 frames per second.

Key takeaway

For machine learning engineers building autonomous driving systems, PLAN-S shows how to add explicit, style-conditioned spatial costs to a latent world model while keeping the host backbone frozen. A similar planner-facing bridge is worth considering when trajectory safety and interpretability matter, particularly for collision reduction: the decoded cost map makes the risk model inspectable and lets the planner express different driving-style preferences. The reported gains on challenging scenes come with little inference overhead, at 0.25 million added parameters and 17.0 frames per second.

Key insights

PLAN-S enhances autonomous driving LWMs by explicitly modeling style-conditioned spatial costs for improved controllability and safety.

Principles

Method

PLAN-S decodes a four-channel semantic cost map from bird's-eye-view (BEV) latent features, conditioned on ego state and driving style via dual AdaFiLM. It integrates with regression planners through attention-level fusion and with anchor-score planners through reward-level fusion.
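
The two ideas in this method description, conditioned decoding and reward-level fusion, can be illustrated with a minimal, dependency-free sketch. It is not the paper's implementation. The summary does not describe AdaFiLM's internals, so plain FiLM (a per-channel scale and shift predicted from a conditioning vector) stands in for it, and "dual" is read here as two sequential passes, one for ego state and one for style. All function names, dimensions, weights, and the sign convention of the drivability channel are hypothetical.

```python
import math

# Channel order follows the summary; how each channel is signed is an assumption.
CHANNELS = ("dynamic", "off_road", "static", "drivability")


def linear(weights, vec):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def film(features, cond, w_gamma, w_beta):
    """Plain FiLM: per-channel scale and shift predicted from a conditioning vector."""
    gamma = [1.0 + g for g in linear(w_gamma, cond)]  # scale centered on 1 (assumption)
    beta = linear(w_beta, cond)
    return [g * f + b for g, f, b in zip(gamma, features, beta)]


def decode_cell(features, ego, style, params):
    """Map one BEV cell's latent feature to four costs in (0, 1)."""
    h = film(features, ego, params["ego_gamma"], params["ego_beta"])  # ego-state pass
    h = film(h, style, params["style_gamma"], params["style_beta"])   # style pass
    logits = linear(params["head"], h)
    return {name: 1.0 / (1.0 + math.exp(-z)) for name, z in zip(CHANNELS, logits)}


def fused_score(base_score, cells, cost_map, weights):
    """Reward-level fusion: subtract the weighted cost accumulated along a trajectory."""
    penalty = sum(weights[name] * cost_map[cell][name]
                  for cell in cells for name in CHANNELS)
    return base_score - penalty


# Toy usage: 2-D latent feature, 2-D ego state, 1-D style code, hand-picked weights.
params = {
    "ego_gamma": [[0.0, 0.0], [0.0, 0.0]],
    "ego_beta": [[0.0, 0.0], [0.0, 0.0]],
    "style_gamma": [[0.5], [0.0]],
    "style_beta": [[0.0], [1.0]],
    "head": [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
}
cost_map = {
    (0, 0): decode_cell([1.0, 2.0], ego=[0.3, 0.1], style=[1.0], params=params),
    (0, 1): decode_cell([0.0, 0.0], ego=[0.3, 0.1], style=[1.0], params=params),
}
# Drivability is weighted 0 here because the summary does not say how it is signed.
weights = {"dynamic": 1.0, "off_road": 1.0, "static": 1.0, "drivability": 0.0}
anchor_cells = [(0, 0), (0, 1)]
print(fused_score(1.0, anchor_cells, cost_map, weights))
```

In this sketch the host planner's own anchor score (`base_score`) is left untouched and the cost map only adjusts it, which mirrors the frozen-backbone setup described above. Changing `weights` or the style vector is how different driving-style preferences would show up in the ranking of candidate trajectories.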

Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.