Waymo leverages Genie 3 to create a world model for self-driving cars

Source: AI - Ars Technica · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Intermediate, quick

Summary

Waymo and DeepMind have developed the Waymo World Model, an AI system designed to generate both 2D video and 3D lidar outputs from driving scenes. This model, which is not a direct port of Genie 3, uses a specialized post-training process to integrate critical depth information from lidar with visual details from cameras. It enables "driving action control" by allowing Waymo to alter vehicle routes in simulations, providing more realistic and consistent outcomes than previous methods. The model can also generate multimodal sensor data for existing dashcam videos, showing how a self-driving AI would perceive a situation. Furthermore, it can "mutate" real video conditions, such as changing weather, time of day, or adding objects like new signage or unexpected obstacles, to help Waymo vehicles adapt to diverse driving environments, including new markets like Boston and Washington, D.C.

Key takeaway

For AI scientists developing self-driving systems, the Waymo World Model demonstrates one approach to synthetic data generation: multimodal simulation that pairs camera imagery with lidar depth data. Your teams should explore such techniques to make training more realistic and to speed adaptation to diverse, challenging environments. Doing so may reduce reliance on real-world data collection for rare edge cases.

Key insights

Waymo's World Model generates multimodal driving simulations for enhanced self-driving AI training and environmental adaptation.

Principles

Method

The Waymo World Model uses a specialized post-training process to generate synchronized 2D video and 3D lidar outputs, enabling driving action control and mutation of real-world video conditions for AI training.
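The idea of "mutating" a recorded scene can be illustrated with a small sketch. This is not Waymo's or DeepMind's API, which the article does not describe. The `SceneConditions` class, its fields, and the `mutate` function are hypothetical names chosen for illustration. The sketch only enumerates which condition variants to request; producing the matching video and lidar output would be the job of a world model and is out of scope here.

```python
from dataclasses import dataclass, replace
from itertools import product
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class SceneConditions:
    """Hypothetical description of the conditions of one recorded drive."""
    weather: str
    time_of_day: str
    added_objects: Tuple[str, ...] = ()


def mutate(
    base: SceneConditions,
    weathers: Iterable[str],
    times: Iterable[str],
    object_sets: Iterable[Tuple[str, ...]],
) -> Iterator[SceneConditions]:
    """Yield every combination of conditions except the unchanged original."""
    for weather, time_of_day, objects in product(weathers, times, object_sets):
        variant = replace(
            base,
            weather=weather,
            time_of_day=time_of_day,
            added_objects=tuple(objects),
        )
        if variant != base:  # skip the scene as it was actually recorded
            yield variant


# Usage: one real recording fans out into several requested variants.
recorded = SceneConditions(weather="clear", time_of_day="day")
variants = list(
    mutate(
        recorded,
        weathers=["clear", "rain"],
        times=["day", "night"],
        object_sets=[(), ("road_sign",)],
    )
)
print(len(variants))  # 2 * 2 * 2 combinations minus the original = 7
```

A frozen dataclass keeps each variant immutable and comparable, so filtering out the original scene is a simple equality check.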

Best for: Computer Vision Engineer, AI Scientist, Research Scientist, AI Engineer, Machine Learning Engineer, Robotics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AI - Ars Technica.