BATON: A Multimodal Benchmark for Bidirectional Automation Transition Observation in Naturalistic Driving
Summary
BATON is a large-scale, naturalistic multimodal dataset designed to improve the prediction of human-to-automation and automation-to-human control transitions in driving. The dataset captures 136.6 hours of real-world driving from 127 drivers, synchronizing front-view video, in-cabin video, CAN bus signals, radar-based lead-vehicle interaction, and GPS-derived route context. The authors define three benchmark tasks (driving action understanding, handover prediction, and takeover prediction) and evaluate a range of baselines, including sequence models, classical classifiers, and zero-shot VLMs. The results indicate that visual input alone is insufficient for reliable prediction, and that adding CAN and route-context signals significantly improves performance. The study also finds that takeover events develop more gradually and benefit from longer prediction horizons, while handover events rely on immediate contextual cues. This asymmetry has implications for HMI design.
Key takeaway
Research scientists developing advanced driver-assistance systems (ADAS) or autonomous vehicle HMIs should prioritize multimodal data integration, especially CAN bus and route context, over video-only approaches. HMI designs should account for the asymmetry between takeover and handover events, using longer prediction horizons for takeovers and immediate contextual cues for handovers to improve safety and user experience.
Key insights
Multimodal data, beyond just video, is crucial for accurately predicting driver-automation transitions in naturalistic driving.
Principles
- Visual input alone is insufficient for reliable transition prediction.
- Takeover events benefit from longer prediction horizons.
- Handover events depend on immediate contextual cues.
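To make the idea of a prediction horizon concrete, here is a minimal sketch of how observation windows might be labeled for a transition-prediction task. The summary does not describe BATON's actual labeling scheme, so the function name `label_windows`, the timestamps, and the rule "positive if a transition occurs within the horizon after the window ends" are all illustrative assumptions.

```python
def label_windows(window_ends, event_times, horizon):
    """Label each observation window by whether a transition follows it.

    A window ending at time `end` is labeled 1 if any event time falls in
    the interval (end, end + horizon], and 0 otherwise.
    This labeling rule is an assumption, not BATON's documented scheme.
    """
    return [
        int(any(end < e <= end + horizon for e in event_times))
        for end in window_ends
    ]


# Hypothetical data: windows ending at each second, one takeover at t = 4 s.
window_ends = [0, 1, 2, 3, 4, 5]
takeover_times = [4]

short_horizon = label_windows(window_ends, takeover_times, horizon=1)
long_horizon = label_windows(window_ends, takeover_times, horizon=3)

print(short_horizon)  # [0, 0, 0, 1, 0, 0]
print(long_horizon)   # [0, 1, 1, 1, 0, 0]
```

With a longer horizon, more of the windows leading up to the event count as positives. Under this framing, a model can only exploit those earlier windows if the event develops gradually enough to leave early evidence, which is consistent with the reported difference between takeovers and handovers.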
Method
BATON collects synchronized multimodal data (front-view and in-cabin video, CAN, radar, GPS) from naturalistic driving and uses it to benchmark three tasks: driving action understanding, handover prediction, and takeover prediction.
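Synchronizing streams sampled at different rates is commonly done by matching each reference sample to the nearest sample in another stream. The sketch below shows that generic technique with the standard-library `bisect` module. It is not BATON's documented pipeline: the function name `align_nearest`, the tolerance, and the timestamps are hypothetical.

```python
from bisect import bisect_left


def align_nearest(ref_ts, stream_ts, tolerance):
    """Match each reference timestamp to the nearest sample in another stream.

    `stream_ts` must be sorted ascending. Returns, for each reference
    timestamp, the index of the closest stream sample, or None when the
    closest sample is farther away than `tolerance`.
    """
    matches = []
    for t in ref_ts:
        i = bisect_left(stream_ts, t)
        # The nearest sample is either just before or just after position i.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(stream_ts)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda j: abs(stream_ts[j] - t))
        matches.append(best if abs(stream_ts[best] - t) <= tolerance else None)
    return matches


# Hypothetical timestamps in milliseconds: video frames every 100 ms,
# CAN messages arriving at irregular times.
video_ts = [0, 100, 200, 300]
can_ts = [10, 120, 350]

aligned = align_nearest(video_ts, can_ts, tolerance=40)
print(aligned)  # [0, 1, None, None]
```

Returning `None` for unmatched frames keeps gaps explicit, so downstream code can decide whether to drop, interpolate, or mask those samples.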
In practice
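- Integrate CAN bus and route-context signals alongside video when building transition predictors for HMI.
- Design takeover HMI around longer prediction horizons, since takeover events develop gradually.
- Prioritize immediate contextual cues when designing handover HMI.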
Topics
- Driving Automation Transition
- Multimodal Datasets
- Naturalistic Driving
- Handover Prediction
- Takeover Prediction
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.