AutoMine Solution for AV2 2026 Scenario Mining Challenge
Summary
AutoMine is a self-refining scenario mining method for autonomous driving that uses Large Language Models (LLMs) and Vision-Language Models (VLMs). It identifies high-value, safety-critical, and planning-relevant scenarios in extensive driving logs, a task that is crucial for data-driven evaluation. The system uses semantics-preserving prompt augmentation to mitigate LLM prompt sensitivity, and it combines robust trajectory atomic functions with VLM-based functions to handle perception noise and open-world visual cues. AutoMine also refines its generated code using execution feedback from real driving logs. In the Argoverse 2 Scenario Mining Competition at CVPR 2026, AutoMine achieved a HOTA-Temporal score of 36.38 and a Timestamp BA score of 77.21.
Key takeaway
For autonomous driving engineers evaluating system safety and planning, AutoMine offers a practical approach to scenario mining. Consider combining LLM prompt augmentation with VLM-based functions to improve scenario extraction from large driving logs, and add an execution feedback loop that refines the generated mining code against real logs, so that the scenarios it identifies become more relevant and safety-critical.
Key insights
AutoMine uses LLMs and VLMs with self-refinement to extract critical autonomous driving scenarios from logs.
Principles
- Reduce LLM prompt sensitivity.
- Combine robust trajectory atomic functions with VLM-based functions.
- Refine code via execution feedback.
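The first principle can be illustrated with a minimal sketch. The summary does not say how AutoMine builds its augmented prompts or how it combines the resulting outputs, so everything here is an assumption: the fixed rewrite templates, the majority vote over generated programs, and the names `augment_prompt`, `consensus_program`, and `generate` (a stand-in for any LLM call) are all hypothetical.

```python
from collections import Counter
from typing import Callable, List


def augment_prompt(description: str) -> List[str]:
    """Return rewordings of a scenario description that keep its meaning.

    Assumption: fixed templates stand in for whatever augmentation
    AutoMine actually uses; the summary does not specify it.
    """
    templates = [
        "{d}",
        "Find every moment where the following holds: {d}",
        "Scenario to mine: {d}",
    ]
    return [t.format(d=description) for t in templates]


def consensus_program(description: str, generate: Callable[[str], str]) -> str:
    """Query the model once per prompt variant and keep the most common answer.

    `generate` is a hypothetical callable wrapping an LLM. Majority voting
    is an illustrative choice, not a documented part of AutoMine.
    """
    candidates = [generate(p).strip() for p in augment_prompt(description)]
    program, _count = Counter(candidates).most_common(1)[0]
    return program
```

If one phrasing pushes the model toward a different program, the other variants outvote it, which is one simple way to reduce sensitivity to wording.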
Method
AutoMine employs semantics-preserving prompt augmentation for LLMs, integrates robust trajectory atomic functions with VLM-based functions, and refines generated code using execution feedback from real driving logs.
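The execution-feedback step might look roughly like the loop below. This is a sketch under stated assumptions, not AutoMine's implementation: `generate` is a hypothetical callable wrapping an LLM, the generated code is assumed to define a function named `mine(log)`, and the only feedback used is the exception text from a failed run (the real system may use richer signals).

```python
import traceback
from typing import Any, Callable, Optional, Tuple


def refine_with_feedback(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    log: Any,
    max_rounds: int = 3,
) -> Tuple[str, Any]:
    """Regenerate mining code until it runs on a driving log.

    Hypothetical contract: `generate(task, feedback)` returns Python source
    that defines `mine(log)`. `feedback` is None on the first round and the
    previous error text afterwards.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        source = generate(task, feedback)
        namespace: dict = {}
        try:
            # Generated code is untrusted; a real system should sandbox this.
            exec(source, namespace)
            result = namespace["mine"](log)
            return source, result
        except Exception:
            # Keep a short error trace to send back to the generator.
            feedback = traceback.format_exc(limit=1)
    raise RuntimeError(f"no runnable program after {max_rounds} rounds")
```

Running candidates against real logs catches failures such as wrong field names, which never show up when the code is only read.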
In practice
- Apply prompt augmentation for LLM robustness.
- Pair robust trajectory functions with VLM-based functions to handle perception noise and open-world visual cues.
- Use execution feedback for code refinement.
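To make "robust trajectory atomic function" concrete, here is one possible sketch of a noise-tolerant predicate. The summary does not describe AutoMine's actual functions, so the name `is_stopped`, the median smoothing, and the minimum-duration rule are all assumptions chosen for illustration.

```python
from statistics import median
from typing import List


def smooth(values: List[float], window: int = 3) -> List[float]:
    """Median-filter a series; the window is truncated at both ends."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def is_stopped(
    speeds: List[float],
    threshold: float = 0.5,
    min_frames: int = 3,
    window: int = 3,
) -> List[bool]:
    """Flag frames where a track stays below `threshold` (m/s, assumed)
    for at least `min_frames` consecutive frames after smoothing."""
    below = [s < threshold for s in smooth(speeds, window)]
    flags = [False] * len(below)
    start = None
    # The trailing False closes any run that reaches the final frame.
    for i, b in enumerate(below + [False]):
        if b and start is None:
            start = i
        elif not b and start is not None:
            if i - start >= min_frames:
                for j in range(start, i):
                    flags[j] = True
            start = None
    return flags
```

The median filter absorbs single-frame tracking spikes, and the duration rule prevents a brief dip in speed from counting as a stop.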
Topics
- Autonomous Driving
- Scenario Mining
- Large Language Models
- Vision-Language Models
- Argoverse 2
- CVPR 2026
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.