AITP: Traffic Accident Responsibility Allocation via Multimodal Large Language Models

2026-04-24 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Data Science & Analytics · Depth: Expert, extended

Summary

AITP (Artificial Intelligence Traffic Police) is a multimodal large language model (MLLM) designed for Traffic Accident Responsibility Allocation (TARA), a task that requires both causal reasoning and the integration of legal knowledge. Developed by researchers at Shanghai Jiao Tong University, AITP addresses the limitations of existing MLLMs on TARA with two components: a Multimodal Chain-of-Thought (MCoT) mechanism for step-by-step reasoning, and Retrieval-Augmented Generation (RAG) for incorporating traffic regulations. The researchers also introduce DecaTARA, a decathlon-style benchmark dataset comprising 67,941 annotated videos and 195,821 question-answer pairs across ten interrelated traffic accident reasoning tasks. In the reported experiments, AITP achieves state-of-the-art performance in responsibility allocation, traffic accident detection (TAD), and traffic accident understanding (TAU), which the authors present as a new paradigm for reasoning-driven multimodal traffic analysis.

Key takeaway

For research scientists developing safety-critical AI systems, AITP's approach to TARA shows one way to combine multi-step reasoning with legal grounding. Consider pairing progressive fine-tuning with multimodal chain-of-thought (MCoT) and retrieval-augmented generation (RAG) to improve model reliability and interpretability in similar high-stakes decision-making applications, particularly where legal compliance is critical.

Key insights

AITP uses MCoT and RAG with DecaTARA to achieve state-of-the-art traffic accident responsibility allocation.

Principles

Progressive training improves multimodal understanding.
Structured reasoning (MCoT) enhances decision stability.
Legal knowledge (RAG) grounds responsibility judgments.

Method

AITP employs a four-stage progressive fine-tuning strategy on Qwen3-VL, followed by an MCoT inference pipeline that integrates RAG to retrieve and apply legal clauses for responsibility allocation.
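The inference side of this method, retrieving legal clauses and feeding them into a stepwise reasoning prompt, can be illustrated with a minimal sketch. This is not the authors' implementation: the clause texts are hypothetical placeholders rather than real regulations, the bag-of-words retriever stands in for whatever retriever AITP actually uses, and the four reasoning steps in the prompt are assumed wording chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder clauses for illustration only; these are NOT
# real legal text and are not taken from the paper or its repository.
CLAUSES = {
    "C1": "A vehicle changing lanes must yield to vehicles already in the target lane.",
    "C2": "A vehicle must stop at a red traffic signal.",
    "C3": "A following vehicle must keep a safe distance from the vehicle ahead.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_clauses(query, clauses, k=1):
    """Return the k (clause_id, text) pairs most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        clauses.items(),
        key=lambda item: cosine(q, Counter(tokenize(item[1]))),
        reverse=True,
    )
    return ranked[:k]


def build_mcot_prompt(observation, retrieved):
    """Assemble a stepwise prompt; the step wording is an assumption."""
    clause_lines = "\n".join(f"[{cid}] {text}" for cid, text in retrieved)
    return (
        f"Observation: {observation}\n"
        f"Retrieved clauses:\n{clause_lines}\n"
        "Step 1: Describe the scene and the parties involved.\n"
        "Step 2: Identify each party's actions before the collision.\n"
        "Step 3: Match those actions against the retrieved clauses.\n"
        "Step 4: Allocate responsibility and cite the clause IDs used.\n"
    )


observation = "the white car ran a red traffic signal"
top = retrieve_clauses(observation, CLAUSES, k=1)
prompt = build_mcot_prompt(observation, top)
print(prompt)
```

In a real system the observation would come from the model's own video understanding rather than a hand-written string, and the prompt would be passed to the fine-tuned MLLM. Keeping the clause IDs in the prompt is one way to make the final judgment traceable to specific regulations.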

In practice

Use MCoT for multi-step, verifiable evidence accumulation.
Integrate RAG with external knowledge for legally grounded reasoning.
Employ progressive fine-tuning for complex, multi-task learning.
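The last recommendation, progressive fine-tuning, amounts to training in ordered stages where each stage starts from the checkpoint the previous one produced. The sketch below shows only that control flow. The stage names, the task-to-stage assignment, and the learning rates are hypothetical; the summary states that AITP uses four stages but does not say what each stage contains. `fake_train` is a stand-in for a real training call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    tasks: Tuple[str, ...]
    learning_rate: float


# Hypothetical four-stage curriculum; contents are assumptions for
# illustration and do not describe the paper's actual stages.
STAGES = [
    Stage("stage_1", ("general_video_qa",), 2e-5),
    Stage("stage_2", ("TAD",), 1e-5),
    Stage("stage_3", ("TAU",), 1e-5),
    Stage("stage_4", ("TARA",), 5e-6),
]


def run_curriculum(
    stages: List[Stage],
    train_fn: Callable[[str, Stage], str],
    checkpoint: str,
) -> List[str]:
    """Train stage by stage, feeding each stage the previous checkpoint."""
    history = []
    for stage in stages:
        checkpoint = train_fn(checkpoint, stage)
        history.append(checkpoint)
    return history


def fake_train(checkpoint: str, stage: Stage) -> str:
    """Stand-in for a real fine-tuning run; returns a new checkpoint label."""
    return f"{checkpoint}->{stage.name}"


history = run_curriculum(STAGES, fake_train, "base")
print(history[-1])
```

Returning the full checkpoint history rather than only the final one makes it possible to evaluate each intermediate stage, which helps when checking whether a later stage has degraded an earlier skill.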

Topics

Multimodal Large Language Models
Traffic Accident Responsibility Allocation
DecaTARA Dataset
Multimodal Chain-of-Thought
Retrieval-Augmented Generation

Code references

zijinzhou2005/AITP

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.