Towards Safety-Aware Mutation Testing for Autonomous Driving Systems

· Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Software Development & Engineering · Depth: Expert, extended

Summary

Safety-Aware Mutation Testing (SAMT) is a proposed paradigm for evaluating the test adequacy of Autonomous Driving Systems (ADS). It addresses the lack of a systematic criterion for deciding when to stop generating test scenarios. Traditional mutation testing injects faults into individual components; SAMT instead injects temporally bounded faults into the messages exchanged between ADS modules, simulating realistic interaction failures. The vision paper that introduces the approach derives its mutant generation rules from top-down safety engineering frameworks such as System-Theoretic Process Analysis (STPA). SAMT aims to provide a rigorous mechanism for evaluating test adequacy, to enable automated scenario generation, and to guide ADS repair. The process has four steps: mutant generation, execution in a high-fidelity simulation environment such as CARLA, equivalent mutant removal, and test suite improvement. The method moves the testing focus from individual component reliability to system-level safety, particularly for module-based ADS.

Key takeaway

If you are an MLOps engineer or AI scientist developing safety-critical Autonomous Driving Systems, consider adopting Safety-Aware Mutation Testing (SAMT) to assess test suite adequacy rigorously. The approach helps you identify system-level safety risks that arise from module interactions, not just from individual component failures. Because its mutation operators are derived from STPA, your test scenarios are steered toward genuine hazards, which guides automated scenario generation and supports overall system safety.

Key insights

SAMT shifts ADS testing from component reliability to system safety by mutating inter-module messages based on safety engineering principles.
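
A minimal sketch of what a temporally bounded message mutation could look like. This is an illustration under stated assumptions, not the paper's implementation: the Message schema, the topic names, the TemporalMutant class, and the two operators (drop, clear_payload) are all hypothetical. The operators are loosely modeled on STPA's unsafe control action categories (an action not provided, or provided incorrectly); the paper's actual operator set may differ.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class Message:
    """Simplified inter-module message (hypothetical schema)."""
    timestamp: float  # seconds since scenario start
    topic: str        # channel between two ADS modules
    payload: tuple    # stand-in for the real message body


# An operator maps a message to its faulty version, or to None to drop it.
Operator = Callable[[Message], Optional[Message]]


def drop(msg: Message) -> Optional[Message]:
    """Fault: the message is never delivered."""
    return None


def clear_payload(msg: Message) -> Optional[Message]:
    """Fault: the message arrives but carries no content."""
    return replace(msg, payload=())


@dataclass(frozen=True)
class TemporalMutant:
    """Applies one operator to one topic inside the window [start, end)."""
    topic: str
    start: float
    end: float
    operator: Operator

    def apply(self, stream: Iterable[Message]) -> Iterator[Message]:
        for msg in stream:
            in_window = self.start <= msg.timestamp < self.end
            if msg.topic == self.topic and in_window:
                mutated = self.operator(msg)
                if mutated is not None:
                    yield mutated
            else:
                # Outside the fault window, or on another topic: untouched.
                yield msg


# Usage: drop obstacle messages between t=1s and t=3s.
stream = [Message(float(t), "perception/obstacles", ("car",)) for t in range(5)]
stream.append(Message(2.0, "planning/trajectory", ("keep_lane",)))

mutant = TemporalMutant("perception/obstacles", 1.0, 3.0, drop)
delivered = list(mutant.apply(stream))
```

Bounding the fault in time keeps each mutant small and attributable: messages outside the window, and messages on every other topic, pass through unchanged, so any safety violation observed in simulation can be traced to one interaction failure.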

Method

SAMT generates mutants by injecting temporally bounded faults into inter-module messages, executes them in high-fidelity simulation, removes equivalent mutants, and guides test suite improvement.
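
The adequacy measurement at the end of this pipeline can be sketched with the conventional mutation score: killed mutants divided by non-equivalent mutants. The code below is a hedged illustration, not the paper's tooling. MutantResult, the mutant and scenario identifiers, and the kill criterion (a mutant counts as killed when at least one test scenario exposes a safety violation under it) are assumptions made for this example, and the simulation step is represented only by its recorded outcome.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class MutantResult:
    """Outcome of running the whole test suite against one mutant."""
    mutant_id: str
    violating_scenarios: frozenset  # scenarios that exposed a safety violation
    equivalent: bool = False        # flagged during equivalent mutant removal

    @property
    def killed(self) -> bool:
        # Assumed kill criterion: at least one scenario detects the fault.
        return bool(self.violating_scenarios)


def mutation_score(results: Sequence[MutantResult]) -> Optional[float]:
    """Killed mutants divided by non-equivalent mutants."""
    candidates = [r for r in results if not r.equivalent]
    if not candidates:
        return None  # nothing left to measure against
    return sum(r.killed for r in candidates) / len(candidates)


def surviving_mutants(results: Sequence[MutantResult]) -> List[str]:
    """Non-equivalent mutants that no scenario detected."""
    return [r.mutant_id for r in results if not r.equivalent and not r.killed]


# Usage with three hypothetical mutants.
results = [
    MutantResult("drop_obstacles_1s_3s", frozenset({"cut_in"})),
    MutantResult("clear_trajectory_4s_5s", frozenset()),
    MutantResult("drop_obstacles_9s_10s", frozenset(), equivalent=True),
]
score = mutation_score(results)
survivors = surviving_mutants(results)
```

Surviving mutants are the useful output for test suite improvement: each one names an interaction failure that no current scenario exposes, which gives scenario generation a concrete target and gives the score a natural role as a stopping criterion.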

Best for: Research Scientist, Computer Vision Engineer, AI Scientist, Robotics Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.