Cluster-Aware Dual-Level Test Specification Generation for Large-Scale Automotive Software Requirements

Source: cs.SE updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

A "Cluster-then-Summarize" pipeline automates test specification generation for large-scale automotive software requirements, targeting the manual effort and scalability limits of meeting Automotive SPICE SWE.6 and ISO 26262. The pipeline has three stages. First, it embeds requirements with the all-MiniLM-L6-v2 sentence transformer and groups them using UMAP dimensionality reduction and HDBSCAN clustering, with "min_cluster_size" selected adaptively from Silhouette and Calinski–Harabasz scores. Second, a multi-level map-reduce summarization algorithm (batch size 10, merge factor 3) distills each cluster into a concise, domain-conformant description that preserves quantitative thresholds and safety integrity levels. Third, it generates dual-level test specifications (verification tests for individual requirements and integration tests at the cluster level) using cluster topology, nearby-cluster context, and RAG grounded in industry standards. Evaluated on seven automotive datasets, the pipeline reports improved integration test coverage, higher summarization fidelity (ROUGE-L 0.3793, BERTScore 0.8908), and higher test quality, with 89.59% overall faithfulness.
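The multi-level map-reduce summarization step can be illustrated with a minimal sketch. It assumes one plausible reading of the two parameters: the map step summarizes a cluster's requirements in batches of 10, and the reduce step repeatedly merges up to 3 partial summaries until one remains. The function names and the `summarize` callable (a stand-in for an LLM call) are hypothetical and are not taken from the original article.

```python
from typing import Callable, List, Sequence


def chunk(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def map_reduce_summarize(
    requirements: Sequence[str],
    summarize: Callable[[List[str]], str],  # hypothetical stand-in for an LLM call
    batch_size: int = 10,    # batch size reported in the summary
    merge_factor: int = 3,   # merge factor reported in the summary
) -> str:
    """Condense one cluster's requirements into a single summary (assumed scheme)."""
    if batch_size < 1 or merge_factor < 2:
        raise ValueError("batch_size must be >= 1 and merge_factor >= 2")
    if not requirements:
        return ""

    # Map: one partial summary per batch of requirements.
    level = [summarize(batch) for batch in chunk(requirements, batch_size)]

    # Reduce: merge groups of partial summaries level by level.
    # A group holding a single summary is carried forward unchanged.
    while len(level) > 1:
        level = [
            group[0] if len(group) == 1 else summarize(group)
            for group in chunk(level, merge_factor)
        ]
    return level[0]
```

Under this reading, a cluster of 95 requirements would need 10 map calls and 5 reduce calls (10 partial summaries become 4, then 2, then 1). Keeping each call's input small is presumably what lets quantitative thresholds and safety integrity levels survive into the final description, though the article does not spell out the mechanism.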

Key takeaway

Machine Learning Engineers tasked with automating ASPICE SWE.6 compliance for large automotive software requirement sets should consider a cluster-aware dual-level test generation pipeline. Semantic clustering combined with multi-level summarization improves integration test coverage and summarization fidelity, and it addresses the scalability limits of manual processes while keeping test specifications traceable for safety-critical systems.

Key insights

Clustering requirements before LLM summarization and dual-level test generation significantly improves coverage and fidelity for large-scale automotive software.

Principles

Method

Embed requirements (all-MiniLM-L6-v2), reduce dimensions (UMAP), cluster (HDBSCAN with automatically selected "min_cluster_size"), then apply multi-level map-reduce summarization. Generate individual-requirement and cluster-level tests using RAG and nearby-cluster context.

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.