Cluster-Aware Dual-Level Test Specification Generation for Large-Scale Automotive Software Requirements

2026-06-17 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

A "Cluster-then-Summarize" pipeline automates test specification generation for large-scale automotive software requirements, targeting the manual effort and scalability problems of meeting Automotive SPICE SWE.6 and ISO 26262. The pipeline has three stages. First, it embeds requirements with the all-MiniLM-L6-v2 sentence-transformer model and groups them via UMAP dimensionality reduction and HDBSCAN clustering, selecting "min_cluster_size" adaptively from Silhouette and Calinski–Harabasz scores. Second, a multi-level map-reduce summarization algorithm (batch size 10, merge factor 3) distills each cluster into a concise, domain-conformant description that preserves quantitative thresholds and safety integrity levels. Third, it generates dual-level test specifications (individual requirement verification plus cluster-level integration tests) by leveraging cluster topology, nearby-cluster context, and RAG grounded in industry standards. Evaluation across seven automotive datasets reports improved integration test coverage, enhanced summarization fidelity (ROUGE-L 0.3793, BERTScore 0.8908), and higher test quality, with 89.59% overall faithfulness.

Key takeaway

Machine learning engineers automating ASPICE SWE.6 compliance for large automotive requirement sets should consider a cluster-aware dual-level test generation pipeline. Semantic clustering combined with multi-level summarization improves integration test coverage and summarization fidelity, and it addresses the scalability limits of manual processes while keeping test specifications traceable for safety-critical systems.

Key insights

Clustering requirements before LLM summarization and dual-level test generation significantly improves coverage and fidelity for large-scale automotive software.

Principles

Cluster topology can drive multi-level test generation.
Adaptive clustering parameters enhance scalability.
Map-reduce summarization maintains content fidelity.
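
To illustrate the first principle, the sketch below shows one way nearby-cluster context could be assembled for a cluster-level integration test prompt. The summary does not say how "nearby" is measured, so cosine distance between cluster centroids is an assumption here, and the names nearby_clusters, build_integration_prompt, and the prompt wording are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def nearby_clusters(
    target: str, centroids: Dict[str, Sequence[float]], k: int = 2
) -> List[str]:
    """Return the k cluster ids whose centroids are closest to the target's.

    Assumption: centroid cosine distance stands in for cluster topology.
    """
    others = [cid for cid in centroids if cid != target]
    others.sort(key=lambda cid: cosine_distance(centroids[target], centroids[cid]))
    return others[:k]


def build_integration_prompt(
    target: str,
    summaries: Dict[str, str],
    centroids: Dict[str, Sequence[float]],
    k: int = 2,
) -> str:
    """Assemble a hypothetical prompt: target summary plus neighbour summaries."""
    lines = [f"Target cluster {target}: {summaries[target]}", "Nearby clusters:"]
    lines += [f"- {cid}: {summaries[cid]}" for cid in nearby_clusters(target, centroids, k)]
    lines.append(
        "Write integration tests covering interactions between the target "
        "and the nearby clusters."
    )
    return "\n".join(lines)
```

In a full pipeline the resulting string would be combined with retrieved standards text before being sent to an LLM; that step is omitted here.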

Method

Embed requirements (all-MiniLM-L6-v2), reduce dimensions (UMAP), cluster (HDBSCAN with auto "min_cluster_size"), then apply multi-level map-reduce summarization. Generate individual and cluster-level tests with RAG and nearby-cluster context.

In practice

Use UMAP+HDBSCAN for scalable requirement clustering.
Implement map-reduce for LLM summarization of large text sets.
Inject cluster context for improved LLM test generation.
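
A minimal sketch of the map-reduce summarization mentioned above, using the reported batch size of 10 and merge factor of 3. It assumes the batch size counts requirements per map call and the merge factor counts intermediate summaries combined per reduce call; the summarize callback stands in for an LLM call and is hypothetical.

```python
from typing import Callable, List


def _chunks(items: List[str], size: int) -> List[List[str]]:
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_reduce_summarize(
    requirements: List[str],
    summarize: Callable[[List[str]], str],
    batch_size: int = 10,
    merge_factor: int = 3,
) -> str:
    """Summarize one cluster's requirements in multiple levels.

    `summarize(texts)` is a placeholder for an LLM call that condenses
    several texts into one.
    """
    if not requirements:
        raise ValueError("requirements must not be empty")
    if batch_size < 1 or merge_factor < 2:
        raise ValueError("batch_size must be >= 1 and merge_factor >= 2")

    # Map: summarize each batch of raw requirements independently.
    summaries = [summarize(batch) for batch in _chunks(requirements, batch_size)]

    # Reduce: merge intermediate summaries level by level until one remains.
    # A leftover group of one is carried forward without another call.
    while len(summaries) > 1:
        summaries = [
            group[0] if len(group) == 1 else summarize(group)
            for group in _chunks(summaries, merge_factor)
        ]
    return summaries[0]
```

With these defaults, a cluster of 25 requirements needs three map calls and one reduce call. Keeping each call small is what lets a prompt instruct the model to carry quantitative thresholds and safety integrity levels through every level.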

Topics

Automotive SPICE SWE.6
Large Language Models
Requirements Engineering
HDBSCAN Clustering
Test Specification Generation
Retrieval-Augmented Generation

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.