Cluster-Aware Dual-Level Test Specification Generation for Large-Scale Automotive Software Requirements
Summary
A novel "Cluster-then-Summarize" pipeline addresses test specification generation for large-scale automotive software requirements, a task that traditionally consumes weeks of engineering effort and that standard LLM approaches handle poorly. The three-stage pipeline works around context-window limits and preserves inter-requirement dependencies by embedding requirements with sentence transformers, then grouping them with UMAP dimensionality reduction and HDBSCAN clustering. The minimum cluster size is selected automatically from combined normalized Silhouette and Calinski-Harabasz scores. A multi-level map-reduce summarization algorithm distills each cluster into a concise, domain-conformant description that preserves quantitative thresholds and safety integrity levels. The system generates test specifications at two levels, individual requirement verification and cluster-level integration tests, using nearby-cluster context and Retrieval-Augmented Generation grounded in the ISO 26262 and ASPICE standards. The evaluation reports improved integration test coverage and summarization fidelity, and the pipeline scales efficiently to thousands of requirements.
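The automatic minimum cluster size selection can be illustrated with a short sketch. It assumes that each candidate size has already been scored elsewhere (for example with scikit-learn's `silhouette_score` and `calinski_harabasz_score` on the HDBSCAN labels), that both scores are min-max normalized across candidates, and that they are combined with equal weights. The summary does not state the exact normalization or weighting, so those choices, the function names, and the sample numbers are assumptions for illustration only.

```python
from typing import Dict, List, Tuple


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_min_cluster_size(scores: Dict[int, Tuple[float, float]]) -> int:
    """Return the candidate size with the highest combined normalized score.

    `scores` maps each candidate min_cluster_size to a
    (silhouette, calinski_harabasz) pair computed elsewhere.
    """
    sizes = sorted(scores)
    sil = min_max([scores[s][0] for s in sizes])
    ch = min_max([scores[s][1] for s in sizes])
    # Assumption: equal weighting of the two normalized scores.
    combined = [a + b for a, b in zip(sil, ch)]
    best = max(range(len(sizes)), key=lambda i: combined[i])
    return sizes[best]


# Illustrative numbers only, not results from the article.
candidates = {5: (0.21, 310.0), 10: (0.34, 420.0), 20: (0.30, 390.0)}
best_size = select_min_cluster_size(candidates)
```

Normalizing before combining matters because the Calinski-Harabasz index is unbounded while the Silhouette score lies in [-1, 1]; without it, the larger-magnitude score would dominate the sum.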
Key takeaway
For AI Engineers and ML Engineers working on large-scale automotive software, this cluster-aware approach offers a way to automate test specification generation. Implementing a "Cluster-then-Summarize" pipeline lets you work around LLM context-window limits and improve integration test coverage. It can also streamline compliance with Automotive SPICE SWE.6 requirements and reduce the manual effort of verifying thousands of requirements.
Key insights
A "Cluster-then-Summarize" pipeline automates scalable, context-aware test specification generation for automotive software requirements.
Principles
- Inter-requirement dependencies are vital for integration test coverage.
- Clustering enables LLMs to process large requirement sets contextually.
- Dual-level test generation covers individual and integrated feature behaviors.
Method
The "Cluster-then-Summarize" pipeline embeds requirements, clusters them using UMAP/HDBSCAN with quality criteria, and applies multi-level map-reduce summarization. It then generates dual-level test specifications, leveraging nearby-cluster context and RAG.
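A multi-level map-reduce summarization of one cluster could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: `summarize` is a hypothetical callable standing in for an LLM call, `batch_size` is an assumed parameter, and the stub summarizer only concatenates text so the control flow can be followed.

```python
from typing import Callable, List


def chunk(items: List[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_reduce_summarize(
    texts: List[str],
    summarize: Callable[[List[str]], str],
    batch_size: int = 4,
) -> str:
    """Summarize batches, then summarize the summaries until one remains."""
    if not texts:
        raise ValueError("texts must not be empty")
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    # Map step: one summary per batch of requirements.
    level = [summarize(batch) for batch in chunk(list(texts), batch_size)]
    # Reduce steps: repeat on the summaries until a single one is left.
    while len(level) > 1:
        level = [summarize(batch) for batch in chunk(level, batch_size)]
    return level[0]


# Stub standing in for an LLM call (hypothetical); it records batch sizes.
calls: List[int] = []


def stub_summarize(batch: List[str]) -> str:
    calls.append(len(batch))
    return "[" + " + ".join(batch) + "]"


summary = map_reduce_summarize(
    ["R1", "R2", "R3", "R4", "R5"], stub_summarize, batch_size=2
)
```

Because every batch fits a fixed size, no single call needs the whole cluster in its context window. A real summarizer prompt would also have to instruct the model to keep quantitative thresholds and safety integrity levels verbatim, which the stub does not attempt.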
In practice
- Use UMAP and HDBSCAN for requirement grouping.
- Ground LLM outputs with RAG using ISO 26262 and ASPICE.
- Generate tests for individual requirements and cluster-level integration.
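The practices above can be combined into a prompt-assembly sketch for dual-level generation. It assumes cluster centroids are available as plain vectors, picks nearby clusters by cosine similarity (an assumed choice; the summary does not say how proximity is measured), and treats the retrieved standard passages as already fetched by a separate RAG step. All names, cluster labels, and texts are hypothetical placeholders.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearby_clusters(
    target: str, centroids: Dict[str, List[float]], k: int = 1
) -> List[str]:
    """Return the k clusters whose centroids are most similar to the target."""
    scored = [
        (cosine(centroids[target], vec), cid)
        for cid, vec in centroids.items()
        if cid != target
    ]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:k]]


def build_prompts(
    target: str,
    requirements: Dict[str, List[str]],
    summaries: Dict[str, str],
    centroids: Dict[str, List[float]],
    passages: List[str],
    k: int = 1,
) -> Tuple[List[str], str]:
    """Build one prompt per requirement plus one cluster-level prompt."""
    grounding = "\n".join(passages)
    # Level 1: individual requirement verification.
    unit_prompts = [
        f"Reference material:\n{grounding}\n\nWrite a verification test for: {req}"
        for req in requirements[target]
    ]
    # Level 2: cluster-level integration tests with nearby-cluster context.
    neighbours = nearby_clusters(target, centroids, k)
    context = "\n".join(f"- {cid}: {summaries[cid]}" for cid in neighbours)
    integration_prompt = (
        f"Reference material:\n{grounding}\n\n"
        f"Cluster summary: {summaries[target]}\n"
        f"Nearby clusters:\n{context}\n\n"
        "Write integration tests covering the interactions above."
    )
    return unit_prompts, integration_prompt


# Placeholder data for illustration only.
centroids = {"braking": [1.0, 0.0], "stability": [0.9, 0.1], "media": [0.0, 1.0]}
summaries = {
    "braking": "summary of the braking cluster",
    "stability": "summary of the stability cluster",
    "media": "summary of the media cluster",
}
requirements = {"braking": ["illustrative requirement A", "illustrative requirement B"]}
passages = ["<retrieved standard excerpt>"]

unit_prompts, integration_prompt = build_prompts(
    "braking", requirements, summaries, centroids, passages
)
```

The resulting prompts would then be sent to an LLM. Only the integration prompt carries the neighbouring cluster's summary, so cross-feature interactions are considered at the cluster level while per-requirement prompts stay short.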
Topics
- Automotive Software
- Test Specification Generation
- Large Language Models
- Requirement Clustering
- Retrieval-Augmented Generation
- Automotive SPICE
- ISO 26262
Best for: AI Scientist, Research Scientist, AI Engineer, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.