Retrieval Augmented (Knowledge Graph), and Large Language Model-Driven Design Structure Matrix (DSM) Generation of Cyber-Physical Systems

2022-02-09 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Advanced, extended

Summary

This research explores the use of Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and Graph-based RAG (GraphRAG) for automating the generation of Design Structure Matrices (DSMs). The study evaluates these methods across two distinct use cases: a power screwdriver and a CubeSat, both with established architectural references. Performance is measured on two key tasks: determining relationships between predefined components and the more complex challenge of identifying components and their subsequent relationships. The evaluation uses cell-level metrics like accuracy, precision, recall, and F1-score, alongside global graph-based metrics such as edit distance and spectral distance. The findings indicate that model architecture and careful prompt design often influence performance more significantly than model size alone, with specific RAG and GraphRAG configurations showing notable gains. All code is publicly available for reproducibility and expert feedback.
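The cell-level metrics named above can be illustrated with a short sketch. It assumes binary DSMs stored as nested lists with components in the same order in both matrices, and it skips the diagonal; the summary does not say how the study treats diagonal cells, and the function name `cell_level_metrics` is hypothetical rather than taken from the published code.

```python
from typing import Dict, List

Matrix = List[List[int]]


def cell_level_metrics(reference: Matrix, generated: Matrix) -> Dict[str, float]:
    """Compare two binary DSMs cell by cell (assumption: diagonal is ignored)."""
    n = len(reference)
    if len(generated) != n or any(len(row) != n for row in reference + generated):
        raise ValueError("both DSMs must be square and of the same size")

    tp = fp = fn = tn = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-relations are not scored in this sketch
            ref, gen = bool(reference[i][j]), bool(generated[i][j])
            if ref and gen:
                tp += 1
            elif gen:
                fp += 1
            elif ref:
                fn += 1
            else:
                tn += 1

    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative 3x3 matrices, not data from the study.
reference = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
generated = [[0, 1, 1], [1, 0, 0], [0, 1, 0]]
print(cell_level_metrics(reference, generated))
```

Because most DSMs are sparse, accuracy alone can look good for a model that predicts few relationships, which is one reason to report precision, recall, and F1 alongside it.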

Key takeaway

For AI scientists building automated system-architecture tools: model architecture and precise prompt engineering often matter more than sheer model size for DSM generation. Curate the reference documents for RAG and GraphRAG carefully, because simply adding more data does not guarantee better accuracy. Prefer models such as mixtral:8x22b for physical component interactions and llama3.3:70b for abstract system-level relationships to improve performance and reduce computational overhead.

Key insights

LLMs, especially with RAG and GraphRAG, can automate Design Structure Matrix generation for complex systems.

Principles

Model architecture can outweigh parameter count in relationship classification.
Careful prompt design is critical for LLM-based architectural generation.
Aggregating all references does not consistently improve RAG performance.
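To make the prompt-design point concrete, here is a minimal sketch of the simpler task: filling a DSM for predefined components by asking one constrained yes/no question per ordered pair. Everything in it is an assumption for illustration: the prompt wording, the function names, the component names, and the `ask` callable, which stands in for whatever LLM, RAG, or GraphRAG client is used. It is not the prompt or pipeline from the study.

```python
from itertools import permutations
from typing import Callable, List


def build_pair_prompt(system: str, source: str, target: str, relation: str) -> str:
    """Build one narrowly scoped question (hypothetical wording)."""
    return (
        f"You are analysing the architecture of a {system}.\n"
        f"Question: is there a {relation} relationship from '{source}' to '{target}'?\n"
        "Answer with exactly one word: yes or no."
    )


def parse_answer(text: str) -> int:
    """Map a model reply to a DSM cell; anything other than 'yes' counts as 0."""
    words = text.strip().lower().split()
    first = words[0].strip(".,!:;\"'") if words else ""
    return 1 if first == "yes" else 0


def generate_dsm(
    system: str,
    components: List[str],
    relation: str,
    ask: Callable[[str], str],
) -> List[List[int]]:
    """Query every ordered pair of distinct components and fill the matrix."""
    index = {name: i for i, name in enumerate(components)}
    dsm = [[0] * len(components) for _ in components]
    for source, target in permutations(components, 2):
        reply = ask(build_pair_prompt(system, source, target, relation))
        dsm[index[source]][index[target]] = parse_answer(reply)
    return dsm


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the loop."""
    return "yes" if "'motor' to 'gearbox'" in prompt else "no"


components = ["battery", "motor", "gearbox"]  # hypothetical component names
print(generate_dsm("power screwdriver", components, "physical", stub_model))
```

Restricting the reply to a single word keeps parsing trivial; a looser prompt would need a more forgiving parser and makes cell-level scoring noisier.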

Method

The method follows three steps: (1) preparing the references and the design configuration, (2) processing them with LLM, RAG, and GraphRAG variants, and (3) analyzing the results using cell-level and graph-based metrics. It also includes semi-automated document classification and prompt engineering.
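For the graph-based side of the analysis, the sketch below shows one simple way to compute an edit distance between a reference DSM and a generated one when the component lists may differ. It assumes components are matched by name and that every node or edge insertion or deletion costs 1, in which case the distance reduces to the size of two symmetric differences. The study's exact definition may differ, and a general graph edit distance without name matching is harder to compute. All names and data here are hypothetical, and spectral distance is not covered.

```python
from typing import List, Set, Tuple

Graph = Tuple[Set[str], Set[Tuple[str, str]]]


def dsm_to_graph(components: List[str], matrix: List[List[int]]) -> Graph:
    """Turn a DSM into a directed graph: (node names, set of (source, target) edges)."""
    if len(matrix) != len(components) or any(len(r) != len(components) for r in matrix):
        raise ValueError("matrix must be square and match the component list")
    edges = {
        (components[i], components[j])
        for i, row in enumerate(matrix)
        for j, cell in enumerate(row)
        if cell and i != j  # diagonal cells are ignored in this sketch
    }
    return set(components), edges


def labelled_edit_distance(reference: Graph, generated: Graph) -> int:
    """Unit-cost edit distance with nodes matched by name (an assumption).

    Counts components present in only one graph plus relationships
    present in only one graph.
    """
    ref_nodes, ref_edges = reference
    gen_nodes, gen_edges = generated
    return len(ref_nodes ^ gen_nodes) + len(ref_edges ^ gen_edges)


# Illustrative data only: the generated DSM misses 'gearbox' and invents 'switch'.
ref = dsm_to_graph(
    ["battery", "motor", "gearbox"],
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],
)
gen = dsm_to_graph(
    ["battery", "motor", "switch"],
    [[0, 1, 0], [0, 0, 0], [1, 0, 0]],
)
print(labelled_edit_distance(ref, gen))
```

Unlike the cell-level scores, this measure stays defined when the model is asked to identify the components itself, since it penalizes missing and invented components as well as wrong relationships.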

In practice

Use mixtral:8x22b for spatial reasoning tasks in DSM generation.
Employ llama3.3:70b for high-level, abstract whole-part relationships.
Tune RAG reference selection to avoid irrelevant or conflicting information.

Topics

Large Language Models
Retrieval-Augmented Generation
Knowledge Graphs
Design Structure Matrix
Cyber-Physical Systems

Code references

bankh/xLM_DSM

Best for: AI Scientist, AI Researcher, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.