BIM-Edit: Benchmarking Large Language Models for IFC-Based Building Information Modeling

· Source: Artificial Intelligence · Field: Construction & Real Estate — Construction Technology & Building, Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

BIM-Edit is a new benchmark designed to evaluate large language models (LLMs) on natural-language editing of Building Information Models (BIM) represented in the Industry Foundation Classes (IFC) format. This benchmark addresses a critical gap in existing CAD evaluations, which often prioritize new model generation over editing existing scenes while preserving semantics and relations. BIM-Edit comprises 324 editing tasks across 11 realistic building models and 36 synthetic scenes. These tasks are categorized into direct, spatial, and topological instructions, covering explicit and scene-grounded edits. Evaluation focuses on geometric accuracy, semantic validity, and topological consistency. Current LLMs show significant limitations, with the best-performing model achieving only a 49.5% average score across the three metrics and fully solving no more than 3.4% of tasks, highlighting a substantial capability gap for structured engineering design.

Key takeaway

If you are an AI scientist or machine learning engineer building LLMs for CAD or BIM applications, prioritize robust semantic and topological understanding over geometric generation alone. Even the best current model falls far short of practical engineering requirements: it reaches only a 49.5% average score across the benchmark's three metrics and fully solves no more than 3.4% of tasks. Focus research on preserving existing scene semantics and relations during edits, not just on creating new geometry.

Key insights

Current LLMs struggle with edits to existing BIM models that must preserve semantics and relations, which points to a major capability gap for structured engineering design.

Method

BIM-Edit evaluates LLMs on natural-language editing of IFC-based BIMs using 324 tasks across 11 realistic building models and 36 synthetic scenes. Tasks are direct, spatial, or topological instructions, and each result is assessed for geometric accuracy, semantic validity, and topological consistency.
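
The two headline numbers in the summary (an average score across three metrics, and the share of tasks fully solved) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `TaskScore` type, the `summarize` function, the 0-to-1 score range, the unweighted mean, and the rule that a task counts as fully solved only when all three metrics reach 1.0. The benchmark's actual scoring procedure is not described in this summary and may differ.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TaskScore:
    """Hypothetical per-task result; each metric is assumed to lie in [0, 1]."""
    geometric: float    # geometric accuracy
    semantic: float     # semantic validity
    topological: float  # topological consistency

    def average(self) -> float:
        # Assumption: the three metrics are weighted equally.
        return (self.geometric + self.semantic + self.topological) / 3

    def fully_solved(self, threshold: float = 1.0) -> bool:
        # Assumption: "fully solved" means every metric reaches the threshold.
        return min(self.geometric, self.semantic, self.topological) >= threshold


def summarize(scores: list[TaskScore]) -> dict[str, float]:
    """Aggregate per-task scores into the two headline figures."""
    if not scores:
        raise ValueError("at least one task score is required")
    return {
        "average_score": mean(s.average() for s in scores),
        "fully_solved_rate": sum(s.fully_solved() for s in scores) / len(scores),
    }


if __name__ == "__main__":
    # Made-up scores for four tasks, not benchmark data.
    demo = [
        TaskScore(1.0, 1.0, 1.0),
        TaskScore(0.5, 0.0, 0.25),
        TaskScore(1.0, 0.5, 1.0),
        TaskScore(0.0, 0.0, 0.0),
    ]
    print(summarize(demo))
```

Under these assumptions, a model can post a moderate average score while fully solving very few tasks, because partial credit on one metric raises the average but never satisfies the all-metrics condition. That is consistent with the gap between 49.5% and 3.4% reported above.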

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.