Mining Architectural Quality Under Agentic AI Adoption: A Causal Study of Java Repositories

2026-06-11 · Source: Artificial Intelligence · Field: Technology & Digital — Software Development & Engineering, Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

A causal study investigated the impact of agentic AI coding tool adoption on architectural quality in 151 open-source Java repositories. Researchers identified 74 repositories with detectable AI adoption and 77 propensity-matched controls. Analyzing 1,811 monthly Arcan snapshots over a 13-month window, the study used a staggered difference-in-differences design and the Borusyak imputation estimator. Findings indicate total architectural smell counts remained essentially unchanged (+1.1%, p = 0.82), while lines of code grew significantly (+12.8%, p = 0.003). The resulting 6.7% decline in Architectural Smell Density (ASD, p = 0.004) was attributed to this denominator effect, not actual architectural improvement. The complete replication package, including the curated monthly panel, is publicly available.

Key takeaway

Research scientists evaluating the impact of AI coding tools should not rely solely on density-normalized metrics such as Architectural Smell Density (ASD). Decompose the effect into raw smell counts and system size: a reported ASD decline may reflect nothing more than growth in lines of code, not improved architecture. Reporting both components separates genuine quality changes from size-driven artifacts.

Key insights

In this sample, adopting agentic AI coding tools did not measurably improve architectural quality; density metrics alone can suggest otherwise because they fall whenever code size grows faster than smell counts.

Principles

Density-normalized outcomes can mislead when treatment affects system size.
Raw counts and explicit decomposition are required for causal mining studies.
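
The first principle follows from a one-line identity. The sketch below assumes ASD is defined as smell count divided by code size; the symbols S (smell count), L (lines of code), and Δ (treatment effect on an outcome) are notation introduced here, not taken from the study.

```latex
\mathrm{ASD}_{it} = \frac{S_{it}}{L_{it}}
\quad\Longrightarrow\quad
\log \mathrm{ASD}_{it} = \log S_{it} - \log L_{it}
\quad\Longrightarrow\quad
\Delta \log \mathrm{ASD} = \Delta \log S - \Delta \log L
```

If treatment leaves S roughly unchanged while increasing L, density falls even though no smell has been removed. The identity is exact for log outcomes estimated on the same sample; separately estimated percentage effects, such as those quoted in the summary, need not satisfy it exactly.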

Method

A staggered difference-in-differences design with the Borusyak imputation estimator was applied to 1,811 monthly Arcan snapshots from 151 Java repositories to estimate causal effects.
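
The core idea of an imputation estimator can be illustrated in a few lines. The following is a minimal sketch, not the study's code: it fits unit and period effects on untreated observations only, imputes the untreated counterfactual for each treated observation, and averages the differences. The function names, the tuple layout, and the noise-free demo panel are all hypothetical, and the sketch omits covariates, weights, and standard errors, which a real analysis would take from an established implementation.

```python
def _group_means(obs, key, other, offsets):
    """Mean of (outcome - offset of the other dimension) within each group."""
    sums, counts = {}, {}
    for row in obs:
        g = row[key]
        resid = row[2] - offsets.get(row[other], 0.0)
        sums[g] = sums.get(g, 0.0) + resid
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}


def imputation_att(rows, adoption, n_iter=500, tol=1e-12):
    """Average effect on treated observations, estimated by imputation.

    rows:     iterable of (unit, period, outcome) tuples
    adoption: dict mapping unit -> first treated period, or None if never treated
    """
    rows = list(rows)

    def is_treated(unit, period):
        start = adoption.get(unit)
        return start is not None and period >= start

    untreated = [r for r in rows if not is_treated(r[0], r[1])]
    treated = [r for r in rows if is_treated(r[0], r[1])]
    if not untreated:
        raise ValueError("no untreated observations to fit the baseline model")

    # Step 1: fit outcome = unit effect + period effect on untreated
    # observations only, by alternating group means (backfitting).
    alpha, beta = {}, {}
    for _ in range(n_iter):
        new_alpha = _group_means(untreated, key=0, other=1, offsets=beta)
        new_beta = _group_means(untreated, key=1, other=0, offsets=new_alpha)
        shift = max(
            [abs(v - alpha.get(k, 0.0)) for k, v in new_alpha.items()]
            + [abs(v - beta.get(k, 0.0)) for k, v in new_beta.items()]
        )
        alpha, beta = new_alpha, new_beta
        if shift < tol:
            break

    # Step 2: impute the untreated counterfactual for each treated observation.
    # Step 3: average the observation-level differences.
    effects = [
        y - (alpha[u] + beta[t])
        for u, t, y in treated
        if u in alpha and t in beta  # skip cells that cannot be imputed
    ]
    if not effects:
        raise ValueError("no treated observation has an imputable counterfactual")
    return sum(effects) / len(effects)


def make_demo_panel():
    """Hypothetical noise-free panel: 4 repositories, 6 months, true effect 2.0."""
    unit_effect = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
    adoption = {"A": None, "B": None, "C": 3, "D": 4}  # staggered adoption
    rows = []
    for u, a in unit_effect.items():
        for t in range(6):
            on = adoption[u] is not None and t >= adoption[u]
            rows.append((u, t, a + 0.5 * t + (2.0 if on else 0.0)))
    return rows, adoption


if __name__ == "__main__":
    demo_rows, demo_adoption = make_demo_panel()
    print(imputation_att(demo_rows, demo_adoption))
```

Because the baseline model is fitted only on never-treated and not-yet-treated observations, already-treated repositories never serve as controls for later adopters, which is the main reason to prefer this approach in a staggered design.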

In practice

Report raw architectural smell counts, not only density, as outcomes in causal studies.
Explicitly decompose effects on system size and quality.
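
As a quick sanity check on a single repository, the two recommendations can be applied to a pair of snapshots. This is a minimal sketch assuming density is smell count divided by lines of code; the function name and the example numbers are hypothetical and do not come from the study.

```python
import math


def decompose_density_change(smells_before, loc_before, smells_after, loc_after):
    """Split the change in smell density into a count part and a size part."""
    if min(smells_before, loc_before, smells_after, loc_after) <= 0:
        raise ValueError("all inputs must be positive to take logs")

    d_log_smells = math.log(smells_after / smells_before)
    d_log_loc = math.log(loc_after / loc_before)
    # Density = smells / LOC, so its log change is the difference of the two.
    d_log_density = d_log_smells - d_log_loc

    def to_pct(d_log):
        return 100.0 * (math.exp(d_log) - 1.0)

    return {
        "smells_pct": to_pct(d_log_smells),
        "loc_pct": to_pct(d_log_loc),
        "density_pct": to_pct(d_log_density),
    }


if __name__ == "__main__":
    # Hypothetical repository: smell count flat, code base 10% larger.
    result = decompose_density_change(200, 100_000, 200, 110_000)
    for name, value in result.items():
        print(f"{name}: {value:+.2f}%")
```

In this hypothetical case density drops by about 9.1% while the number of smells is unchanged, so the whole decline comes from the denominator.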

Topics

Agentic AI
Software Architecture
Java Repositories
Causal Inference
Architectural Smells
Code Quality

Best for: AI Scientist, Research Scientist, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.