Astragalus: Automatic Configuration Repair for Production Networks

Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Software Development & Engineering, Cloud Computing & IT Infrastructure, Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

Astragalus is an automatic configuration repair (ACR) tool for network misconfigurations, a major cause of service outages. Existing "semantic-driven" approaches rely on complex SMT constraints and struggle to scale; Astragalus instead takes a "syntax-driven" approach inspired by automatic program repair, using a "localize-fix-validate" pipeline to identify and correct errors. In evaluation, it repaired 100% of incidents in synthesized networks and 97.5% in a real network with 15 types of injected errors, averaging 7.36 seconds per repair. For four recent incidents in a production network of O(1,000) to O(10,000) devices, it produced valid suggestions within 6 minutes, making it significantly faster and more scalable than prior solutions such as AED and CEL.

Key takeaway

Network operators who manage large-scale production networks and struggle with misconfiguration-induced outages should evaluate "syntax-driven" automatic configuration repair (ACR) tools such as Astragalus. This approach accelerates fault localization and repair, often resolving incidents in seconds, and scales far better than "semantic-driven" methods. Although these tools do not directly identify every complex root cause, they provide actionable suggestions that substantially reduce manual troubleshooting time, improving network stability and operational efficiency.

Key insights

Syntax-driven automatic configuration repair offers superior scalability and generality over semantic-driven methods.

Method

Astragalus employs a "localize-fix-validate" pipeline: localize suspicious configuration lines via spectrum-based fault localization (SBFL), generate candidate fixes (remove, insert, modify) from existing configurations, then validate each candidate using network verifiers.
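
The pipeline can be illustrated with a minimal Python sketch. This is not Astragalus's implementation: the Ochiai formula is only one common SBFL scoring choice (the summary does not say which formula the tool uses), and the names `ochiai_scores`, `candidate_fixes`, `repair`, the coverage map, and the toy `validate` function are all hypothetical stand-ins. In particular, `validate` substitutes a simple string check for a real network verifier.

```python
import math
from typing import Callable, Dict, Iterator, List, Optional, Set

Config = List[str]


def ochiai_scores(coverage: Dict[str, Set[int]], failed: Set[str]) -> Dict[int, float]:
    """Score each config line by Ochiai suspiciousness (assumed SBFL formula).

    coverage maps a test (intent check) name to the line indices it exercises.
    """
    total_failed = len(failed)
    lines: Set[int] = set().union(*coverage.values())
    scores: Dict[int, float] = {}
    for line in lines:
        ef = sum(1 for t, cov in coverage.items() if line in cov and t in failed)
        ep = sum(1 for t, cov in coverage.items() if line in cov and t not in failed)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom else 0.0
    return scores


def candidate_fixes(config: Config, idx: int, donors: List[str]) -> Iterator[Config]:
    """Yield remove / modify / insert candidates built from existing lines."""
    yield config[:idx] + config[idx + 1:]                    # remove
    for donor in donors:
        yield config[:idx] + [donor] + config[idx + 1:]      # modify
        yield config[:idx + 1] + [donor] + config[idx + 1:]  # insert after


def repair(config: Config, coverage: Dict[str, Set[int]], failed: Set[str],
           donors: List[str], validate: Callable[[Config], bool]) -> Optional[Config]:
    """Localize, then try fixes in suspiciousness order until one validates."""
    scores = ochiai_scores(coverage, failed)
    for idx in sorted(scores, key=lambda i: (-scores[i], i)):
        for candidate in candidate_fixes(config, idx, donors):
            if validate(candidate):
                return candidate
    return None


# Toy ACL: line 1 wrongly blocks a subnet that should be reachable.
config = ["permit 10.0.0.0/8", "deny 192.168.1.0/24", "permit any"]
coverage = {"reach_10": {0}, "reach_192": {1, 2}, "reach_172": {2}}
failed = {"reach_192"}


def validate(cfg: Config) -> bool:
    # Stand-in for a network verifier: intent is expressed as string checks.
    return "deny 192.168.1.0/24" not in cfg and "permit 10.0.0.0/8" in cfg


scores = ochiai_scores(coverage, failed)
repaired = repair(config, coverage, failed, donors=config, validate=validate)
print(scores)
print(repaired)
```

In this toy run, line 1 is covered only by the failing check and therefore ranks first, so the first candidate tried (removing that line) already passes validation. A real deployment would derive coverage from the verifier's analysis and would check every candidate against the full set of network intents.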

Best for: Research Scientist, IT Professional, Operations Professional

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.