CAX-Agent: A Lightweight Agent Harness for Reliable APDL Automation
Summary
CAX-Agent is a lightweight agent harness for reliable automation of MAPDL finite-element simulations. It targets two common problems in LLM-driven engineering: inconsistent outputs and task failures. The architecture has three layers: an LLM service, the agent harness, and a solver backend. The harness includes a recovery ladder that escalates from deterministic rule-based patching to model-driven regeneration and, finally, human intervention. An empirical evaluation compared three recovery strategies (no_recovery, rule_only, model_only) on 50 standard structural benchmarks, with each case run three times per strategy, for 450 case-runs in total (3 strategies × 50 cases × 3 runs). The model_only strategy achieved the highest completion rate (0.9267), task score (3.59/4), total score (9.16/10), and zero-intervention rate (0.84), outperforming both rule_only and no_recovery. The study used simple geometries to isolate the effect of the recovery policy. Human raters scored task completion under blind conditions, with strong inter-rater agreement (Cohen's kappa = 0.84).
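For readers unfamiliar with the inter-rater agreement figure above, here is a minimal sketch of how Cohen's kappa is computed for two raters. The function name and the label lists are made up for illustration; they are not data or code from the study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Made-up labels (1 = task completed, 0 = not completed).
a = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
b = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
kappa = cohens_kappa(a, b)
```

In this toy example the raters agree on 9 of 10 items while chance agreement is 0.5, so kappa works out to 0.8. Kappa corrects raw agreement for the agreement two raters would reach by chance, which is why it is reported instead of a plain percentage.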
Key takeaway
For Machine Learning Engineers developing LLM-driven simulation tools, the results favor a dedicated agent harness with model-driven recovery. In the reported evaluation, this setup reached higher completion rates and needed less manual intervention than rule-based or no-recovery approaches. Consider adopting a layered recovery strategy that starts with deterministic rules and escalates to LLM-driven regeneration to improve reliability in complex engineering workflows.
Key insights
An agent harness with model-driven recovery significantly improves reliability and autonomy in LLM-driven engineering simulations.
Principles
- Orchestrators, not LLMs, should manage retry budgets and stop conditions.
- Layered recovery strategies enhance system robustness.
- Domain-native harnesses outperform generic frameworks for specific tasks.
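The first principle above can be sketched as a loop in which the orchestrator alone owns the attempt counter. This is a hedged illustration, not CAX-Agent's code: `RetryBudget`, `run_with_budget`, and the stub `generate` and `solve` callables are hypothetical names standing in for an LLM call and a solver run.

```python
from dataclasses import dataclass

@dataclass
class RetryBudget:
    """Attempt counter owned by the orchestrator, never by the model."""
    max_attempts: int
    used: int = 0

    def spend(self) -> bool:
        """Consume one attempt; return False once the budget is exhausted."""
        if self.used >= self.max_attempts:
            return False
        self.used += 1
        return True

def run_with_budget(generate, solve, budget):
    """Loop until the solver succeeds or the budget runs out.

    `generate(feedback)` stands in for an LLM call that returns a script;
    `solve(script)` stands in for a solver run that returns (ok, log).
    """
    feedback = None
    while budget.spend():
        script = generate(feedback)
        ok, log = solve(script)
        if ok:
            return {"status": "completed", "attempts": budget.used}
        feedback = log  # the model sees the log but cannot extend the budget
    return {"status": "stopped", "attempts": budget.used}

# Stub callables for illustration: the third generated script "solves".
scripts = iter(["bad-1", "bad-2", "good"])
result = run_with_budget(
    generate=lambda feedback: next(scripts),
    solve=lambda script: (script == "good", f"log for {script}"),
    budget=RetryBudget(max_attempts=3),
)
```

Because the stop condition lives in `RetryBudget.spend`, nothing the model writes can prolong the loop; the model only ever receives the solver log as feedback.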
Method
CAX-Agent employs a three-layer architecture (LLM service, agent harness, solver backend) with a recovery ladder escalating from deterministic rule patching to LLM-driven regeneration, context enrichment, and human intervention.
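A recovery ladder of this kind might be structured as an ordered list of rungs, each tried only if the previous one failed. The sketch below is an assumption-laden illustration: `rule_patch`, `model_regenerate`, `climb`, `toy_solver`, and the log message are all hypothetical, and the context-enrichment rung is omitted for brevity (it would slot in as another entry in the list).

```python
from typing import Callable, List, Optional, Tuple

# A rung receives the failing script and the solver log, and returns a
# repaired script, or None when it has nothing to offer.
Rung = Callable[[str, str], Optional[str]]

def rule_patch(script: str, log: str) -> Optional[str]:
    """Deterministic rung: patch one known failure pattern (hypothetical)."""
    if "no SOLVE command" in log:
        return script + "\n/SOLU\nSOLVE\nFINISH"
    return None

def model_regenerate(script: str, log: str) -> Optional[str]:
    """Model-driven rung: stand-in for an LLM call that rewrites the script."""
    return None  # stub: a real harness would call its LLM service here

def climb(script: str,
          solve: Callable[[str], Tuple[bool, str]],
          rungs: List[Tuple[str, Rung]]) -> Tuple[str, str]:
    """Try rungs in order; return (status, name of the rung that resolved it)."""
    ok, log = solve(script)
    if ok:
        return "completed", "none"
    for name, rung in rungs:
        candidate = rung(script, log)
        if candidate is None:
            continue  # this rung declined; escalate to the next one
        ok, new_log = solve(candidate)
        if ok:
            return "completed", name
        script, log = candidate, new_log  # carry the latest attempt upward
    return "needs_human", "human"  # top rung: hand off to a person

def toy_solver(script: str) -> Tuple[bool, str]:
    """Fake solver backend with a made-up log message."""
    if "SOLVE" in script.splitlines():
        return True, "ok"
    return False, "no SOLVE command"

status, resolved_by = climb(
    "/PREP7\nFINISH",
    toy_solver,
    [("rule_patch", rule_patch), ("model_regenerate", model_regenerate)],
)
```

Recording which rung resolved each failure is what makes it possible to compare policies such as rule_only and model_only afterwards: each policy is simply a different list of rungs passed to the same loop.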
In practice
- Implement model-driven recovery for higher simulation completion rates.
- Prioritize domain-specific harness design for MAPDL automation.
- Use a multi-axis scoring system for comprehensive evaluation.
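As a rough illustration of reporting several evaluation axes side by side, the sketch below aggregates per-run records into per-strategy metrics. The `CaseRun` fields, the `summarize` function, and the metric definitions are assumptions (in particular, zero-intervention rate is taken here to mean the share of case-runs that complete with no human intervention), and the records are toy values rather than results from the study.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class CaseRun:
    strategy: str             # e.g. "model_only" or "rule_only"
    completed: bool           # did the simulation finish successfully?
    human_interventions: int  # number of times a person had to step in
    task_score: float         # rater score on an assumed 0-4 scale

def summarize(runs):
    """Group case-runs by strategy and compute one value per axis."""
    by_strategy = defaultdict(list)
    for run in runs:
        by_strategy[run.strategy].append(run)
    report = {}
    for strategy, group in by_strategy.items():
        n = len(group)
        report[strategy] = {
            "completion_rate": sum(r.completed for r in group) / n,
            "zero_intervention_rate": sum(
                r.completed and r.human_interventions == 0 for r in group
            ) / n,
            "mean_task_score": sum(r.task_score for r in group) / n,
        }
    return report

# Toy records for illustration only.
runs = [
    CaseRun("model_only", True, 0, 4.0),
    CaseRun("model_only", True, 1, 3.0),
    CaseRun("model_only", False, 0, 1.0),
    CaseRun("model_only", True, 0, 4.0),
    CaseRun("rule_only", True, 0, 4.0),
    CaseRun("rule_only", False, 0, 0.0),
]
report = summarize(runs)
```

Keeping the axes separate, instead of collapsing them into one number, shows whether a strategy completes more runs, needs fewer interventions, or both.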
Topics
- CAX-Agent
- Agent Harness Paradigm
- MAPDL Automation
- LLM-driven Simulation
- Recovery Policies
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.