How Veris AI and Lume Security built a self-improving AI agent with Microsoft Foundry
Summary
Veris AI and Lume Security, in collaboration with Microsoft Foundry, have developed a self-improving AI agent system designed to move AI agents from demos into production. The system addresses previously unseen failure modes in production by using a high-fidelity simulation environment that Veris AI built on Microsoft Azure. The environment expands each production failure into a family of realistic scenarios and generates targeted data to optimize agent behavior through automated context engineering and reinforcement learning. The approach is designed to deliver improvements without regressing on previously fixed issues. The solution is demonstrated with a security agent from Lume Security, which uses an intelligence graph to power policy-aligned agents for security, compliance, and IT workflows, reducing time spent on routine requests by 35-55% and improving decision-making.
Key takeaway
AI engineers building production-grade agents should invest early in an orchestration and safety layer alongside an environment-driven evaluation system. Together these create a continuous improvement loop: you can ship fixes without regressions and treat production failures as the highest-signal input for hardening your AI systems.
Key insights
High-fidelity simulation and automated optimization enable AI agents to self-improve safely in production.
Principles
- Agent evaluation must grade outcomes, not just answers.
- Production failures are high-signal inputs for system hardening.
- Orchestration layers standardize model usage and safety.
Method
The system reconstructs production failures, expands them into scenario variants, stress-tests agents in simulation, and uses grader signals to refine prompts and apply reinforcement learning updates in a closed loop.
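The loop described above can be sketched in a few lines. This is a minimal illustration and not the Veris AI implementation: the `Scenario` type and the functions `expand_failure`, `run_agent`, `grade`, and `refine_prompt` are all hypothetical stand-ins, with a toy agent and a string-match grader in place of a real simulator, an LLM-based grader, and a reinforcement learning update.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    user_request: str
    required_action: str  # the outcome the grader expects from the agent


def expand_failure(failure: Scenario) -> list[Scenario]:
    """Expand one production failure into a family of variants (toy: rephrasings)."""
    templates = ["{}", "URGENT: {}", "{} (asked by a contractor)"]
    return [
        Scenario(t.format(failure.user_request), failure.required_action)
        for t in templates
    ]


def run_agent(prompt: str, scenario: Scenario) -> str:
    """Toy agent: takes the required action only if its prompt mentions it."""
    if scenario.required_action in prompt:
        return scenario.required_action
    return "approve"


def grade(scenario: Scenario, action: str) -> float:
    """Grade the outcome (the action taken), not the wording of a reply."""
    return 1.0 if action == scenario.required_action else 0.0


def refine_prompt(prompt: str, failed: list[Scenario]) -> str:
    """Toy stand-in for automated context engineering: add one rule per failed outcome."""
    rules = sorted({s.required_action for s in failed})
    return prompt + "".join(f"\nRule: when policy requires it, {r}." for r in rules)


def improvement_loop(prompt: str, failure: Scenario, max_rounds: int = 3):
    """Stress-test the agent on the scenario family and refine until it passes."""
    scenarios = expand_failure(failure)
    for _ in range(max_rounds):
        failed = [s for s in scenarios if grade(s, run_agent(prompt, s)) < 1.0]
        if not failed:
            break
        prompt = refine_prompt(prompt, failed)
    pass_rate = sum(grade(s, run_agent(prompt, s)) for s in scenarios) / len(scenarios)
    return prompt, pass_rate
```

Calling `improvement_loop` with a base prompt and a single reconstructed failure returns the refined prompt and the pass rate over the generated variants. In a real system each stand-in would be replaced by the corresponding component, and the grader signal could also feed a training update rather than only a prompt edit.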
In practice
- Use simulation to expand rare production failures.
- Implement LLM-based evaluators for targeted rubrics.
- Validate prompt updates against regression suites.
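One way to combine the last two points is a gate that accepts a prompt update only if it fixes the targeted failures without lowering any score on the existing regression suite. The sketch below uses hypothetical names (`Case`, `rubric_score`, `accept_update`) and a deterministic substring check where a real system would call an LLM-based evaluator with a targeted rubric.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Case:
    name: str
    request: str
    must_include: str  # toy rubric: phrase the response must contain


# An agent is modeled as a function from a request to a response (assumption).
Agent = Callable[[str], str]


def rubric_score(agent: Agent, case: Case) -> float:
    """Toy evaluator: 1.0 if the response satisfies the rubric, else 0.0."""
    return 1.0 if case.must_include in agent(case.request).lower() else 0.0


def accept_update(
    old: Agent, new: Agent, regression_suite: list[Case], target_cases: list[Case]
) -> bool:
    """Accept the update only if targets pass and no regression case scores lower."""
    no_regression = all(
        rubric_score(new, c) >= rubric_score(old, c) for c in regression_suite
    )
    fixes_targets = all(rubric_score(new, c) == 1.0 for c in target_cases)
    return no_regression and fixes_targets
```

Comparing per-case scores, rather than an average, keeps a gain on one case from masking a regression on another.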
Topics
- AI Agent Optimization
- High-Fidelity Simulation
- Security Intelligence Graph
- Microsoft Foundry
- Automated Evaluation
Best for: AI Engineer, MLOps Engineer, AI Security Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published on the Microsoft Foundry Blog.