I Built an Azure AI Agent That Passed Every Test. Here’s Why I Still Added a Human Approval Step.
Summary
An Azure AI agent for internal access requests, built with Azure OpenAI, Azure AI Search, and Azure Functions, consistently passed all functional, retrieval, tool, safety, and regression tests. Despite this, the author added a human approval step for any action that modifies system state. In this architecture the agent only prepares proposed actions, which are then routed to a human reviewer via Azure Logic Apps and Microsoft Teams. The approach prioritizes accountability and transparency, using a risk-based model to set the level of automation for low-, medium-, and high-risk actions. The system logs all decisions and actions, and the author advocates increasing autonomy progressively, based on accumulated operational evidence rather than initial test success.
Key takeaway
If you are an AI engineer deploying agents that modify systems, recognize that passing tests does not equate to production trust. Design accountability up front by separating agent recommendations from execution. Require human approval for high-risk actions, and use a risk-based model so that the agent earns autonomy progressively. This keeps the system safe and auditable, letting you ship agents faster while scaling trust deliberately, based on operational evidence.
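As a rough illustration of the risk-based model, the sketch below decides whether a proposed action needs a human reviewer. The action names, their tier assignments, and the `POLICIES` table are hypothetical; the summary does not list the author's actual tiers. The one rule taken from the source is that anything that modifies system state goes to a human.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class ActionPolicy:
    risk: Risk
    modifies_state: bool


# Hypothetical policy table: these action names and tiers are assumptions
# made for illustration, not taken from the original article.
POLICIES = {
    "search_policy_docs": ActionPolicy(Risk.LOW, modifies_state=False),
    "create_ticket": ActionPolicy(Risk.MEDIUM, modifies_state=True),
    "grant_access": ActionPolicy(Risk.HIGH, modifies_state=True),
}


def requires_human_approval(action_type: str) -> bool:
    """Return True when the action must wait for a human reviewer."""
    policy = POLICIES.get(action_type)
    if policy is None:
        # Fail closed: an action nobody has classified is never automated.
        return True
    # State-changing actions always need sign-off; so does anything above LOW.
    return policy.modifies_state or policy.risk is not Risk.LOW
```

Keeping the table as data rather than branching logic means that granting more autonomy later is a reviewed change to one entry, which fits the idea of earning autonomy from operational evidence.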
Key insights
AI agent autonomy must be earned through operational evidence, not just test passes, with human approval as a critical control.
Principles
- Accountability is a design-time decision, not a post-deployment one.
- Production trust for AI agents is earned operationally.
- Separate AI agent recommendation from system execution.
Method
An orchestrator uses Azure OpenAI for reasoning and Azure AI Search for retrieval. It routes proposed actions to a human approval queue via Azure Logic Apps, executing only after explicit human sign-off, ensuring a logged audit trail.
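The propose, approve, execute flow described above can be sketched in a few lines. This is an in-memory stand-in under stated assumptions: `ApprovalGate`, `ProposedAction`, and the `executor` callback are hypothetical names, and the dictionary of pending proposals takes the place of the Azure Logic Apps and Microsoft Teams routing, which is not reproduced here.

```python
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ProposedAction:
    action_type: str
    target: str
    justification: str
    proposal_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    status: str = "pending"


class ApprovalGate:
    """Holds agent proposals until a human decides; logs every step."""

    def __init__(self, executor):
        self._executor = executor  # the only code path that changes state
        self._pending = {}
        self.audit_log = []

    def _log(self, event, proposal, actor):
        # asdict() snapshots the proposal so later status changes
        # do not rewrite earlier audit entries.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "actor": actor,
            "proposal": asdict(proposal),
        })

    def submit(self, proposal):
        """Called by the agent: record the proposal, execute nothing."""
        self._pending[proposal.proposal_id] = proposal
        self._log("proposed", proposal, actor="agent")
        return proposal.proposal_id

    def decide(self, proposal_id, reviewer, approved):
        """Called on behalf of a human reviewer; runs only if approved."""
        # pop() raises KeyError for unknown or already-decided proposals,
        # so a decision cannot be replayed.
        proposal = self._pending.pop(proposal_id)
        if not approved:
            proposal.status = "rejected"
            self._log("rejected", proposal, actor=reviewer)
            return None
        proposal.status = "approved"
        self._log("approved", proposal, actor=reviewer)
        result = self._executor(proposal)
        proposal.status = "executed"
        self._log("executed", proposal, actor="system")
        return result
```

The agent only ever holds a reference to `submit`, so it has no route to `_executor` that skips the reviewer. In a real deployment the pending store and the audit log would need to be durable, which this sketch does not attempt.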
In practice
- Implement permission-aware RAG for data security.
- Version prompts to track behavior changes.
- Validate agent output against a strict JSON contract.
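One way to enforce a strict output contract, sketched with the standard library only. The field names, the `ALLOWED_ACTIONS` set, and `ContractError` are assumptions made for illustration; the summary does not show the author's schema. Including a `prompt_version` field is one possible way to tie each proposal back to a versioned prompt.

```python
import json

# Hypothetical contract: field names and allowed actions are assumptions.
CONTRACT_FIELDS = {
    "action_type": str,
    "target": str,
    "justification": str,
    "prompt_version": str,
}
ALLOWED_ACTIONS = {"grant_access", "revoke_access"}


class ContractError(ValueError):
    """Raised when agent output does not match the expected contract."""


def parse_agent_output(raw: str) -> dict:
    """Parse the model's raw text and reject anything off-contract."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ContractError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ContractError("top-level value must be a JSON object")

    missing = set(CONTRACT_FIELDS) - set(data)
    unexpected = set(data) - set(CONTRACT_FIELDS)
    if missing or unexpected:
        # Strict: extra keys are rejected as well as missing ones.
        raise ContractError(
            f"missing={sorted(missing)} unexpected={sorted(unexpected)}"
        )

    for name, expected_type in CONTRACT_FIELDS.items():
        if not isinstance(data[name], expected_type):
            raise ContractError(f"{name} must be {expected_type.__name__}")

    if data["action_type"] not in ALLOWED_ACTIONS:
        raise ContractError(f"action_type {data['action_type']!r} not allowed")
    return data
```

Rejecting unexpected keys, rather than ignoring them, means a prompt change that alters the output shape fails loudly instead of reaching the approval queue half-parsed.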
Topics
- Azure AI Agents
- Human-in-the-Loop
- Responsible AI
- AI Testing
- Risk-Based Automation
- MLOps Architecture
- Progressive Autonomy
Best for: AI Engineer, MLOps Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI - Medium.