Mind the Gap (In your Agent Observability) — Amy Boyd & Nitya Narasimhan, Microsoft

2026-05-14 · Source: AI Engineer · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Cloud Computing & IT Infrastructure · Depth: Intermediate, extended

Summary

Microsoft Foundry's developer relations team, led by Amy Boyd and Nitia, presented a session on "Mind the Gap in Agent Observability," emphasizing the critical need for robust evaluation, monitoring, and optimization of AI agents. The session introduced the Microsoft Foundry platform, a cloud agent platform for end-to-end agent development, hosting, observation, and management. It highlighted the non-deterministic nature of agents in production and proposed a three-pronged approach: evaluating performance, quality, and safety; continuous monitoring over time; and optimizing agent performance using collected data. The presentation detailed how to build agents using the Foundry portal and SDK, covering tracing with OpenTelemetry (OTEL) for debugging, utilizing built-in and custom evaluators for quality, safety, and agentic metrics, and implementing red teaming for proactive vulnerability assessment against adversarial prompts. A key focus was the "Observe skill," an early-preview coding agent that automates the observability loop, including generating evaluation datasets, running batch evaluations, optimizing prompts, and providing version control for agent improvements.

Key takeaway

For AI Engineers building and deploying agents, understanding and implementing comprehensive observability is crucial. You should integrate tracing and evaluation from the earliest development stages, leveraging platforms like Microsoft Foundry to manage agent non-determinism and ensure reliability. Proactively use red teaming to identify vulnerabilities and consider automating the observability loop with coding agents to accelerate the detect-diagnose-fix cycle, ensuring your agents meet evolving requirements and maintain quality in production.

Key insights

Effective AI agent observability requires continuous evaluation, monitoring, and automated optimization to manage non-determinism and ensure reliability.

Principles

Evaluate agents early and throughout their lifecycle.
Trace-linked evaluations shorten detection-to-diagnosis time.
Automate observability loops with coding agents.

Method

Build agents in the Foundry portal or SDK, instrument with OTEL tracing, use built-in/custom evaluators for quality/safety/agentic metrics, and apply red teaming for adversarial testing. Automate with the "Observe skill" for data generation and prompt optimization.

In practice

Fork the provided GitHub repo for hands-on agent development.
Utilize Microsoft Foundry's built-in evaluators for quick insights.
Employ the "Observe skill" to automate evaluation and prompt optimization.

Topics

Agent Observability
Microsoft Foundry Platform
AI Agent Evaluation
OpenTelemetry Tracing
Workflow Agents

Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Engineer.