AI Observability Starter Kit for Microsoft Foundry agents

2026-05-29 · Source: Microsoft Foundry Blog articles · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Intermediate, extended

Summary

The AI Observability Starter Kit for Microsoft Foundry agents sets up monitoring for AI agent performance and safety in production. Provisioned with a single PowerShell command, the kit deploys a four-agent Microsoft Foundry environment with instrumented OpenTelemetry traces, eight built-in evaluators, a custom compliance evaluator, an automated red-team scan, and two scheduled-query alerts. Three agents target gpt-4o-mini, gpt-5-mini, and gpt-4.1-mini; a fourth targets a broken model to demonstrate error handling and comparative metrics. Telemetry flows into Azure Application Insights, powering custom Grafana dashboards for operational insights (e.g., token usage, latency, error rates) and enabling batch evaluation of agent quality and adversarial testing. The entire setup takes 35 to 50 minutes to deploy and costs approximately $0.03/day to run.

Key takeaway

If you are an ML engineer or SRE deploying AI agents on Azure and need production-grade observability without extensive manual setup, this starter kit offers a validated, automated baseline. A single command deploys a monitoring environment with instrumented traces, quality evaluators, red-team testing, and alerts. That lets you establish agent monitoring quickly, catch issues such as model errors or safety failures, and check compliance, with far less initial setup effort.

Key insights

Production-grade AI observability requires instrumented traces, automated quality and safety evaluations, and proactive alerting.

Principles

AI agent observability extends beyond basic HTTP status.
OpenTelemetry spans form the core of AI agent telemetry.
Combine built-in and custom evaluators for agent quality.
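
As an illustration of why span data matters more than an HTTP status code, the sketch below rolls a few span records up into the per-model token, latency, and error figures that a dashboard would chart. It is a minimal sketch under stated assumptions, not the kit's code: the span records are plain dictionaries invented for the example, the "broken-model" name is hypothetical, and the attribute keys are borrowed from the OpenTelemetry generative AI semantic conventions.

```python
from collections import defaultdict
from statistics import median

# Hypothetical span records. The attribute keys follow the OpenTelemetry
# generative AI semantic conventions; the values are made up for illustration.
SPANS = [
    {"duration_ms": 820, "error": False, "attributes": {
        "gen_ai.request.model": "gpt-4o-mini",
        "gen_ai.usage.input_tokens": 120,
        "gen_ai.usage.output_tokens": 45}},
    {"duration_ms": 640, "error": False, "attributes": {
        "gen_ai.request.model": "gpt-4o-mini",
        "gen_ai.usage.input_tokens": 200,
        "gen_ai.usage.output_tokens": 60}},
    # A failed call: no token usage was recorded before the error.
    {"duration_ms": 95, "error": True, "attributes": {
        "gen_ai.request.model": "broken-model"}},
]


def summarize(spans):
    """Group spans by model and compute calls, tokens, error rate, and median latency."""
    grouped = defaultdict(list)
    for span in spans:
        grouped[span["attributes"]["gen_ai.request.model"]].append(span)

    summary = {}
    for model, rows in grouped.items():
        total_tokens = sum(
            r["attributes"].get("gen_ai.usage.input_tokens", 0)
            + r["attributes"].get("gen_ai.usage.output_tokens", 0)
            for r in rows
        )
        error_count = sum(1 for r in rows if r["error"])
        summary[model] = {
            "calls": len(rows),
            "total_tokens": total_tokens,
            "error_rate": error_count / len(rows),
            "median_latency_ms": median(r["duration_ms"] for r in rows),
        }
    return summary


if __name__ == "__main__":
    for model, stats in summarize(SPANS).items():
        print(model, stats)
```

A failing agent shows up here as a model with an error rate of 1.0 and no token usage, which a bare status check would not distinguish from a healthy agent that simply received no traffic.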

Method

A single PowerShell script provisions Azure infrastructure, deploys Foundry agents, generates traffic, runs batch evaluations (built-in and custom), executes automated red-team scans, and configures Grafana dashboards and scheduled alerts.

In practice

Set "ENABLE_INSTRUMENTATION=true" to emit OTel child spans.
Set "ENABLE_SENSITIVE_DATA=true" to capture prompts for evaluators.
Implement custom code-based evaluators for domain-specific policies.
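
The three practices above might look roughly like the following in Python. This is a sketch under assumptions rather than the kit's implementation: the summary does not say how the kit parses the two flags, so the truthy-value rule in flag_enabled is a guess, and CompliancePolicyEvaluator, its banned patterns, and its output keys are hypothetical names chosen for illustration. The keyword-only call that returns a dictionary mirrors a common shape for code-based evaluators, but it is not confirmed by the source.

```python
import os
import re

# Assumption: these strings count as "enabled". The kit's own parsing may differ.
TRUTHY = {"1", "true", "yes", "on"}


def flag_enabled(name, env=None):
    """Return True when the named environment flag is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get(name, "").strip().lower() in TRUTHY


class CompliancePolicyEvaluator:
    """Hypothetical code-based evaluator that flags banned phrases in a response."""

    def __init__(self, banned_patterns):
        # Compile once so repeated evaluations stay cheap.
        self._patterns = [re.compile(p, re.IGNORECASE) for p in banned_patterns]

    def __call__(self, *, response, **kwargs):
        # Collect every banned pattern that appears in the response text.
        violations = [p.pattern for p in self._patterns if p.search(response)]
        return {
            "compliance_pass": not violations,
            "compliance_violations": violations,
        }


# Example settings, passed as a dict so the sketch does not depend on the real environment.
example_env = {"ENABLE_INSTRUMENTATION": "true", "ENABLE_SENSITIVE_DATA": "false"}

evaluator = CompliancePolicyEvaluator([r"guaranteed returns", r"\bSSN\b"])
failing = evaluator(response="We offer guaranteed returns on every plan.")
passing = evaluator(response="Past performance does not predict future results.")

if __name__ == "__main__":
    print("child spans:", flag_enabled("ENABLE_INSTRUMENTATION", example_env))
    print("prompt capture:", flag_enabled("ENABLE_SENSITIVE_DATA", example_env))
    print(failing)
    print(passing)
```

With prompt capture disabled, an evaluator like this one would have no response text to inspect, which is why the second flag matters for evaluation runs.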

Topics

AI Observability
Microsoft Foundry
OpenTelemetry
AI Agent Evaluation
Red Teaming
Azure Application Insights
Grafana Dashboards

Code references

jvargh/ai-observability-starter-kit

Best for: Software Engineer, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Microsoft Foundry Blog articles.