Generally Available: Evaluations, Monitoring, and Tracing in Microsoft Foundry

2026-03-16 · Source: Microsoft Foundry Blog articles · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Advanced, medium

Summary

Microsoft Foundry has made its Evaluations, Monitoring, and Tracing capabilities generally available through the Foundry Control Plane, deeply integrating AI agent observability with Azure Monitor. This release addresses the challenge of maintaining AI agent quality in production, where foundation model updates, prompt changes, retrieval pipeline drift, and real-world traffic can all degrade performance. Foundry's approach emphasizes continuous evaluation across the entire AI lifecycle, from local development to CI/CD and live production monitoring. It offers built-in evaluators for dimensions such as Coherence, Relevance, Groundedness, Retrieval Quality, and Safety, alongside support for custom LLM-as-a-Judge and code-based evaluators. All observability data is published to Azure Monitor, enabling cross-stack correlation, unified alerting, and enterprise governance, and evaluation results are linked directly to OpenTelemetry-based traces for efficient root cause analysis. Additionally, a Prompt Optimizer (public preview) helps teams improve prompts systematically.

Key takeaway

For CTOs and VPs of Engineering deploying AI agents, relying solely on pre-deployment evaluations is insufficient for sustained production quality. You should adopt continuous evaluation and integrated observability solutions like Microsoft Foundry's to monitor agent performance against live traffic, correlate AI-specific metrics with broader infrastructure telemetry, and rapidly diagnose issues. This approach helps keep your AI investments robust and aligned with operational standards, and it reduces the risk from model drift and unexpected edge cases.

Key insights

Continuous evaluation and integrated observability are crucial for maintaining AI agent quality in dynamic production environments.

Principles

Evaluation must be continuous, not episodic.
AI observability should integrate with existing infrastructure monitoring.
Link evaluation results directly to traces for root cause analysis.
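
To make the third principle concrete, here is a minimal sketch of what linking evaluation results to traces can look like when both records carry the same trace ID. It is not Foundry's implementation: the Span and EvalResult types, the 1-to-5 score scale, and the threshold are all assumptions made for illustration. The only borrowed convention is the OpenTelemetry trace ID format (32 lowercase hex characters).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    # Hypothetical, simplified span record (not an OpenTelemetry SDK type).
    trace_id: str       # 32 lowercase hex chars, as in OpenTelemetry
    name: str
    duration_ms: float


@dataclass(frozen=True)
class EvalResult:
    # Hypothetical evaluation record keyed by the same trace ID.
    trace_id: str
    evaluator: str
    score: float        # assumed scale: 1 (worst) to 5 (best)


def spans_for_failures(results, spans, threshold=3.0):
    """Map each failing evaluation to the spans recorded under its trace."""
    by_trace = {}
    for span in spans:
        by_trace.setdefault(span.trace_id, []).append(span)
    return {
        (r.trace_id, r.evaluator): by_trace.get(r.trace_id, [])
        for r in results
        if r.score < threshold
    }


TRACE_A = "0af7651916cd43dd8448eb211c80319c"
TRACE_B = "4bf92f3577b34da6a3ce929d0e0e4736"

spans = [
    Span(TRACE_A, "retrieve_documents", 840.0),
    Span(TRACE_A, "generate_answer", 1210.0),
    Span(TRACE_B, "generate_answer", 950.0),
]
results = [
    EvalResult(TRACE_A, "groundedness", 2.0),   # below threshold
    EvalResult(TRACE_B, "groundedness", 4.5),   # passes
]

failures = spans_for_failures(results, spans)
for (trace_id, evaluator), linked in failures.items():
    print(evaluator, trace_id, [s.name for s in linked])
```

With the join in place, a low groundedness score points straight at the retrieval and generation spans of the same request, which is where root cause analysis would start.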

Method

Foundry's method involves continuous evaluation using built-in and custom evaluators, publishing all observability data to Azure Monitor for unified alerting and correlation, and linking evaluation results to OpenTelemetry-based traces for detailed diagnostics.
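
The continuous part of this method can be pictured as a rolling check over scores from live traffic instead of a one-off test run. The sketch below is an assumption-laden illustration in plain Python, not a Foundry or Azure Monitor API: the RollingQualityMonitor class, its window size, and its threshold are hypothetical, and in a real deployment the alerting itself would be configured in Azure Monitor.

```python
from collections import deque


class RollingQualityMonitor:
    """Hypothetical helper: flags degradation when the rolling mean score drops."""

    def __init__(self, window=50, min_samples=10, threshold=3.5):
        self.scores = deque(maxlen=window)  # keeps only the most recent scores
        self.min_samples = min_samples      # avoid alerting on too little data
        self.threshold = threshold          # assumed 1-to-5 score scale

    def record(self, score):
        """Add one evaluation score; return True if quality looks degraded."""
        self.scores.append(score)
        if len(self.scores) < self.min_samples:
            return False
        mean = sum(self.scores) / len(self.scores)
        return mean < self.threshold


monitor = RollingQualityMonitor(window=4, min_samples=2, threshold=3.0)
alerts = [monitor.record(s) for s in [5, 4, 1, 1, 5, 5]]
print(alerts)
```

A rolling window is a deliberate simplification: it reacts to sustained drops rather than single bad responses, and it recovers on its own once recent scores improve.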

In practice

Use built-in evaluators for core quality and safety checks.
Implement custom evaluators for domain-specific criteria.
Configure Azure Monitor alerts for AI quality degradation.
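
As an illustration of a domain-specific, code-based evaluator, the sketch below checks that a response cites only documents that were actually retrieved. Everything in it is hypothetical: the CitationEvaluator class, the [doc:ID] citation convention, and the shape of the returned dictionary are assumptions, and the source does not describe the interface Foundry expects custom evaluators to implement.

```python
import re
from typing import Sequence


class CitationEvaluator:
    """Hypothetical code-based evaluator for citation hygiene in RAG answers."""

    # Assumed citation convention: [doc:some-id]
    CITATION = re.compile(r"\[doc:([A-Za-z0-9_-]+)\]")

    def __call__(self, *, response: str, retrieved_ids: Sequence[str]) -> dict:
        cited = set(self.CITATION.findall(response))
        known = set(retrieved_ids)
        valid = cited & known      # citations backed by retrieved documents
        unknown = cited - known    # citations to documents never retrieved
        return {
            "passed": bool(valid) and not unknown,
            "valid_citations": sorted(valid),
            "unknown_citations": sorted(unknown),
        }


evaluate = CitationEvaluator()
good = evaluate(
    response="Refunds take five days [doc:kb-12].",
    retrieved_ids=["kb-12", "kb-40"],
)
bad = evaluate(
    response="Refunds take five days [doc:kb-99].",
    retrieved_ids=["kb-12", "kb-40"],
)
print(good["passed"], bad["passed"], bad["unknown_citations"])
```

A deterministic check like this complements LLM-as-a-Judge evaluators: it is cheap to run on every request and its failures are easy to explain.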

Topics

AI Agent Observability
Continuous Evaluation
Retrieval-Augmented Generation
Prompt Engineering
Azure Monitor

Best for: CTO, VP of Engineering/Data, Director of AI/ML, MLOps Engineer, AI Operations Specialist, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Microsoft Foundry Blog articles.