How to Measure Token Impact of MCP Tool Invocation in Microsoft Foundry

2026-06-12 · Source: Microsoft Foundry Blog articles · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Intermediate, short

Summary

Microsoft Foundry users often see different token counts for the same Model Context Protocol (MCP) tool invocation depending on where they look: the API response, the portal trace, or the Trajectory view. This article details a reproducible method for enterprise token accounting that addresses these inconsistencies. It validates the approach using a Microsoft Foundry prompt agent with an inline MCP tool connected to a remote weather MCP server via devtunnel. Evidence comes from three sources: API invocation usage objects, the Microsoft Foundry Traces table, and the Trajectory view. The observed behavior confirms that MCP invocation, visible through the "mcp_list_tools" and "execute_tool" spans, increases turn-level token usage, and that the accounting appears in the model's response metadata. The proposed solution is an A/B comparison that uses API usage as primary proof and portal traces as operational evidence; in the validation scenario, it showed a +659 total-token increase.

Key takeaway

For MLOps engineers or FinOps owners managing AI model costs in Microsoft Foundry, accurately attributing token usage from MCP tool invocations is critical. Standardize an evidence pattern that separates API usage, used for precise per-response accounting, from portal trace evidence, used for operational transparency. Run the A/B comparison against baseline runs with identical prompts to establish defensible token deltas, then integrate those findings into your Azure cost analysis workflows to reduce review cycles.

Key insights

Token accounting for MCP tool invocation in Microsoft Foundry requires disciplined evidence handling across disparate telemetry sources.

Principles

Separate API usage from portal trace evidence for accurate accounting.
Compare token rows only when they share the same response ID.
Baseline runs with the same prompts are essential for defensible deltas.
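
The response-ID principle above can be enforced mechanically. The following is a minimal sketch, assuming you have transcribed one API usage object and one Traces table row into plain dictionaries; the key names "response_id" and "total_tokens" and the function name reconcile are hypothetical choices for illustration, not fields or calls confirmed by the source.

```python
def reconcile(api_record: dict, trace_row: dict) -> dict:
    """Compare API and trace totals for a single response.

    The keys "response_id" and "total_tokens" are hypothetical; adapt them
    to however you transcribe the usage object and the trace row.
    """
    # Refuse to compare rows that belong to different responses.
    if api_record["response_id"] != trace_row["response_id"]:
        raise ValueError(
            "response IDs differ: "
            f"{api_record['response_id']!r} vs {trace_row['response_id']!r}"
        )
    return {
        "response_id": api_record["response_id"],
        "api_total": api_record["total_tokens"],
        "trace_total": trace_row["total_tokens"],
        "match": api_record["total_tokens"] == trace_row["total_tokens"],
    }
```

Raising an error on mismatched IDs, rather than returning a non-match, keeps an accidental cross-response comparison from being recorded as a genuine discrepancy.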

Method

Establish two comparison paths: an API A/B comparison and a portal trace comparison. Run the MCP-enabled and baseline agents with identical prompts, capturing API usage and portal trace screenshots for each run. Then reconcile the evidence sources in a clear written statement.
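
The A/B step reduces to subtracting the baseline usage from the MCP-enabled usage for the same prompt. Here is a rough sketch of that arithmetic, assuming the usage object exposes input, output, and total token counts; the Usage class, its field names, and the sample numbers are illustrative assumptions, not values taken from the article's validation run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Token counts copied from one API response's usage object.

    Field names are assumptions for this sketch; map them to whatever
    your API response actually returns.
    """
    response_id: str
    input_tokens: int
    output_tokens: int
    total_tokens: int


def token_delta(baseline: Usage, mcp_enabled: Usage) -> dict[str, int]:
    """Return MCP-enabled minus baseline for each token field."""
    return {
        "input_tokens": mcp_enabled.input_tokens - baseline.input_tokens,
        "output_tokens": mcp_enabled.output_tokens - baseline.output_tokens,
        "total_tokens": mcp_enabled.total_tokens - baseline.total_tokens,
    }


# Placeholder numbers for illustration only, not measurements from the article.
baseline = Usage("resp_baseline", input_tokens=100, output_tokens=50, total_tokens=150)
mcp_enabled = Usage("resp_mcp", input_tokens=400, output_tokens=90, total_tokens=490)
delta = token_delta(baseline, mcp_enabled)
print(delta)
```

Keeping the response ID on each record makes it easier to tie a reported delta back to the exact pair of runs it came from.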

In practice

Use API usage for strict per-response accounting.
Employ portal traces for run observability.
Capture baseline and MCP-enabled runs with the same prompt.

Topics

Microsoft Foundry
Token Accounting
Model Context Protocol
AI Agents
FinOps
Azure Cost Analysis

Best for: AI Engineer, MLOps Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Microsoft Foundry Blog articles.