Kimi-K2.5 Now in Microsoft Foundry

2026-02-06 · Source: Microsoft Foundry Blog articles · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, quick

Summary

Moonshot AI's Kimi K2.5, a next-generation multimodal and agentic model, is now available through Microsoft Foundry. The release adds native multimodality, achieved by pre-training on 15 trillion additional vision-text tokens, which improves image and video understanding, OCR, and multimodal QA. The model also features Agent Swarm execution, which can orchestrate up to 100 parallel agents and 1,500 tool calls and reportedly cuts execution time by a factor of 4.5 compared with sequential K2 workflows. Kimi K2.5 also offers stronger image/video-to-code capabilities, including visual debugging and UI reconstruction from visual inputs. Moonshot AI reports state-of-the-art benchmark results, including 96.1% on AIME 2025, 87.1% on MMLU-Pro, and 78.5% on MMMU-Pro (Vision). Input tokens are priced at $0.60 per 1M and output tokens at $3 per 1M.
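To make the quoted pricing concrete, here is a minimal cost-estimation sketch in Python. It uses only the two per-token prices stated in the summary; the function name `estimate_cost_usd` and the example token counts are illustrative assumptions, and any other billing details (caching, regional pricing, minimums) are not covered by the source and are ignored here.

```python
from decimal import Decimal

# Prices quoted in the summary (USD per 1M tokens).
INPUT_USD_PER_MILLION = Decimal("0.60")
OUTPUT_USD_PER_MILLION = Decimal("3.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of a workload from its token counts.

    Hypothetical helper: it applies the two list prices linearly
    and nothing else.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example workload (made-up numbers): 2M input tokens, 500k output tokens.
    print(estimate_cost_usd(2_000_000, 500_000))
```

For the example workload, the input side costs 2 x $0.60 = $1.20 and the output side costs 0.5 x $3 = $1.50, for $2.70 in total. Because output tokens cost five times as much as input tokens, output-heavy agentic runs will dominate the bill.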

Key takeaway

For CTOs evaluating advanced AI models for integration, Kimi K2.5's availability in Microsoft Foundry offers a compelling option because of its multimodal capabilities and Agent Swarm execution. Teams can apply its improved vision-language understanding and faster task completion to complex coding and QA projects, which may shorten development cycles and improve output quality. Consider piloting Kimi K2.5 for applications that require visual debugging or UI reconstruction from visual inputs.

Key insights

Kimi K2.5 combines native multimodality with parallel agentic execution (Agent Swarm) to improve vision-language performance and shorten task execution time.

Principles

Multimodal pre-training improves vision-language integration.
Parallel agent orchestration shortens execution time (a reported 4.5x compared with sequential K2 workflows).

Method

Kimi K2.5 is pre-trained on 15 trillion additional vision-text tokens for native multimodality and uses Agent Swarm to run up to 100 agents and 1,500 tool calls in parallel.
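The general idea behind running agents in parallel rather than in sequence can be sketched with Python's `asyncio`. This is not Moonshot AI's Agent Swarm implementation, which the source does not describe; it is a generic bounded fan-out pattern. The function names are hypothetical, each "agent" is simulated with a sleep, and only the cap of 100 concurrent agents is taken from the summary.

```python
import asyncio
import time

MAX_PARALLEL_AGENTS = 100  # upper bound quoted in the summary


async def run_subtask(task_id: int, seconds: float) -> str:
    """Hypothetical stand-in for one agent handling one subtask."""
    await asyncio.sleep(seconds)  # simulates model and tool-call latency
    return f"subtask-{task_id}: done"


async def run_sequential(durations: list[float]) -> list[str]:
    """Run subtasks one after another; total time is the sum of durations."""
    return [await run_subtask(i, d) for i, d in enumerate(durations)]


async def run_parallel(
    durations: list[float], limit: int = MAX_PARALLEL_AGENTS
) -> list[str]:
    """Run subtasks concurrently, with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(task_id: int, seconds: float) -> str:
        async with semaphore:
            return await run_subtask(task_id, seconds)

    # gather preserves input order in its result list.
    return list(
        await asyncio.gather(*(bounded(i, d) for i, d in enumerate(durations)))
    )


def timed(coro) -> tuple[list[str], float]:
    """Run a coroutine to completion and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    durations = [0.05] * 10  # ten simulated subtasks of equal length
    _, sequential_time = timed(run_sequential(durations))
    _, parallel_time = timed(run_parallel(durations))
    print(f"sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

In this toy setup the parallel run finishes in roughly the time of the slowest subtask, while the sequential run takes the sum of all of them. Real speedups depend on how independent the subtasks are and on orchestration overhead, so the sketch illustrates the mechanism and does not reproduce the 4.5x figure reported in the source.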

In practice

Use Kimi K2.5 for multimodal coding workflows.
Apply Agent Swarm for faster complex task execution.

Topics

Kimi K2.5
Multimodal AI
Agentic Models
Microsoft Foundry
Vision-Language Models

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Machine Learning Engineer, AI Product Manager

Editorial summary, takeaway, and curation by AIssential. Original article published by Microsoft Foundry Blog articles.