Now in Foundry: Microsoft Harrier and NVIDIA EGM-8B
Summary
Microsoft Foundry now supports two new efficient AI models: Microsoft Research's harrier-oss-v1-0.6b and NVIDIA's EGM-8B.
harrier-oss-v1-0.6b is a 0.6B-parameter decoder-only text embedding model trained with contrastive learning and knowledge distillation. It scores 69.0 on the Multilingual MTEB v2 leaderboard, supports over 100 languages, and handles six embedding scenarios using task-instruction queries.
NVIDIA's EGM-8B is a vision-language model with roughly 8.8B parameters. It scores 91.4 average IoU on the RefCOCO visual grounding benchmark, a 3.6-point IoU gain over its base model, achieved through two-stage training: Supervised Fine-Tuning (SFT) followed by Group Relative Policy Optimization (GRPO).
Both models show that targeted training strategies can bring small models to performance comparable to much larger ones, reflecting an efficiency-first approach to development.
Key takeaway
For AI Architects and NLP Engineers evaluating model deployment: consider harrier-oss-v1-0.6b for multilingual embedding tasks and EGM-8B for visual grounding. Both deliver strong benchmark results at small parameter counts, which generally means lower memory use, faster inference, and lower operating costs in production. Both are available with one-click deployment in Microsoft Foundry.
Key insights
Targeted training strategies enable smaller models to achieve performance comparable to larger, more resource-intensive counterparts.
Principles
- Efficiency-first model development narrows the small-large model gap.
- A decoder-only architecture can excel at embedding tasks.
- Reinforcement learning applied after supervised fine-tuning improves visual grounding accuracy.
Method
harrier-oss-v1-0.6b is trained with contrastive learning and knowledge distillation. EGM-8B is trained in two stages: Supervised Fine-Tuning (SFT) on chain-of-thought traces, followed by Group Relative Policy Optimization (GRPO) with an IoU/task success reward.
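To make the reward side of the second stage concrete, here is a minimal sketch of an IoU reward and the group-relative normalization that GRPO is named for. It is an illustration under stated assumptions, not NVIDIA's training code: the `(x1, y1, x2, y2)` box format, the function names, the `eps` value, and the toy boxes are all hypothetical, and the actual EGM-8B reward also includes a task success term that is not modeled here.

```python
from statistics import mean, pstdev


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; clamp to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples for the same prompt.

    Each sample is scored relative to the group mean, scaled by the group
    standard deviation. eps (an assumed value) avoids division by zero
    when every sample receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy example: one ground-truth box and three sampled predictions.
target = (10, 10, 50, 50)
sampled_boxes = [(10, 10, 50, 50), (20, 20, 60, 60), (100, 100, 120, 120)]

rewards = [iou(box, target) for box in sampled_boxes]
advantages = group_relative_advantages(rewards)

for box, r, a in zip(sampled_boxes, rewards, advantages):
    print(f"box={box} reward={r:.3f} advantage={a:+.3f}")
```

Samples whose boxes overlap the target better than the group average get a positive advantage and are reinforced; the others are pushed down. No separate value model is needed for this step, since the baseline comes from the group itself.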
In practice
- Use harrier-oss-v1-0.6b for multilingual semantic search.
- Apply EGM-8B for object localization in images.
- Deploy models via Microsoft Foundry for scalable inference.
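As a rough illustration of the semantic search use case above, the sketch below ranks documents by cosine similarity against a query embedding. Everything here is an assumption for illustration: the `format_query` instruction template is hypothetical (the model card defines the real one), the three-dimensional vectors are hand-written stand-ins for embeddings that would normally come from a deployed harrier-oss-v1-0.6b endpoint, and no Foundry API is called.

```python
import math


def format_query(task, query):
    """Prefix a query with a task instruction.

    Hypothetical template: consult the model card for the exact format
    the embedding model expects. Documents are typically embedded as-is.
    """
    return f"Instruct: {task}\nQuery: {query}"


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query_vec, doc_vecs):
    """Return (score, doc_id) pairs sorted from most to least similar."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, reverse=True)


# Text that would be sent to the embedding endpoint for the query side.
query_text = format_query(
    "Given a web search query, retrieve relevant passages",
    "how do I reset my password",
)

# Toy stand-ins for embeddings returned by the model (hypothetical values).
doc_vecs = {
    "doc_fr": [0.9, 0.1, 0.0],
    "doc_de": [0.1, 0.9, 0.0],
    "doc_ja": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]

results = rank(query_vec, doc_vecs)
for score, doc_id in results:
    print(f"{doc_id}: {score:.3f}")
```

Because queries and documents in different languages share one embedding space, the same ranking step works across languages. At production scale, a vector index would usually replace the brute-force loop.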
Topics
- Microsoft Harrier
- NVIDIA EGM-8B
- Multilingual MTEB v2
- Visual Grounding
- Knowledge Distillation
Best for: AI Architect, NLP Engineer, Computer Vision Engineer, AI Engineer, Machine Learning Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published on the Microsoft Foundry Blog.