IBM Granite 3.0 models

2024-10-20 · Source: Ollama Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Intermediate, quick

Summary

IBM has released a selection of its Granite 3.0 models, now available for deployment via Ollama under an Apache 2.0 license as of October 21, 2024. The Granite 3.0 series includes both dense and Mixture of Expert (MoE) architectures. The text-only dense LLMs, Granite 2B and Granite 8B, were trained on over 12 trillion tokens and show performance comparable to Llama 3.1 8B Instruct on OpenLLM Leaderboard v1 and v2 benchmarks. These dense models are optimized for tool-based use cases, RAG, code generation, translation, and bug fixing. Additionally, IBM introduced Granite 1B MoE and Granite 3B MoE, trained on over 10 trillion tokens, specifically designed for low-latency, on-device, and instantaneous inference applications.

Key takeaway

For MLOps Engineers evaluating new open-source LLMs for deployment, consider the IBM Granite 3.0 models available through Ollama. The Granite 8B Instruct model offers performance on par with Llama 3.1 8B Instruct for general tasks, while the MoE variants (1B and 3B) are specifically engineered for low-latency, on-device inference, making them suitable for edge computing or real-time applications where speed is critical.

Key insights

IBM's Granite 3.0 models, including dense and MoE variants, are now available via Ollama under Apache 2.0.

Principles

MoE models excel in low-latency inference scenarios.
Dense LLMs support tool-based use cases and RAG.

In practice

Use `ollama run granite3-dense:8b` for dense 8B model.
Deploy Granite MoE for on-device applications.

Topics

IBM Granite 3.0
Large Language Models
Mixture-of-Experts
Retrieval-Augmented Generation
Ollama Integration

Best for: AI Architect, MLOps Engineer, NLP Engineer, AI Engineer, Machine Learning Engineer, Data Scientist

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Ollama Blog.