Cloud models

· Source: Ollama Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Intermediate, quick

Summary

Ollama has launched a preview of its new cloud models, enabling users to run larger language models that typically exceed personal computer hardware capabilities. This service, released on September 19, 2025, integrates seamlessly with existing local Ollama tools and maintains user privacy by not retaining data. The cloud models are accessible via Ollama's OpenAI-compatible API and support standard Ollama commands like `run`, `pull`, `ls`, and `cp`. Initial available models include `qwen3-coder:480b-cloud`, `gpt-oss:120b-cloud`, `gpt-oss:20b-cloud`, and `deepseek-v3.1:671b-cloud`. Users need to download Ollama v0.12 and sign in to ollama.com to utilize these datacenter-grade hardware resources.

Key takeaway

For AI Engineers and Machine Learning Engineers needing to deploy or experiment with large language models, Ollama's new cloud models offer a practical solution. You can now access models up to 671B parameters without local hardware constraints, while preserving your existing Ollama workflows and ensuring data privacy. Consider integrating these cloud models to scale your LLM applications or research efforts efficiently.

Key insights

Ollama's cloud models allow running large language models with local tools and API compatibility, ensuring data privacy.

Principles

Method

Download Ollama v0.12, sign in via `ollama signin`, then use `ollama run` or `ollama pull` for cloud models, integrating with existing Ollama CLI and API workflows.

In practice

Topics

Best for: AI Engineer, Machine Learning Engineer, Software Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Ollama Blog.