Ollama is now available as an official Docker image

Source: Ollama Blog · Field: Technology & Digital: Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Intermediate, quick

Summary

Ollama, a tool for running large language models locally, has been available as an official Docker-sponsored open-source image since October 5, 2023. The image makes it simpler to deploy and interact with LLMs in Docker containers, and all model interactions stay local, so no private data is sent to third-party services. On a Mac, Ollama should be run as a standalone application outside Docker, because Docker Desktop does not support GPUs. On Linux, Ollama supports GPU acceleration inside Docker containers for Nvidia GPUs, which requires installing the NVIDIA Container Toolkit first. Once the container is running, a single command starts a model such as Llama 2 inside it.

Key takeaway

For DevOps engineers and ML practitioners deploying large language models, the official Ollama Docker image simplifies local, private LLM inference. On Linux with Nvidia GPUs, you can run GPU-accelerated LLMs inside Docker without sending data to third parties. Mac users should keep using the standalone Ollama application to get local GPU support.

Key insights

Ollama's official Docker image simplifies local LLM deployment with GPU acceleration on Linux.

Method

Install the NVIDIA Container Toolkit, then run the `ollama/ollama` Docker image with `--gpus=all` to enable GPU acceleration for local LLM inference on Linux.
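The method above can be sketched as a small shell helper. This is a minimal sketch, not a definitive setup: it assumes Docker and the NVIDIA Container Toolkit are already installed on a Linux host. The volume name, port, and container name follow the example in the original Ollama post. The function names and the `DRY_RUN` switch are hypothetical conveniences added here for illustration.

```shell
#!/bin/sh
# Sketch: run Ollama in Docker with Nvidia GPU acceleration on Linux.
# Assumption: Docker and the NVIDIA Container Toolkit are already installed.

# Hypothetical helper: with DRY_RUN=1, print the command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Start the container in the background with all GPUs exposed.
# The named volume "ollama" keeps downloaded models across restarts;
# port 11434 is published on the host.
start_ollama_gpu() {
  run docker run -d --gpus=all \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama ollama/ollama
}

# Run a model inside the running container (defaults to llama2).
run_model() {
  run docker exec -it ollama ollama run "${1:-llama2}"
}
```

After sourcing the file, calling `start_ollama_gpu` and then `run_model llama2` starts the container and opens an interactive session with the model. Setting `DRY_RUN=1` first prints the two `docker` commands so they can be reviewed before anything is executed. On a machine without an Nvidia GPU, dropping `--gpus=all` should give a CPU-only container.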

Best for: Machine Learning Engineer, DevOps Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Ollama Blog.