OpenAI compatibility
Summary
Ollama, a platform for running large language models locally, announced built-in compatibility with the OpenAI Chat Completions API on February 8, 2024. This integration allows users to leverage existing OpenAI tooling and applications with local Ollama models like Llama 2 or Mistral. Setup involves downloading Ollama and pulling a model, then invoking the API endpoint by changing the hostname to `http://localhost:11434` in cURL requests. The compatibility extends to the OpenAI Python and JavaScript libraries, requiring only a `base_url` modification and a placeholder `api_key`. Practical examples demonstrate integration with the Vercel AI SDK for conversational streaming applications and Microsoft's Autogen framework for multi-agent systems, using models like Code Llama. Future enhancements may include Embeddings API, function calling, vision support, and logprobs.
Key takeaway
For AI Engineers building or testing applications that rely on the OpenAI Chat Completions API, you can now seamlessly switch to local Ollama models like Llama 2 or Mistral by simply reconfiguring the API base URL. This enables local development and testing without external API calls, reducing costs and improving privacy, especially when iterating on multi-agent systems or conversational UIs.
Key insights
Ollama now supports the OpenAI Chat Completions API, enabling local LLM use with existing OpenAI-compatible tools.
Principles
- Local LLM inference can integrate with standard API formats.
- API compatibility expands tooling ecosystems.
Method
Configure OpenAI client libraries or cURL requests to point to `http://localhost:11434/v1` and use 'ollama' as the API key to interact with local Ollama models.
In practice
- Use Ollama with Vercel AI SDK for local streaming apps.
- Integrate Ollama models into Autogen multi-agent systems.
Topics
- Ollama
- OpenAI API Compatibility
- Local LLM Deployment
- Chat Completions API
- AI Development Frameworks
Code references
Best for: AI Engineer, Machine Learning Engineer, Software Engineer
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Ollama Blog.