Run Code Llama locally
Summary
On August 24, 2023, Meta Platforms, Inc. released Code Llama, an open-source code generation model built on Llama 2. The model performs strongly on programming tasks and supports infilling, large input contexts, and zero-shot instruction following. Code Llama is available through Ollama, a platform for running models locally, in three parameter sizes: 7 billion, 13 billion (requiring 16GB+ memory), and 34 billion (requiring 32GB+ memory). Specialized versions, including foundation models for general code tasks and Python-specific models, are available through the same `ollama run` and `ollama pull` commands.
Key takeaway
For AI engineers and developers who want local code generation, Code Llama is a capable open-source option. You can run its 7 billion, 13 billion, or 34 billion parameter versions with Ollama, including the specialized Python models. Choose a size that fits your machine's memory (16GB+ for 13B, 32GB+ for 34B); a model that does not fit in memory will run poorly or fail to load.
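The memory guidance above can be sketched as a small helper. This is a minimal illustration, not part of Ollama: the function name `pick_codellama_size` is hypothetical, the thresholds are the 16GB and 32GB figures quoted in this summary, and falling back to 7B below 16GB is an assumption, since no 7B requirement is stated here.

```python
def pick_codellama_size(memory_gb: float) -> str:
    """Return the largest Code Llama size whose stated memory floor is met.

    Thresholds follow the summary: 34B needs 32GB+, 13B needs 16GB+.
    Assumption: anything below 16GB falls back to 7B, whose requirement
    is not stated in the summary.
    """
    if memory_gb >= 32:
        return "34b"
    if memory_gb >= 16:
        return "13b"
    return "7b"


if __name__ == "__main__":
    for gb in (8, 16, 64):
        print(f"{gb}GB -> codellama:{pick_codellama_size(gb)}")
```

Treat the result as a starting point; other processes on the machine also need memory, so a smaller size may be the more comfortable choice.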
Key insights
Code Llama, based on Llama 2, provides open-source code generation with infilling and large context support.
Principles
- Open models can achieve strong performance.
- Specialized models enhance task-specific capabilities.
Method
After installing Ollama, start a model with `ollama run codellama:[size]`, or download a specialized variant with `ollama pull codellama:[size]-[specialization]`.
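The tag pattern above can be assembled programmatically, which may help when scripting downloads for several sizes. The following is a sketch under stated assumptions: `model_tag` and `ollama_command` are hypothetical helper names, the size list contains only the three sizes mentioned in this summary, and the specialization string is passed through without validation.

```python
from typing import List, Optional

# Sizes named in the summary; other tags may exist but are not assumed here.
SIZES = ("7b", "13b", "34b")


def model_tag(size: str, specialization: Optional[str] = None) -> str:
    """Build a tag of the form codellama:[size] or codellama:[size]-[specialization]."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    tag = f"codellama:{size}"
    if specialization:
        tag += f"-{specialization}"
    return tag


def ollama_command(action: str, size: str, specialization: Optional[str] = None) -> List[str]:
    """Return the argument list for `ollama run ...` or `ollama pull ...`."""
    if action not in ("run", "pull"):
        raise ValueError("action must be 'run' or 'pull'")
    return ["ollama", action, model_tag(size, specialization)]


if __name__ == "__main__":
    print(" ".join(ollama_command("run", "7b")))
    print(" ".join(ollama_command("pull", "7b", "python")))
```

The returned list can be handed to `subprocess.run` on a machine where Ollama is installed; the sketch only builds the command and does not execute it.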
In practice
- Run 7B Code Llama with `ollama run codellama:7b`.
- Access Python-specific models with `ollama pull codellama:7b-python`.
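The commands above can also be driven from a script. The sketch below assumes Ollama is installed and on the `PATH`, and that the prompt is passed as a positional argument to `ollama run` for a one-shot, non-interactive answer. The names `build_run_args` and `ask_codellama`, the default timeout, and the sample prompt are all illustrative.

```python
import shutil
import subprocess
from typing import List


def build_run_args(prompt: str, model: str = "codellama:7b") -> List[str]:
    """Argument list for a one-shot `ollama run MODEL PROMPT` call."""
    return ["ollama", "run", model, prompt]


def ask_codellama(prompt: str, model: str = "codellama:7b", timeout: int = 300) -> str:
    """Run the model once and return its text output.

    Assumes the `ollama` executable is installed; the model is downloaded
    on first use, so the first call can take much longer than later ones.
    """
    if shutil.which("ollama") is None:
        raise RuntimeError("ollama executable not found on PATH")
    result = subprocess.run(
        build_run_args(prompt, model),
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
        timeout=timeout,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only attempt the call when Ollama is actually available.
    if shutil.which("ollama"):
        print(ask_codellama("Write a Python function that reverses a string."))
    else:
        print("Ollama is not installed; would run:", build_run_args("..."))
```

Keeping argument construction separate from execution makes the command easy to inspect or log before anything is run.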
Topics
- Code Llama
- Code Generation
- Large Language Models
- Ollama
- Python Programming
Best for: AI Engineer, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Ollama Blog.