How to Run DeepSeek Locally on Your Own Computer, and the Catch Most Guides Skip

2026-06-24 · Source: Towards AI - Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Novice, long

Summary

DeepSeek, an open model released in January 2025, gained prominence for matching advanced reasoning systems at a fraction of their cost. While the full, hundreds-of-billions-of-parameters Mixture-of-Experts model requires data-center hardware, users can run distilled versions locally on personal computers. These smaller models, ranging from 1.5 billion parameters (~2 GB RAM) to 70 billion parameters (~40 GB VRAM), retain significant reasoning ability. The process is streamlined using Ollama, involving installation and a single command like "ollama run deepseek-r1:8b". A key feature is DeepSeek's "visible thinking," where it displays its reasoning steps before providing an answer. Users should adjust context length to 16384, avoid instructing it to "think step by step," and set temperature to 0.6 for optimal performance. Local execution ensures complete data privacy, as nothing leaves the machine.

Key takeaway

For AI Engineers or ML Students seeking to run advanced reasoning models privately, you should prioritize DeepSeek's distilled versions over the full model. Select the largest "deepseek-r1" variant your hardware comfortably supports (e.g., 8b for laptops, 32b for RTX 3090/4090). Configure a context length of 16384 and a temperature of 0.6 via Ollama to maximize reasoning quality, ensuring you benefit from its visible thinking without data leaving your machine.

Key insights

DeepSeek's powerful reasoning is accessible locally via distilled models, not the enormous full version, using tools like Ollama.

Principles

Model distillation effectively transfers complex reasoning to smaller packages.
Hardware compatibility dictates optimal local LLM performance.
Transparent reasoning steps improve answer verification and debugging.

Method

Install Ollama, then use "ollama run [model_name]" in a terminal. Configure context length ("/set parameter num_ctx 16384") and temperature ("/set parameter temperature 0.6") within the chat session.

In practice

Deploy "deepseek-r1:8b" for general laptop use with 8GB RAM.
Increase context length to 16384 to prevent truncated reasoning.
Parse output to remove "think" tags for structured data processing.

Topics

DeepSeek
Local LLM Deployment
Ollama
Model Distillation
LLM Reasoning
Hardware Requirements

Best for: AI Engineer, Machine Learning Engineer, AI Student

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI - Medium.