Running local models is good now

· Source: ✰Vicki Boykis✰ · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, medium

Summary

Local large language models (LLMs) have significantly improved, making them surprisingly effective for development tasks as of June 15, 2026. The author, using a 2022 M2 Mac with 64 GB RAM, details successful experiences with models like Mistral 7B, Gemma 3, OpenAI OSS-20B, and Qwen 3 MOE, leveraging tools such as llama.cpp, Ollama, and LM Studio. While early local models lagged, the release of GPT-OSS marked a turning point, with Google's Gemma 4 family, particularly "gemma-4-26b-a4b" and "gemma-4-12b-qat", now enabling agentic coding locally at approximately 75% the accuracy and speed of frontier models. Practical applications include refactoring Python scripts, linting, writing unit tests, and bootstrapping recommendation models. The article also outlines a secure Dockerized setup for running agentic workflows with Pi and LM Studio, highlighting the benefits of deep introspection into model performance.

Key takeaway

For AI Engineers evaluating LLM deployment strategies, local models like Gemma 4 now offer compelling capabilities for agentic coding and development tasks. You can achieve approximately 75% of frontier model performance for refactoring or testing, while gaining full introspection into token processing. Consider setting up a Dockerized environment with LM Studio and Pi to experiment with "gemma-4-12b-qat" for secure, personalized development workflows. This approach allows deep customization and performance tuning.

Key insights

Local LLMs have matured significantly, now enabling effective agentic coding and deep model introspection.

Principles

Method

Set up a local inference engine (e.g., LM Studio) with an agentic harness (e.g., Pi) and a downloaded model artifact. Configure the harness to point to the local endpoint.

In practice

Topics

Code references

Best for: Machine Learning Engineer, AI Engineer, Software Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by ✰Vicki Boykis✰.