Build Your Own Cursor This Weekend. Yes, the One SpaceX Just Paid $60 Billion For.
Summary
Cursor's in-house coding model, Composer 2, acquired by SpaceX for \$60 billion in June 2026, was built starting from Moonshot AI's open-weight Kimi K2.5 checkpoint. This demonstrates that a "frontier-level" coding assistant can be constructed primarily through integrating open-source components: a Visual Studio Code fork for the editor, a local inference server, and open-weight models. The core architecture involves a two-model setup: a small, fast Fill-in-the-Middle (FIM) capable model like Mistral's Codestral (22 billion parameters, ~95% FIM accuracy) for real-time autocomplete, and a larger model such as Qwen3-Coder-30B (30 billion parameters, fitting in ~19GB on a 24GB GPU) for chat and agentic tasks. This approach enables developers to build a functional, local AI coding assistant, processing tokens on their own hardware without external servers.
Key takeaway
For AI Engineers or ML Engineers building internal coding tools, you can now deploy a powerful, local AI assistant without proprietary models. Utilize open-source components like VS Code and Continue.dev, pairing a FIM-trained model (e.g., Codestral) for autocomplete with a larger model (e.g., Qwen3-Coder-30B) for chat. This approach keeps your code on-premises and eliminates per-token costs, offering a robust alternative to cloud-based solutions.
Key insights
A frontier coding assistant can be built by integrating open-source components and open-weight models, utilizing a two-model architecture.
Principles
- Start with open-weight model checkpoints.
- Separate models for autocomplete and chat.
- Aggressively gather relevant code context.
Method
Build a coding assistant by integrating VS Code with a Continue.dev extension, serving a FIM-trained autocomplete model (e.g., Codestral via Ollama) and a larger chat/agent model (e.g., Qwen3-Coder-30B via vLLM) locally.
In practice
- Deploy Codestral for FIM autocomplete.
- Use Qwen3-Coder-30B on 24GB GPUs.
- Serve models with Ollama or vLLM.
Topics
- AI Coding Assistants
- Open-weight LLMs
- Local LLM Deployment
- Fill-in-the-Middle
- VS Code Extensions
- Model Quantization
Code references
Best for: AI Engineer, Machine Learning Engineer, Software Engineer
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI - Medium.