Agentic Engineering: WTF Happened in December 2025?
Summary
The AI landscape shifted quickly across coding agents, open models, and infrastructure. Perplexity launched "Computer," an orchestration-first agent product designed to manage projects end-to-end using parallel sub-agents and usage-based pricing. Andrej Karpathy noted a "phase change" in coding agents since December, with agents now able to sustain long-horizon tasks through to completion. OpenAI released GPT-5.3-Codex, offering ~25% faster performance, while Claude Code matured with Slack integration and LangSmith tracing. Alibaba's Qwen3.5 Medium series saw widespread distribution, with near-lossless 4-bit quantization and long-context support of 1M+ tokens on consumer GPUs. Despite capability gains, agent reliability remains a concern: failures often stem from compounded small errors rather than core model limitations. Memory orchestration for LLM workflows, especially decoding under long context, is a critical challenge, and diffusion LLMs are emerging as a potentially faster alternative. Major announcements include Anthropic's acquisition of Vercept and a shift in its Responsible Scaling Policy, alongside growing concerns about AI's energy consumption and surveillance implications.
Key takeaway
NLP engineers and CTOs evaluating AI integration should recognize that the shift toward sophisticated multi-agent systems like Perplexity's "Computer" demands a focus on distributed workflow management and robust reliability testing. Prioritize models and frameworks that offer transparent cost controls and strong tool-use optimization, since these factors are becoming more important than raw model capability for successful long-horizon task completion. Your strategy should also account for the increasing viability of local inference with models like Qwen3.5, balancing performance against hardware constraints and data privacy.
Key insights
AI agent capabilities are rapidly advancing, shifting focus to system-level orchestration, reliability, and efficient resource management.
Principles
- Agent reliability lags capability gains.
- Tool-interface text heavily influences agent success.
- Memory orchestration is key for LLM throughput.
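The first principle can be made concrete with a little arithmetic. The sketch below assumes every step of an agent run succeeds independently with the same probability, which is a simplification and not something the original article states; the function name and the 99% figure are illustrative only.

```python
def chain_success_rate(per_step_success: float, steps: int) -> float:
    """Probability that every step of a run succeeds.

    Assumes steps fail independently and share one success rate,
    which real agent runs will only approximate.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # Illustrative numbers: a step that is right 99% of the time.
    for steps in (1, 10, 50, 100):
        print(steps, round(chain_success_rate(0.99, steps), 3))
```

Under these assumptions, a 99%-reliable step repeated 100 times completes cleanly only about 37% of the time, which is one way small errors can dominate long-horizon runs even when the underlying model is strong.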
Method
Perplexity's "Computer" orchestrates parallel, asynchronous sub-agents with a coordinator model, routing tasks to specialist models for research, coding, or media, emphasizing distributed workflow management.
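The coordinator pattern described above can be sketched with Python's standard `asyncio` library. This is not Perplexity's implementation, whose internals are not described here; the specialist functions, the `SPECIALISTS` registry, and `coordinate` are hypothetical names, and each specialist is a stub standing in for a real model call.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple


# Hypothetical specialists. A real system would call a model or tool here.
async def research_agent(task: str) -> str:
    await asyncio.sleep(0)  # stand-in for network or model latency
    return f"research notes for: {task}"


async def coding_agent(task: str) -> str:
    await asyncio.sleep(0)
    return f"patch for: {task}"


async def media_agent(task: str) -> str:
    await asyncio.sleep(0)
    return f"asset for: {task}"


Specialist = Callable[[str], Awaitable[str]]

SPECIALISTS: Dict[str, Specialist] = {
    "research": research_agent,
    "coding": coding_agent,
    "media": media_agent,
}


async def coordinate(subtasks: List[Tuple[str, str]]) -> List[str]:
    """Route each (kind, task) pair to its specialist and run them concurrently."""
    # Validate before creating any coroutines so none is left un-awaited.
    unknown = [kind for kind, _ in subtasks if kind not in SPECIALISTS]
    if unknown:
        raise KeyError(f"no specialist registered for: {unknown}")
    jobs = [SPECIALISTS[kind](task) for kind, task in subtasks]
    # gather returns results in the same order as the submitted jobs.
    return list(await asyncio.gather(*jobs))


if __name__ == "__main__":
    plan = [("research", "survey agent benchmarks"), ("coding", "fix login bug")]
    print(asyncio.run(coordinate(plan)))
```

Because `asyncio.gather` preserves input order, the coordinator can merge sub-agent outputs deterministically even though the sub-agents finish at different times.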
In practice
- Run Qwen3.5-35B-A3B for local agentic coding on 32GB VRAM.
- Use the /ok alias in Aider for faster code modifications.
- Use OpenRouter to dynamically switch models for cost optimization.
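A back-of-envelope check shows why the first recommendation is plausible on a 32GB card. The sketch below assumes roughly 35 billion total parameters (inferred from the model name, not confirmed here) stored at 4 bits each, and it counts weights only: KV cache, activations, and quantization metadata are deliberately ignored, so real usage will be higher. The function name is illustrative.

```python
def weight_memory_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    if total_params_billion < 0 or bits_per_weight <= 0:
        raise ValueError("parameter count and bit width must be positive")
    total_bytes = total_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumed figures: ~35B total parameters at 4 bits per weight on a 32 GB GPU.
weights_gb = weight_memory_gb(35, 4)
headroom_gb = 32 - weights_gb  # left over for KV cache, activations, runtime

if __name__ == "__main__":
    print(f"weights: {weights_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```

If the "A3B" suffix denotes about 3B active parameters per token, as in earlier Qwen mixture-of-experts naming, all experts still have to be resident in memory, so the estimate uses the total parameter count rather than the active one.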
Topics
- AI Agents
- Large Language Models
- Model Performance Benchmarking
- AI Infrastructure
- AI Ethics & Policy
Code references
- nousresearch/hermes-agent
- pytorch/pytorch
- ggerganov/llama.cpp
- EleutherAI/lm-evaluation-harness
- huggingface/agents-course
Best for: NLP Engineer, CTO, VP of Engineering/Data, AI Engineer, Machine Learning Engineer, AI Researcher
Editorial summary, takeaway, and curation by AIssential. Original article published by AINews.