🤖 AI Agents Weekly: GPT-5.3-Codex-Spark, GLM-5, MiniMax M2.5, Recursive Language Models, Harness Engineering, Agentica, and More
Summary
OpenAI has released GPT-5.3-Codex-Spark, an agentic coding model that combines frontier coding performance, reasoning, and professional knowledge capabilities while running 25% faster than its predecessor. The model was instrumental in its own development: the Codex team used early versions for debugging, deployment management, and evaluation diagnostics. Beyond coding, it produces professional knowledge-work outputs such as presentations and spreadsheets, winning or tying in 70.9% of evaluations on the GDPval benchmark. OpenAI rates it as the company's first model with "high" cybersecurity capability under its Preparedness Framework, a rating that indicates potential for real-world cyber harm, and has responded with a $10M API credits program for cyber defense research.
Concurrently, Zhipu AI launched GLM-5, a 744B-parameter Mixture-of-Experts (MoE) model with 40B active parameters, designed for agentic intelligence and multi-step reasoning. GLM-5 was trained entirely on Huawei Ascend chips using the MindSpore framework, so it does not depend on US semiconductor hardware. It features a native Agent Mode for autonomous task decomposition and can turn prompts into professional documents. Pre-training covered 28.5 trillion tokens, a 23.9% increase over GLM-4.7, and the model uses a novel reinforcement learning (RL) technique to lower hallucination rates. Released with open weights under an MIT license, it is available on OpenRouter at approximately $0.80 per million input tokens and $2.56 per million output tokens, which the summary describes as significantly cheaper than comparable proprietary models.
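To make the quoted GLM-5 pricing concrete, here is a minimal sketch of a per-workload cost estimate. The two prices are the approximate OpenRouter figures from the summary. The function name and the example token volumes are hypothetical, chosen only for illustration, and actual billing may differ.

```python
# Approximate GLM-5 prices on OpenRouter, as quoted in the summary
# (USD per million tokens). Treat these as estimates, not a rate card.
GLM5_INPUT_USD_PER_M = 0.80
GLM5_OUTPUT_USD_PER_M = 2.56


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float = GLM5_INPUT_USD_PER_M,
    output_price_per_m: float = GLM5_OUTPUT_USD_PER_M,
) -> float:
    """Return the estimated cost in USD for a given token volume."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical monthly workload: 50M input tokens, 10M output tokens.
    cost = estimate_cost_usd(50_000_000, 10_000_000)
    print(f"Estimated monthly cost: ${cost:.2f}")
```

Passing another model's prices through `input_price_per_m` and `output_price_per_m` gives a like-for-like comparison on the same workload.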
Key takeaway
CTOs and VPs of Engineering evaluating next-generation AI models should weigh two developments. First, models such as GPT-5.3-Codex-Spark are now used to help debug, deploy, and evaluate themselves, which has implications for internal tooling, and OpenAI's "high" cybersecurity rating signals real risk. Second, Zhipu AI's GLM-5 offers an open-weight, MIT-licensed alternative for agentic workflows that does not depend on US semiconductor hardware and could reduce operating costs. Teams should assess both models against their specific use cases, balancing capability, cost-efficiency, and security.
Key insights
New agentic coding and reasoning models show significant performance gains, and early versions of one were used to help debug, deploy, and evaluate it.
Principles
- Early versions of a model can help the team building it with debugging, deployment management, and evaluation diagnostics.
- A native agent mode lets a model break a task into steps and work through them autonomously.
Method
The Codex team used early versions of GPT-5.3-Codex-Spark for debugging, deployment management, and evaluation diagnostics during its development. GLM-5 uses a novel RL technique to reduce hallucinations and a native Agent Mode for task decomposition.
In practice
- Explore GPT-5.3-Codex-Spark for advanced coding and knowledge work.
- Consider GLM-5 for cost-effective, agentic multi-step reasoning.
- Investigate cyber defense research with OpenAI's API credits.
Topics
- Large Language Models
- AI Agents
- Code Generation
- Model Training
- Open-Source AI
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Machine Learning Engineer, AI Researcher
Editorial summary, takeaway, and curation by AIssential. Original article published by AI Newsletter.