This Week in Open Models: Tiny LFM2.5, Ornith-1.0, and GLM-5.2 REAP
Summary
This week's open model releases include Liquid AI's LFM2.5-230M, a compact 230M-parameter model whose instruction-following results on IFBench and IFEval approach those of Llama 3.1 8B despite a roughly 35x parameter gap. It has 14 layers and an FFN dimension of 2560, and occupies only 459 MB. DeepReinforce also launched Ornith-1.0, a family of agentic coding models (9B dense, 35B MoE, 397B MoE) based on Gemma 4 and Qwen 3.5. Ornith-1.0 uses a "self-scaffolding" training approach in which the model iteratively improves its own problem-solving orchestration. Additionally, 0xSero introduced GLM-5.2 REAP, a 504B pruned version of Z.ai's 753B GLM-5.2 MoE. It prunes 88 of 256 routed experts per layer (34.4%) and uses Router-KD to retrain only the router gates. The goal is easier deployment, though pruning may degrade world knowledge and raise inference cost because the model generates more tokens.
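The figures in the summary can be sanity-checked with a few lines of arithmetic. The sketch below assumes 16-bit weights (2 bytes per parameter) and decimal megabytes, neither of which is stated in the source, and it treats the rounded sizes (230M, 8B, 504B, 753B) as exact counts.

```python
def param_ratio(large_params, small_params):
    """How many times larger one model is than another."""
    return large_params / small_params


def weight_footprint_mb(params, bytes_per_param=2):
    """Weight storage in decimal MB; 2 bytes/param is an assumption (16-bit weights)."""
    return params * bytes_per_param / 1e6


def fraction(part, whole):
    """Share of `whole` represented by `part`."""
    return part / whole


# Llama 3.1 8B vs. LFM2.5-230M
ratio = param_ratio(8e9, 230e6)

# LFM2.5-230M weights under the 16-bit assumption
footprint_mb = weight_footprint_mb(230e6)

# GLM-5.2 REAP: share of routed experts removed per layer
expert_share_pruned = fraction(88, 256)

# GLM-5.2 REAP: share of total parameters removed (753B -> 504B)
total_share_removed = fraction(753 - 504, 753)

print(f"parameter gap: {ratio:.1f}x")
print(f"weight footprint: {footprint_mb:.0f} MB")
print(f"experts pruned per layer: {expert_share_pruned:.1%}")
print(f"total parameters removed: {total_share_removed:.1%}")
```

Under these assumptions the parameter gap comes out just under 35x and the weight footprint is about 460 MB, close to the reported 459 MB. The total parameter reduction (about 33%) is slightly smaller than the 34.4% share of pruned experts, which would be consistent with part of the model, outside the routed experts, being left unpruned.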
Key takeaway
MLOps engineers and AI scientists deploying large language models should evaluate small models such as LFM2.5-230M for instruction-following tasks, since they offer significant memory savings with competitive performance. If you are developing agentic coding solutions, explore Ornith-1.0's self-scaffolding approach as a way to improve model autonomy and problem-solving. Be cautious with MoE pruning methods like GLM-5.2 REAP: they reduce model size, but you must check carefully for degraded world knowledge and for an increase in the number of tokens generated at inference.
Key insights
Small models can deliver instruction-following performance well above what their parameter count suggests, self-scaffolding training lets agentic coding models improve their own orchestration, and expert pruning with router-only retraining shrinks large MoE models.
Principles
- A much smaller parameter count can still deliver competitive instruction-following performance.
- Self-scaffolding improves agentic model training.
- Targeted pruning reduces MoE size.
Method
Ornith-1.0 uses reinforcement learning to train models to propose and refine task-specific scaffolds, then generate solutions, with rewards assigned to both scaffold and solution steps. GLM-5.2 REAP prunes low-saliency MoE experts and applies Router-KD to retrain only router gates.
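The pruning half of the method can be illustrated with a minimal sketch. It assumes per-expert saliency scores are already available (the source does not say how they are computed, so random values stand in for them here), and it reads Router-KD as a KL-divergence loss that pulls the retrained router toward the original router's distribution over the surviving experts. That reading, and every function name below, is an assumption for illustration rather than the released implementation.

```python
import math
import random


def select_experts_to_keep(saliency, num_to_prune):
    """Drop the `num_to_prune` lowest-saliency experts; return surviving indices in order."""
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i])
    pruned = set(ranked[:num_to_prune])
    return [i for i in range(len(saliency)) if i not in pruned]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def router_kd_loss(teacher_logits, student_logits, kept):
    """KL(teacher || student), with the teacher restricted to the surviving experts."""
    p = softmax([teacher_logits[i] for i in kept])
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


random.seed(0)

# Placeholder saliency scores for one layer with 256 routed experts.
saliency = [random.random() for _ in range(256)]
kept = select_experts_to_keep(saliency, num_to_prune=88)

# Hypothetical router logits for a single token.
teacher_logits = [random.gauss(0.0, 1.0) for _ in range(256)]

# A student router initialized from the teacher's surviving gates starts at zero loss.
student_logits = [teacher_logits[i] for i in kept]
loss_at_init = router_kd_loss(teacher_logits, student_logits, kept)

# Any drift in the student gates makes the loss positive.
drifted_logits = list(student_logits)
drifted_logits[0] += 1.0
loss_after_drift = router_kd_loss(teacher_logits, drifted_logits, kept)

print(len(kept), loss_at_init, loss_after_drift)
```

In this sketch only the router logits would receive gradients, which mirrors the description of retraining the router gates while leaving the remaining expert weights untouched.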
In practice
- Evaluate 230M models for instruction following.
- Explore self-scaffolding for agentic tasks.
- Consider expert pruning for MoE deployment.
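For the first item, instruction-following benchmarks such as IFEval rely on instructions that can be verified programmatically. The toy harness below shows the idea with three hypothetical checkers and hard-coded responses standing in for model output. It is not the IFEval or IFBench harness, and all names are illustrative.

```python
def at_least_n_words(response, n):
    """True if the response has at least n whitespace-separated words."""
    return len(response.split()) >= n


def contains_keyword(response, keyword):
    """True if the keyword appears in the response, ignoring case."""
    return keyword.lower() in response.lower()


def is_all_lowercase(response):
    """True if the response contains no uppercase letters."""
    return response == response.lower()


def prompt_level_accuracy(samples):
    """Fraction of samples whose response satisfies every attached check."""
    passed = sum(
        all(check(sample["response"]) for check in sample["checks"])
        for sample in samples
    )
    return passed / len(samples)


# Hard-coded responses; in a real evaluation these would come from the model.
samples = [
    {
        "response": "open models keep getting smaller and faster.",
        "checks": [is_all_lowercase, lambda r: contains_keyword(r, "open models")],
    },
    {
        "response": "Too short.",
        "checks": [lambda r: at_least_n_words(r, 5)],
    },
]

score = prompt_level_accuracy(samples)
print(f"prompt-level accuracy: {score:.2f}")
```

Running the same checks over the outputs of a 230M model and a larger baseline gives a like-for-like comparison on the instructions that matter for a given deployment.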
Topics
- Open Models
- Mixture-of-Experts
- Agentic AI
- Model Pruning
- Instruction Following
- Knowledge Distillation
Best for: AI Engineer, NLP Engineer, AI Architect, AI Scientist, Machine Learning Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by The Kaitchup – AI on a Budget.