This Week in Open Models: Tiny LFM2.5, Ornith-1.0, and GLM-5.2 REAP

2026-04-15 · Source: The Kaitchup – AI on a Budget · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Advanced, medium

Summary

This week's open model releases include Liquid AI's LFM2.5-230M, a compact 230M-parameter model with strong instruction-following results on IFBench and IFEval. It comes close to Llama 3.1 8B on those benchmarks despite having roughly 35x fewer parameters. LFM2.5-230M has 14 layers and an FFN dimension of 2560, and occupies only 459 MB.

DeepReinforce also launched Ornith-1.0, a family of agentic coding models (9B Dense, 35B MoE, 397B MoE) based on Gemma 4 and Qwen 3.5. Ornith-1.0 is trained with a "self-scaffolding" approach, in which the model iteratively improves its own problem-solving orchestration.

Additionally, 0xSero introduced GLM-5.2 REAP, a 504B version of Z.ai's 753B GLM-5.2 MoE. This version prunes 88 of 256 routed experts per layer (34.4%) and uses Router-KD to retrain only the router gates. The goal is easier deployment, though pruning may degrade world knowledge and increase inference costs.
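The figures quoted in the summary can be sanity-checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check: it assumes nominal parameter counts (8e9 and 230e6) and 16-bit weights, neither of which is stated in the source, and the function names are made up for illustration.

```python
# Back-of-the-envelope checks for the numbers quoted in the summary.
# Assumptions (not from the source): nominal parameter counts and
# 2 bytes per parameter (16-bit weights), with 1 MB = 10**6 bytes.

def param_ratio(large_params: float, small_params: float) -> float:
    """How many times more parameters the larger model has."""
    return large_params / small_params

def weight_size_mb(params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the weights in MB."""
    return params * bytes_per_param / 1e6

def pruned_fraction(pruned: int, total: int) -> float:
    """Fraction of routed experts removed per layer."""
    return pruned / total

ratio = param_ratio(8e9, 230e6)   # Llama 3.1 8B vs. LFM2.5-230M
size_mb = weight_size_mb(230e6)   # LFM2.5-230M at 16-bit precision
frac = pruned_fraction(88, 256)   # GLM-5.2 REAP experts pruned per layer

print(f"parameter gap: {ratio:.1f}x")
print(f"approx. weight size: {size_mb:.0f} MB")
print(f"experts pruned: {frac:.1%}")
```

Under these assumptions the gap comes out just under 35x and the weights at about 460 MB, in line with the reported 459 MB. The total parameter count of GLM-5.2 REAP shrinks by slightly less than 34.4% (504B of 753B remain), which is what one would expect if only routed experts are pruned.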

Key takeaway

If you are an MLOps Engineer or AI Scientist deploying large language models, evaluate smaller models like LFM2.5-230M for instruction-following tasks: they offer significant memory savings with competitive performance. If you are developing agentic coding solutions, explore Ornith-1.0's self-scaffolding approach as a way to increase model autonomy and improve problem-solving. Be cautious with MoE pruning methods like GLM-5.2 REAP: they reduce model size, but you must check carefully for degraded world knowledge and for longer generations, which raise inference cost.

Key insights

Small models can follow instructions far better than their size suggests, self-scaffolding lets agentic coding models improve their own orchestration, and expert pruning with router-only retraining shrinks large MoE models.

Principles

A much smaller model can approach larger-model performance on instruction following.
Self-scaffolding improves agentic model training.
Targeted pruning reduces MoE size.

Method

Ornith-1.0 uses reinforcement learning to train models to propose and refine task-specific scaffolds, then generate solutions, with rewards assigned to both scaffold and solution steps. GLM-5.2 REAP prunes low-saliency MoE experts and applies Router-KD to retrain only router gates.
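The GLM-5.2 REAP recipe described above can be illustrated with a toy example. This is a minimal sketch, not 0xSero's implementation: the saliency scores are random placeholders, the router is simplified to a plain softmax over surviving experts (real MoE routers typically also apply top-k selection), and using a KL-divergence distillation loss for Router-KD is an assumption, since the source does not specify the objective. All function names are hypothetical.

```python
import math
import random

def select_kept_experts(saliency, num_to_prune):
    """Drop the num_to_prune lowest-saliency experts; return kept indices in order."""
    if not 0 <= num_to_prune < len(saliency):
        raise ValueError("num_to_prune must be in [0, number of experts)")
    by_saliency = sorted(range(len(saliency)), key=lambda i: saliency[i])
    pruned = set(by_saliency[:num_to_prune])
    return [i for i in range(len(saliency)) if i not in pruned]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def route(router_logits, kept):
    """Routing probabilities renormalized over the surviving experts only."""
    probs = softmax([router_logits[i] for i in kept])
    return dict(zip(kept, probs))

def kd_loss(teacher_logits, student_logits):
    """KL(teacher || student) between output distributions.

    Assumed objective: in a router-only setup, this loss would be minimized
    while updating just the router gate weights and freezing everything else.
    """
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy layer with 256 routed experts, pruning 88 as in the summary.
rng = random.Random(0)
saliency = [rng.random() for _ in range(256)]   # placeholder scores
kept = select_kept_experts(saliency, num_to_prune=88)
routing = route([rng.gauss(0.0, 1.0) for _ in range(256)], kept)

print(len(kept), "experts kept per layer")
```

After pruning, the router's original gate values no longer match the reduced expert set, which is the motivation for retraining the gates against the unpruned model's outputs rather than touching the expert weights.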

In practice

Evaluate 230M models for instruction following.
Explore self-scaffolding for agentic tasks.
Consider expert pruning for MoE deployment.

Topics

Open Models
Mixture-of-Experts
Agentic AI
Model Pruning
Instruction Following
Knowledge Distillation

Best for: AI Engineer, NLP Engineer, AI Architect, AI Scientist, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by The Kaitchup – AI on a Budget.