Latest open artifacts (#18): Arcee's 400B MoE, LiquidAI's underrated 1B model, new Kimi, and anticipation of a busy month

· Source: Interconnects AI · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Intermediate, quick

Summary

January 2026 saw a slower pace of open model releases than the previous year, though several noteworthy models emerged amid anticipation of upcoming releases such as DeepSeek V4 and Claude Sonnet 5. LiquidAI released LFM2.5-1.2B-Instruct, an update pretrained on 28T tokens that significantly improves on its predecessor, outperforming Qwen3 1.7B and approaching Qwen3 4B 2507 Instruct despite being less than a third of that model's size. Arcee-ai introduced Trinity-Large-Preview, an ultra-sparse Mixture-of-Experts (MoE) model with 400B total and 13B active parameters, accompanied by a technical report and base models. Moonshotai's Kimi-K2.5, a multimodal model continually pretrained on 15T tokens, showed strong coding and agentic abilities, with some users switching to it from Claude Opus 4.5 for specific tasks, though its writing capabilities reportedly suffered. Other releases included zai-org's GLM-4.7-Flash and LLM360's K2-Think-V2, a truly open reasoning model.

Key takeaway

NLP engineers evaluating open-source models for deployment should consider LiquidAI's LFM2.5-1.2B-Instruct, a highly efficient option that rivals larger models in performance. Its small size and extensive pretraining make it a compelling choice for resource-constrained environments or applications that need fast inference. Arcee's Trinity-Large-Preview is also worth investigating if you need a large MoE model for specialized tasks that activates only 13B of its 400B parameters.
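To make the deployment trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes weights are stored at 2 bytes per parameter (bf16/fp16), which the article does not state, and it ignores the KV cache, activations, and runtime overhead. The helper name `weight_memory_gb` is hypothetical, chosen for illustration only.

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: 2 bytes per parameter (bf16/fp16). Quantized weights need less.
    """
    # billions of parameters * bytes per parameter = billions of bytes = GB
    return params_billion * bytes_per_param


# Parameter counts are taken from the summary above.
lfm_gb = weight_memory_gb(1.2)        # LFM2.5-1.2B-Instruct
trinity_gb = weight_memory_gb(400.0)  # Trinity-Large-Preview, all experts resident

# Share of Trinity's parameters that are active: 13B of 400B.
trinity_active_share = 13.0 / 400.0

print(f"LFM2.5-1.2B weights:   ~{lfm_gb:.1f} GB")
print(f"Trinity-Large weights: ~{trinity_gb:.0f} GB")
print(f"Trinity active share:  {trinity_active_share:.2%}")
```

Under these assumptions the 1.2B model fits comfortably on a single consumer device, whereas the MoE model still needs memory for all 400B parameters even though only a small share of them is active; sparsity reduces compute, not weight storage.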

Key insights

Smaller open models are achieving competitive performance against larger counterparts through extensive pretraining and architectural innovations.

Method

LiquidAI's LFM2.5-1.2B-Instruct improved on its predecessor by extending pretraining from 10T to 28T tokens, showing how much a small model can still gain from additional data.
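The scale of that data budget is easier to appreciate as a tokens-per-parameter ratio. The sketch below uses only the figures quoted above; it assumes the nominal 1.2B parameter count applies to both the 10T and 28T runs, which the summary does not confirm, and the function name `tokens_per_param` is illustrative.

```python
# Figures from the summary; the shared parameter count is an assumption.
PARAMS = 1.2e9          # nominal parameter count of LFM2.5-1.2B
TOKENS_BEFORE = 10e12   # pretraining tokens before the update
TOKENS_AFTER = 28e12    # pretraining tokens after the update


def tokens_per_param(tokens: float, params: float) -> float:
    """Number of training tokens seen per model parameter."""
    return tokens / params


ratio_before = tokens_per_param(TOKENS_BEFORE, PARAMS)
ratio_after = tokens_per_param(TOKENS_AFTER, PARAMS)
growth = TOKENS_AFTER / TOKENS_BEFORE

print(f"Before: ~{ratio_before:,.0f} tokens per parameter")
print(f"After:  ~{ratio_after:,.0f} tokens per parameter")
print(f"Token budget grew {growth:.1f}x")
```

By this estimate the model saw on the order of twenty thousand tokens per parameter, a 2.8x increase in the token budget over the earlier run.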

Best for: NLP Engineer, Computer Vision Engineer, Research Scientist, AI Engineer, Machine Learning Engineer, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Interconnects AI.