Liquid AI's smallest model yet, LFM2.5-230M, beats models 4X its size at data extraction, can run 'anywhere'
Summary
Liquid AI, founded by former MIT computer scientists, released its LFM2.5-230M language model on June 25, 2026. The 230-million-parameter foundation model is designed for on-device agentic workflows and local deployment on smartphones, laptops, and robotics hardware. It reportedly outperforms models up to roughly 4X its size, such as the 800-million-parameter Alibaba Qwen3.5-0.8B and the 1-billion-parameter Google Gemma 3 1B, on data extraction and tool-use benchmarks like BFCLv3 and CaseReportBench. LFM2.5-230M uses the LFM2 hybrid architecture, which combines gated short-range convolutions with grouped-query attention, supports a 32K context window, and keeps its memory footprint under 400MB. It reaches decode speeds of 213 tokens per second on a Samsung Galaxy S25 Ultra and 42 tokens per second on a Raspberry Pi 5. The model is available under a dual-use commercial license that is free for entities with under $10 million in annual revenue; larger enterprises need a paid agreement.
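The figures in the summary can be sanity-checked with some back-of-envelope arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the weight precisions (16, 8, and 4 bits) are assumptions because the summary does not say how the model is stored, and the estimate counts weights only, ignoring the KV cache and runtime overhead.

```python
# Back-of-envelope checks on the published figures.
# Assumption: bits-per-weight values are illustrative; the summary does not
# state the precision used to reach the "under 400MB" footprint.

PARAMS = 230_000_000  # LFM2.5-230M parameter count, from the summary


def weight_megabytes(n_params, bits_per_weight):
    """Approximate weight storage in decimal megabytes (weights only)."""
    return n_params * bits_per_weight / 8 / 1_000_000


def decode_seconds(n_tokens, tokens_per_second):
    """Time to generate n_tokens at a steady decode rate."""
    return n_tokens / tokens_per_second


def size_ratio(other_params, base_params=PARAMS):
    """How many times larger another model is than the base model."""
    return other_params / base_params


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_megabytes(PARAMS, bits):.0f} MB")
    print(f"Qwen3.5-0.8B ratio: {size_ratio(800_000_000):.2f}x")
    print(f"Gemma 3 1B ratio:   {size_ratio(1_000_000_000):.2f}x")
    print(f"500 tokens on S25 Ultra: {decode_seconds(500, 213):.1f} s")
    print(f"500 tokens on Pi 5:      {decode_seconds(500, 42):.1f} s")
```

Under these assumptions, 16-bit weights alone would take about 460 MB, so a footprint under 400MB suggests some form of lower-precision storage, though the summary does not confirm which.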
Key takeaway
For AI Engineers or Directors of AI/ML evaluating on-device AI solutions, Liquid AI's LFM2.5-230M is worth a close look. If your team needs to automate data extraction or deploy agentic workflows on edge hardware, this 230-million-parameter model reportedly delivers strong extraction and tool-use results for its size, and running it locally avoids cloud inference costs and network round-trip latency. Assess it for local deployment on smartphones, robotics, or other constrained environments.
Key insights
Liquid AI's LFM2.5-230M demonstrates that highly efficient, small models can surpass larger ones for specific on-device data extraction and tool-use tasks.
Principles
- Architectural efficiency is key for edge AI performance.
- On-device deployment reduces cloud costs and latency.
- Specialized small models optimize targeted workflows.
Method
The LFM2.5-230M model employs a hybrid LFM2 architecture, interleaving gated short-range convolutions with grouped-query attention to process information efficiently with a 32K context window.
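To make the two building blocks named above more concrete, here is a toy, single-channel sketch in plain Python. It is not Liquid AI's implementation: the real layers operate on multi-channel tensors with learned projections, and the function names, the sigmoid gate, and the head counts are assumptions chosen for illustration. The first function shows a causal convolution over a short window whose output is scaled by a gate; the second shows how grouped-query attention lets several query heads share one key/value head.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_short_conv(x, kernel, gate_logits):
    """Causal 1-D convolution over a short window, then an elementwise gate.

    x:           input sequence (one channel)
    kernel:      short filter; kernel[0] weights the current step
    gate_logits: per-step gate pre-activations (hypothetical; a real layer
                 would compute these from the input with a learned projection)
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            if t - j >= 0:  # causal: only look at current and past steps
                acc += w * x[t - j]
        out.append(sigmoid(gate_logits[t]) * acc)
    return out


def kv_head_for(q_head, n_q_heads, n_kv_heads):
    """Grouped-query attention: map a query head to the KV head it shares."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


if __name__ == "__main__":
    print(gated_short_conv([1, 2, 3, 4], [0.5, 0.5], [0, 0, 0, 0]))
    print([kv_head_for(q, 8, 2) for q in range(8)])
```

The short kernel keeps per-token work constant regardless of sequence length, and sharing key/value heads shrinks the KV cache, which is one plausible reason such a design suits memory-constrained devices.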
In practice
- Implement lightweight data extraction pipelines.
- Enable autonomous edge systems on constrained hardware.
- Convert unstructured text into structured JSON.
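A lightweight extraction pipeline of the kind listed above usually needs a guard between the model and downstream code, because small models can wrap JSON in extra text or omit fields. The sketch below is a minimal example under stated assumptions: it does not call any model, the raw output string stands in for whatever a local runtime returns, and the schema fields (`age`, `diagnosis`) are hypothetical.

```python
import json

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"age": int, "diagnosis": str}


def parse_extraction(raw_output, required=REQUIRED_FIELDS):
    """Pull the outermost JSON object out of model text and validate it.

    Raises ValueError if no object is found, the JSON is malformed,
    a required field is missing, or a field has the wrong type.
    """
    start = raw_output.find("{")
    end = raw_output.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    # json.JSONDecodeError is a subclass of ValueError.
    record = json.loads(raw_output[start:end + 1])
    for name, expected_type in required.items():
        if name not in record:
            raise ValueError(f"missing field: {name}")
        if not isinstance(record[name], expected_type):
            raise ValueError(f"field {name} should be {expected_type.__name__}")
    return record


if __name__ == "__main__":
    # Stand-in for text returned by a locally running model.
    raw = 'Here is the result: {"age": 54, "diagnosis": "pneumonia"}'
    print(parse_extraction(raw))
```

Rejecting malformed records at this boundary lets the caller retry or fall back instead of passing bad data to the rest of the pipeline.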
Topics
- Liquid AI
- LFM2.5-230M
- Edge AI
- Data Extraction
- Agentic Workflows
- Small Language Models
Best for: AI Architect, NLP Engineer, CTO, AI Engineer, Machine Learning Engineer, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.