Preference Fine-Tuning LFM 2 Using DPO

Source: Analytics Vidhya · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Intermediate, long

Summary

Liquid Foundation Models (LFM 2) are a new class of small language models (SLMs) optimized for edge devices, offering strong reasoning and instruction-following with high efficiency and low latency. Available in 350M, 700M, 1.2B, and 2.6B parameter sizes, LFM 2 models support a 32,768-token context window. The architecture combines multiplicative gated short-range convolutional layers with grouped query attention blocks, a layout identified via hardware-in-the-loop search for efficiency on CPUs and embedded accelerators. LFM 2 reports up to 2x faster prefill and decode on CPU than Qwen3 and 3x more efficient training than the previous LFM generation, and it outperforms many similarly sized models on benchmarks such as GSM8K and IFEval. The LFM 2 family also extends to multimodal applications such as vision-language and audio. This article details fine-tuning the LFM2-700M model using Direct Preference Optimization (DPO) and LoRA.

Key takeaway

For AI engineers and data scientists developing on-device language model applications, LFM 2 combined with Direct Preference Optimization (DPO) offers a practical path to deploying high-performing, aligned SLMs. Consider this approach to achieve competitive reasoning and instruction-following on constrained hardware without the separate reward model and reinforcement learning loop of a traditional RLHF pipeline, which keeps customization and deployment comparatively cheap.
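
To make the contrast with RLHF concrete, here is a minimal sketch of the per-pair DPO objective in plain Python. It assumes the inputs are summed sequence log-probabilities of the chosen and rejected responses under the trainable policy and a frozen reference model. The function name, argument names, and the default beta of 0.1 are illustrative and not taken from the original article.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss from summed sequence log-probabilities."""
    # Implicit rewards: how much more likely the policy makes each
    # response compared with the frozen reference model, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # Loss is -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: margin is 0, loss is log(2).
baseline = dpo_loss(-10.0, -14.0, -10.0, -14.0)

# Policy has shifted probability toward the chosen response: lower loss.
improved = dpo_loss(-10.0, -14.0, -12.0, -12.0)
print(baseline, improved)
```

The loss depends only on log-probabilities from two forward passes, so training reduces to ordinary supervised optimization over preference pairs. No reward model is fitted and no responses are sampled during training.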

Key insights

LFM 2 models combine efficient architecture with DPO fine-tuning for high-performance, edge-deployable SLMs.

Method

Fine-tune LFM2-700M with DPO on the mlabonne/orpo-dpo-mix-40k dataset, applying LoRA for parameter efficiency, then merge the LoRA adapter into the base weights and save the resulting model for inference.
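
The LoRA and merge steps can be illustrated with a small, dependency-free sketch. It assumes the standard LoRA formulation, in which a frozen weight W receives a low-rank update scaled by alpha / rank. The matrix sizes, the rank, and the helper names below are hypothetical and do not reflect LFM2-700M's actual dimensions or the article's settings.

```python
def lora_param_count(d_out, d_in, rank):
    """Trainable parameters LoRA adds for one d_out x d_in weight matrix."""
    # B has shape (d_out, rank) and A has shape (rank, d_in).
    return rank * (d_out + d_in)


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Fold the adapter into the base weight: W' = W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Hypothetical 1024 x 1024 projection with rank 8: the adapter trains
# far fewer parameters than the full matrix.
full_params = 1024 * 1024
adapter_params = lora_param_count(1024, 1024, rank=8)

# Tiny merge example with rank 1.
base = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]   # shape (2, 1)
a = [[0.5, 0.0]]     # shape (1, 2)
merged = merge_lora(base, b, a, alpha=2.0, rank=1)
print(adapter_params, full_params, merged)
```

After merging, the model has the same shape as the base model, so inference needs no adapter code. In practice a library such as PEFT would handle this step (for example via `merge_and_unload`) rather than hand-written matrix code.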

Best for: Machine Learning Engineer, AI Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Analytics Vidhya.