Preference Fine-Tuning LFM 2 Using DPO

2026-01-02 · Source: Analytics Vidhya · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Intermediate, long

Summary

Liquid Foundation Models 2 (LFM 2) are a family of small language models (SLMs) from Liquid AI optimized for edge devices, combining strong reasoning and instruction following with low latency. The models come in 350M, 700M, 1.2B, and 2.6B parameter sizes, each with a 32,768-token context window. The architecture interleaves multiplicative-gated short-range convolution layers with grouped query attention blocks, a layout found through hardware-in-the-loop search targeting CPUs and embedded accelerators. LFM 2 reports up to 2x faster prefill and decode on CPU and 3x more efficient training than its predecessors, and it outperforms many similarly sized models on benchmarks such as GSM8K and IFEval. The family also includes multimodal variants for vision-language and audio. This article walks through fine-tuning LFM2-700M with Direct Preference Optimization (DPO) and LoRA.

Key takeaway

For AI engineers and data scientists building on-device language model applications, LFM 2 combined with Direct Preference Optimization (DPO) offers a practical path to deploying aligned, high-performing SLMs. The approach delivers competitive reasoning and instruction following on constrained hardware without the separate reward model and reinforcement learning loop that make traditional RLHF pipelines complex and costly, which keeps customization and deployment affordable.

Key insights

Pairing LFM 2's efficient architecture with DPO fine-tuning yields high-performing SLMs that can be deployed on edge devices.

Principles

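Hybrid architectures that mix short-range convolutions with attention can make SLMs efficient enough for edge devices.
DPO simplifies alignment relative to RLHF by training directly on preference pairs, with no separate reward model.
LoRA makes fine-tuning feasible on limited hardware by training small low-rank adapters while the base weights stay frozen.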
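
To make the DPO principle concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is an illustration in plain Python, not the article's training code: the function names, the default `beta=0.1`, and the example log-probabilities are assumptions chosen for readability.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under
    either the trainable policy or the frozen reference model.
    beta=0.1 is an assumed, commonly used default.
    """
    # How much more the policy favours each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) equals softplus(-logits).
    return softplus(-logits)


# Hypothetical log-probabilities: the policy has shifted toward the chosen answer.
example = dpo_loss(-8.0, -14.0, -10.0, -12.0)
```

The loss needs only log-probabilities from the policy and a frozen reference, so training reduces to a supervised objective over preference pairs. When the policy matches the reference the loss equals log 2, and it falls as the policy favours the chosen response more strongly than the reference does.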

Method

Fine-tune LFM2-700M with DPO on the mlabonne/orpo-dpo-mix-40k preference dataset, applying LoRA for parameter efficiency, then merge the LoRA adapters into the base weights and save the merged model for inference.
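
The method could be sketched roughly as follows with Hugging Face TRL and PEFT. This is a hedged outline, not the original article's script. It assumes the model is published as `LiquidAI/LFM2-700M`, that a recent TRL release is installed (one where `DPOTrainer` accepts `processing_class`), and that the dataset's `chosen`/`rejected` columns are in a format TRL can consume directly. Every hyperparameter and the output path are illustrative placeholders.

```python
# Sketch only: values marked "assumed" or "hypothetical" are not from the article.
MODEL_ID = "LiquidAI/LFM2-700M"            # assumed Hugging Face repo id
DATASET_ID = "mlabonne/orpo-dpo-mix-40k"   # dataset named in the article
OUTPUT_DIR = "lfm2-700m-dpo"               # hypothetical output path

# Assumed LoRA settings; "all-linear" asks PEFT to adapt every linear layer
# except the output head, which avoids hard-coding LFM 2 module names.
LORA_KWARGS = dict(r=16, lora_alpha=32, lora_dropout=0.05,
                   target_modules="all-linear", task_type="CAUSAL_LM")

# Assumed training settings, sized for a single modest GPU.
DPO_KWARGS = dict(output_dir=OUTPUT_DIR, beta=0.1, learning_rate=5e-6,
                  per_device_train_batch_size=1, gradient_accumulation_steps=8,
                  num_train_epochs=1, logging_steps=10)


def run_dpo():
    """Train LoRA adapters with DPO, merge them, and save the merged model."""
    # Heavy imports live here so the constants above can be inspected cheaply.
    from datasets import load_dataset
    from peft import LoraConfig
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import DPOConfig, DPOTrainer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    dataset = load_dataset(DATASET_ID, split="train")

    # With a peft_config and no explicit ref_model, TRL uses the base model
    # with adapters disabled as the frozen reference.
    trainer = DPOTrainer(
        model=model,
        args=DPOConfig(**DPO_KWARGS),
        train_dataset=dataset,
        processing_class=tokenizer,
        peft_config=LoraConfig(**LORA_KWARGS),
    )
    trainer.train()

    # Fold the LoRA weights into the base model so inference needs no PEFT.
    merged_dir = OUTPUT_DIR + "-merged"
    merged = trainer.model.merge_and_unload()
    merged.save_pretrained(merged_dir)
    tokenizer.save_pretrained(merged_dir)
    return merged_dir
```

Calling `run_dpo()` starts training and returns the directory holding the merged model, which can then be loaded with `AutoModelForCausalLM.from_pretrained` like any other checkpoint.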

In practice

Use LFM 2 for on-device AI applications requiring low latency.
Apply DPO with LoRA for efficient SLM alignment.
Experiment with different LFM 2 variants and preference datasets.
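
When comparing variants or adapter settings, it helps to estimate how many parameters LoRA actually trains. The arithmetic below is a small sketch. The layer width of 1536 and the rank of 16 are hypothetical values picked for illustration, not LFM 2's real dimensions.

```python
def full_params(d_in, d_out):
    """Weights in a dense d_out x d_in linear layer (bias ignored)."""
    return d_in * d_out


def lora_trainable_params(d_in, d_out, rank):
    """Weights LoRA trains for that layer: A is rank x d_in, B is d_out x rank."""
    return rank * d_in + d_out * rank


# Hypothetical square projection layer and adapter rank.
d, r = 1536, 16
frozen = full_params(d, d)                  # 2,359,296 frozen weights
trainable = lora_trainable_params(d, d, r)  # 49,152 trainable weights
fraction = trainable / frozen               # 1/48, roughly 2.1%
```

For a square layer the trainable fraction is `2 * rank / d`, so halving the rank halves the adapter size. That gives a quick way to budget memory before launching a run.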

Topics

Liquid Foundation Models (LFM 2)
Small Language Models
Direct Preference Optimization
Parameter-Efficient Fine-Tuning
Edge AI

Best for: Machine Learning Engineer, AI Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Analytics Vidhya.