SKILL.md convert to LoRA Adapters (from Harness to CORE)

2026-06-22 · Source: Discover AI · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Advanced, extended

Summary

Skill to LoRA (S2L) is a new method that moves the procedural knowledge in "skill.md" files into a Large Language Model (LLM) as parametric knowledge, so the skill text no longer has to be injected into the context on every call and token costs fall. Developed at the Chinese University of Hong Kong, S2L works in two phases. In the offline phase, two LLMs generate 64 synthetic input/output training examples per skill. In the training phase, 4-bit QLoRA fine-tunes a small LoRA adapter of 6.03 million parameters (about 0.02% of the base Qwen 3.6 27B model). On a software development skill benchmark, S2L improved pass rates by 2.9-5.2% and reduced token costs by 6.6%. The study also found that single-skill LoRA adapters outperform shared adapters, which it attributes to interference in low-rank subspaces.
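The adapter-size figure can be checked with a few lines of arithmetic. The sketch below recomputes the quoted fraction from the two reported numbers and shows the standard LoRA count of rank × (d_in + d_out) trainable parameters per adapted weight matrix. The 4096 × 4096 matrix shape is a hypothetical example, not a dimension taken from the study.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # LoRA learns B (d_out x rank) and A (rank x d_in); the base weight stays frozen.
    return rank * (d_in + d_out)


ADAPTER_PARAMS = 6_030_000      # adapter size quoted in the summary
BASE_PARAMS = 27_000_000_000    # nominal size of the 27B base model

fraction_percent = 100 * ADAPTER_PARAMS / BASE_PARAMS
print(f"adapter share of base model: {fraction_percent:.4f}%")

# Hypothetical 4096 x 4096 projection at rank 16 (shape chosen for illustration only).
example_params = lora_param_count(d_in=4096, d_out=4096, rank=16)
print(f"one rank-16 adapter on a 4096 x 4096 matrix: {example_params} parameters")
```

The computed share is roughly 0.022%, which matches the rounded 0.02% in the summary.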

Key takeaway

AI engineers integrating specific procedural skills into LLM agents should consider the Skill to LoRA (S2L) methodology. It embeds skills directly in the model's parametric knowledge, cutting token costs and context window pollution relative to injecting skill text at runtime. Train a separate LoRA adapter for each skill, since shared adapters can degrade performance when behavioral patterns conflict. Synthetic training data generated by LLMs is a practical starting point for the QLoRA fine-tuning.

Key insights

Skill to LoRA adapters parametrically integrate procedural knowledge into LLMs, reducing token costs and improving performance.

Principles

Parametric skill integration reduces LLM context window overhead.
Single-skill LoRA adapters prevent destructive interference in low-rank spaces.
Synthetic data generation can efficiently create LoRA training sets.
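One way to act on the single-skill principle is to keep exactly one adapter per skill and pick it by name at request time. The sketch below is an illustration only: `SkillAdapter`, `AdapterRegistry`, the skill names, and the adapter paths are all hypothetical, and no model is loaded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillAdapter:
    skill: str          # hypothetical skill name, e.g. the stem of a skill.md file
    adapter_path: str   # hypothetical location of that skill's trained LoRA weights


class AdapterRegistry:
    """Maps each skill to its own adapter; refuses to share one slot between skills."""

    def __init__(self) -> None:
        self._adapters: dict[str, SkillAdapter] = {}

    def register(self, adapter: SkillAdapter) -> None:
        # One adapter per skill: a second registration for the same skill is an error.
        if adapter.skill in self._adapters:
            raise ValueError(f"skill already has an adapter: {adapter.skill}")
        self._adapters[adapter.skill] = adapter

    def select(self, skill: str) -> SkillAdapter:
        # Raises KeyError for unknown skills instead of falling back to a shared adapter.
        return self._adapters[skill]


registry = AdapterRegistry()
registry.register(SkillAdapter("code-review", "adapters/code-review"))
registry.register(SkillAdapter("test-writing", "adapters/test-writing"))

chosen = registry.select("code-review")
print(chosen.adapter_path)
```

Failing loudly on an unknown skill, rather than defaulting to a catch-all adapter, keeps the conflicting-behavior problem from creeping back in.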

Method

The S2L method involves an offline phase where two LLMs generate 64 synthetic input/output pairs per skill, followed by a training phase using 4-bit QLoRA to fine-tune a rank 16 adapter on a frozen base LLM.
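The offline phase can be sketched as a loop over two model calls. The summary does not say how the two LLMs divide the work, so this sketch assumes one proposes task inputs and the other writes the matching outputs. `build_training_set`, `propose_input`, and `write_output` are hypothetical names, and the two stub functions stand in for real LLM calls.

```python
from typing import Callable

PAIRS_PER_SKILL = 64  # number of synthetic pairs per skill quoted in the summary


def build_training_set(
    skill_text: str,
    propose_input: Callable[[str, int], str],
    write_output: Callable[[str, str], str],
    n: int = PAIRS_PER_SKILL,
) -> list[dict[str, str]]:
    """Build n input/output pairs for one skill (assumed division of labor)."""
    examples = []
    for index in range(n):
        task = propose_input(skill_text, index)   # first model: invent a task
        answer = write_output(skill_text, task)   # second model: solve it per the skill
        examples.append({"input": task, "output": answer})
    return examples


# Stubs standing in for the two LLMs so the sketch runs without any model.
def fake_proposer(skill_text: str, index: int) -> str:
    return f"Task {index}: apply the skill to case {index}"


def fake_writer(skill_text: str, task: str) -> str:
    return f"Following the skill ({skill_text[:24]}...), response to: {task}"


dataset = build_training_set(
    "Always write a failing test before the fix.", fake_proposer, fake_writer
)
print(len(dataset), dataset[0]["input"])
```

Note that the skill text appears only while the data is generated. The training pairs themselves carry the behavior, which is what lets the adapter replace runtime injection.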

In practice

Use 4-bit QLoRA with rank 16 for efficient skill integration.
Generate synthetic training data with LLMs for LoRA fine-tuning.
Avoid shared LoRA adapters for multiple, potentially conflicting skills.
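For the first recommendation, a configuration along the following lines is one plausible starting point with the Hugging Face `transformers` and `peft` libraries. Only the 4-bit setting and rank 16 come from the summary. The NF4 quantization type, compute dtype, `lora_alpha`, dropout, and target module names are assumptions chosen for illustration, and the study's actual hyperparameters may differ.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit quantization of the frozen base model (QLoRA-style).
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # from the summary: 4-bit QLoRA
    bnb_4bit_quant_type="nf4",              # assumption: NF4, a common QLoRA choice
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bf16 compute
)

# One rank-16 adapter, to be trained separately for each skill.
lora_config = LoraConfig(
    r=16,                                   # from the summary: rank 16
    lora_alpha=32,                          # assumption: not stated in the source
    lora_dropout=0.05,                      # assumption: not stated in the source
    target_modules=["q_proj", "v_proj"],    # assumption: attention projections
    task_type="CAUSAL_LM",
)
```

The quantization config would be passed when loading the base model and the LoRA config when wrapping it for training. Repeating the run once per skill yields the separate adapters the third recommendation calls for.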

Topics

Skill to LoRA (S2L)
LoRA Adapters
QLoRA Fine-tuning
LLM Skill Integration
Synthetic Data Generation
Qwen 3.6 27B
Agentic LLMs

Best for: NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Discover AI.