PEFT of SLM for Telecommunications Customer Support: A Comparative Study of LoRA Configurations with Energy Consumption Analysis

2026-06-06 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

A systematic study investigates parameter-efficient fine-tuning (PEFT) with Low-Rank Adaptation (LoRA) on Qwen2.5-3B to build a domain-specific conversational assistant for telecommunications customer support. The methodology introduces a combinatorial approach to synthetic dataset generation: a glossary of 52 industry-specific terms is expanded into approximately 30,000 training examples covering 1,560 distinct problem scenarios by a generative pipeline built on Gemini 2.0 Flash. The researchers evaluated 16 LoRA configurations, systematically varying hyperparameters and target modules. Beyond traditional performance metrics, the study adds energy consumption analysis (284-1371 Wh, a nearly 5x spread) and qualitative evaluation using the LLM-as-a-judge methodology with GPT-5.2 and Claude 4.5 Sonnet. The findings reveal a striking divergence: the configuration with the lowest validation loss (0.5024) ranks 6th-7th qualitatively, while the configuration with the highest validation loss (0.6807) is ranked 1st by both LLM judges. Validation loss alone is therefore insufficient for selecting conversational AI models.

Key takeaway

For MLOps engineers deploying domain-specific conversational AI, selecting a model on validation loss alone is misleading: the fine-tuning configuration with the lowest loss may not deliver the best perceived conversational quality. Integrate LLM-as-a-judge evaluations (e.g., with GPT-5.2 or Claude 4.5 Sonnet) and energy consumption analysis into your selection pipeline, and prioritize configurations that balance qualitative performance and energy efficiency, such as configuration 4 or 8, over those that merely minimize loss.
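A selection step of this kind could be sketched as follows. This is a minimal illustration, not the study's pipeline: the record type, the run names, the split of the "6th-7th" rank across the two judges, and both energy figures are hypothetical placeholders. Only the two validation losses and the 1st-place rank echo the summary above.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RunResult:
    name: str           # hypothetical label for one LoRA configuration
    val_loss: float     # final validation loss
    judge_ranks: tuple  # rank assigned by each LLM judge (1 = best)
    energy_wh: float    # measured training energy in watt-hours


def pick_by_loss(runs):
    """Baseline: choose the run with the lowest validation loss."""
    return min(runs, key=lambda r: r.val_loss)


def pick_by_judges(runs, energy_budget_wh=None):
    """Choose the run with the best mean judge rank, breaking ties on energy.

    Runs above the optional energy budget are excluded first.
    """
    eligible = [
        r for r in runs
        if energy_budget_wh is None or r.energy_wh <= energy_budget_wh
    ]
    if not eligible:
        raise ValueError("no run fits within the energy budget")
    return min(eligible, key=lambda r: (mean(r.judge_ranks), r.energy_wh))


# Placeholder records: the energy values are invented for illustration.
runs = [
    RunResult("lowest_loss", 0.5024, (6, 7), 900.0),
    RunResult("highest_loss", 0.6807, (1, 1), 500.0),
]

print(pick_by_loss(runs).name)
print(pick_by_judges(runs).name)
```

The two selectors disagree on these records, which is the divergence the summary describes. Ranking on judge scores first and energy second is one possible policy; a weighted score would be an equally reasonable choice.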

Key insights

Validation loss alone is insufficient for selecting conversational AI models; qualitative evaluation is crucial.

Principles

Broader LoRA target module coverage reduces loss more than higher rank.
Lower LoRA ranks (e.g., r=16) can outperform higher ranks (r=32).
Fast convergence can reduce total training energy despite higher per-step cost.
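The cost of the first two choices can be compared with a parameter count. For a weight matrix of shape d_out x d_in, LoRA adds r * (d_in + d_out) trainable parameters, so doubling the rank doubles the adapter size, while widening module coverage adds parameters according to the shapes of the newly targeted matrices. The sketch below is arithmetic only and does not reproduce the study's 16 configurations. The layer dimensions are assumptions taken from the published Qwen2.5-3B model configuration and should be verified against the actual checkpoint.

```python
# Assumed Qwen2.5-3B shapes (verify against the checkpoint's config.json).
HIDDEN = 2048      # hidden size
KV_DIM = 256       # 2 key/value heads x 128 head dimension
FFN = 11008        # feed-forward intermediate size
NUM_LAYERS = 36    # number of transformer blocks

# Each entry is (d_in, d_out) for one linear projection inside a block.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, FFN),
    "up_proj": (HIDDEN, FFN),
    "down_proj": (FFN, HIDDEN),
}

ATTENTION = ["q_proj", "k_proj", "v_proj", "o_proj"]
ALL_LINEAR = list(MODULE_SHAPES)


def lora_trainable_params(rank, target_modules, num_layers=NUM_LAYERS):
    """Count LoRA parameters: rank * (d_in + d_out) per targeted matrix."""
    per_layer = sum(
        rank * (MODULE_SHAPES[m][0] + MODULE_SHAPES[m][1])
        for m in target_modules
    )
    return per_layer * num_layers


for label, rank, modules in [
    ("attention only, r=16", 16, ATTENTION),
    ("attention only, r=32", 32, ATTENTION),
    ("attention + FFN, r=16", 16, ALL_LINEAR),
]:
    print(f"{label}: {lora_trainable_params(rank, modules):,} trainable parameters")
```

Under these assumed shapes, covering the FFN projections at r=16 yields a larger adapter than doubling the rank on attention alone. That is consistent with the first principle, although parameter count by itself does not explain the loss results.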

Method

Generate synthetic data by factorizing domain knowledge (terms, causes, contexts) and expanding with an LLM (Gemini 2.0 Flash).
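The factorization step can be sketched as a Cartesian product over the three axes named above. This is a minimal sketch under stated assumptions: the term, cause, and context lists and the prompt wording are hypothetical placeholders, not the study's glossary or prompts, and the call to the generator model is omitted because the summary does not describe it.

```python
from itertools import product

# Hypothetical placeholder factors; the study's glossary has 52 terms.
TERMS = ["SIM swap", "roaming", "APN"]
CAUSES = ["misconfiguration", "billing hold"]
CONTEXTS = ["prepaid mobile", "business fiber"]

# Hypothetical prompt wording for the expansion step.
PROMPT_TEMPLATE = (
    "Write {n} customer-support dialogues about '{term}' "
    "where the root cause is '{cause}' and the customer is on a "
    "'{context}' plan."
)


def build_scenarios(terms, causes, contexts):
    """Enumerate every (term, cause, context) combination as a scenario."""
    return [
        {"term": t, "cause": c, "context": x}
        for t, c, x in product(terms, causes, contexts)
    ]


def build_prompt(scenario, n_examples=3):
    """Render the prompt that would be sent to the generator LLM."""
    return PROMPT_TEMPLATE.format(n=n_examples, **scenario)


scenarios = build_scenarios(TERMS, CAUSES, CONTEXTS)
print(len(scenarios), "scenarios")
print(build_prompt(scenarios[0]))
```

Each prompt would then be expanded into several training examples by the generator model. The scenario count grows multiplicatively with each axis, which is how a short glossary can yield a large and evenly covered dataset.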

In practice

Use r=16 when fine-tuning a 3B-parameter model for conversation.
Target the attention modules for simple datasets; add the FFN modules for complex ones.
Include LLM-as-a-judge scores and energy measurements in model selection.
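The first two recommendations can be expressed as a small configuration helper. This is a sketch, not the study's setup: the helper name, the alpha multiplier, and the dropout value are assumptions, and the module names are the projection names used by Qwen2-family checkpoints in Hugging Face Transformers. The helper returns plain keyword arguments so that it runs without extra dependencies.

```python
ATTENTION_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]
FFN_MODULES = ["gate_proj", "up_proj", "down_proj"]


def lora_kwargs(rank=16, include_ffn=False, alpha_multiplier=2, dropout=0.05):
    """Build keyword arguments for a LoRA configuration.

    alpha_multiplier and dropout are illustrative defaults, not values
    reported in the study.
    """
    targets = ATTENTION_MODULES + (FFN_MODULES if include_ffn else [])
    return {
        "r": rank,
        "lora_alpha": rank * alpha_multiplier,
        "target_modules": targets,
        "lora_dropout": dropout,
        "bias": "none",
        "task_type": "CAUSAL_LM",
    }


simple_dataset_cfg = lora_kwargs()                    # attention only
complex_dataset_cfg = lora_kwargs(include_ffn=True)   # attention + FFN

# With the peft library installed, the dictionary can be passed on directly:
#   from peft import LoraConfig
#   config = LoraConfig(**complex_dataset_cfg)
print(simple_dataset_cfg["target_modules"])
print(complex_dataset_cfg["target_modules"])
```

Keeping the module lists separate makes the attention-only versus attention-plus-FFN choice a single flag, which is convenient when sweeping configurations.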

Topics

Parameter-Efficient Fine-Tuning
LoRA Configurations
Telecommunications AI
Synthetic Data Generation
LLM-as-a-Judge
Energy Efficiency

Best for: Research Scientist, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.