I Fine-Tuned an LLM to Reply Like My Wife — 10 Years of Chats, One Adapter, and Results That Amazed…
Summary
A backend engineer fine-tuned Qwen3-14B-bnb-4bit, a 4-bit quantized large language model (LLM), on a decade of personal WhatsApp chats (3.5 MB of text) so that it would reply in his wife's conversational style. The project ran on consumer hardware (an RTX 3080 Ti with 16 GB VRAM) using LoRA (Low-Rank Adaptation), and its goal was to understand how an LLM's behavior changes after fine-tuning. The engineer cleaned the multilingual dataset (English, Marathi, and Hindi) by parsing WhatsApp's complex export formats, removing system noise, and downweighting common one-word replies. After 4,500 training steps the model showed persona learning rather than mere memorization: it generated novel responses that captured her specific speech patterns and cautious tone. The main obstacles were Windows + CUDA dependency management and GGUF conversion.
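The cleaning steps described above (parsing the export, dropping system noise, downweighting frequent one-word replies) might look roughly like the sketch below. It is not the author's code. The line pattern assumes an Android-style export (`1/5/21, 9:15 PM - Name: message`), and the noise strings, the `cap` parameter, and the function names are all illustrative assumptions; real exports vary by platform and locale.

```python
import re
from collections import Counter

# Assumed export line shape: "1/5/21, 9:15 PM - Name: message".
# This pattern is an illustration, not a complete WhatsApp grammar.
LINE_RE = re.compile(
    r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}(?:\s?[APap][Mm])? - (?P<body>.*)$"
)
# Assumed placeholder texts that carry no conversational signal.
NOISE = {"<Media omitted>", "This message was deleted", "You deleted this message"}


def parse_export(lines):
    """Return (sender, text) tuples, merging multi-line messages."""
    messages = []
    open_message = False  # True while the last timestamped line was kept
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match is None:
            # No timestamp: continuation of the previous kept message.
            if open_message and line.strip():
                sender, text = messages[-1]
                messages[-1] = (sender, text + "\n" + line)
            continue
        sender, sep, text = match.group("body").partition(": ")
        text = text.strip()
        # A timestamped line without "Name: " is treated as a system notice.
        open_message = bool(sep) and text != "" and text not in NOISE
        if open_message:
            messages.append((sender, text))
    return messages


def reply_weights(replies, cap=2):
    """Give one-word replies seen more than `cap` times a weight of cap/count."""
    counts = Counter(r.lower() for r in replies)
    weights = []
    for reply in replies:
        n = counts[reply.lower()]
        is_one_word = len(reply.split()) == 1
        weights.append(cap / n if is_one_word and n > cap else 1.0)
    return weights
```

The weights could then be used to subsample training pairs or to scale a per-example loss; which of those the original project did is not stated in the summary.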
Key takeaway
AI Engineers and Machine Learning Engineers considering persona fine-tuning on consumer-grade hardware should prioritize meticulous data cleaning and aggressive version pinning for their environment. Do not rely solely on loss metrics; instead, define qualitative "sound right" criteria early in training and test against them. Expect GGUF/Ollama conversion hurdles on Windows, and validate the model's learned behavior through direct inference before investing heavily in deployment packaging.
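One way to make "sound right" criteria testable is to write them down as small checks that run on each generated reply. The sketch below is only an illustration: the specific criteria, the `max_words` threshold, the banned phrase, and the function name are assumptions, not the checks the original author used.

```python
def sounds_right(reply, training_replies, max_words=12, banned_phrases=("As an AI",)):
    """Return a dict of named pass/fail checks for one generated reply."""
    normalized = reply.strip().lower()
    seen = {t.strip().lower() for t in training_replies}
    return {
        # The model produced something at all.
        "non_empty": bool(normalized),
        # Chat replies in this persona are assumed to be short.
        "short_enough": len(reply.split()) <= max_words,
        # The base model's assistant voice should not leak through.
        "no_assistant_voice": not any(p.lower() in normalized for p in banned_phrases),
        # A verbatim training reply suggests memorization, not persona learning.
        "not_memorized": normalized not in seen,
    }
```

Running a fixed set of prompts through the model at several checkpoints and counting failed checks gives a rough signal to read alongside the loss curve.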
Key insights
Fine-tuning an LLM on messy personal data with consumer hardware can produce real persona learning: novel replies in the target person's style rather than memorized lines.
Principles
- Data quality is paramount for effective fine-tuning.
- LoRA enables persona adaptation on limited hardware.
- Qualitative evaluation is often more informative than loss metrics when judging persona fine-tuning.
Method
The author extracted and cleaned a multilingual WhatsApp chat export, applied LoRA to a 4-bit quantized Qwen3-14B model using Unsloth and TRL, and iteratively tested inference output for persona alignment.
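To see why LoRA fits on a consumer GPU, it helps to count parameters. LoRA freezes a weight matrix of shape `d_out x d_in` and trains two small matrices of shapes `d_out x r` and `r x d_in` instead. The sketch below does that arithmetic; the dimensions and rank are illustrative values, not the settings used in the original project.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (frozen_full_params, trainable_adapter_params) for one weight matrix."""
    full = d_in * d_out              # frozen base weight W
    adapter = rank * (d_in + d_out)  # trainable factors B (d_out x r) and A (r x d_in)
    return full, adapter


# Illustrative numbers only: a square 4096 x 4096 projection with rank 16.
full, adapter = lora_param_counts(d_in=4096, d_out=4096, rank=16)
print(full, adapter, adapter / full)  # 16777216 131072 0.0078125
```

In this example the adapter trains under 1% of the parameters of the matrix it modifies, which is what keeps gradient and optimizer memory small while the 4-bit base weights stay frozen.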
In practice
- Use Linux or WSL2 for LLM fine-tuning to avoid Windows + CUDA dependency problems.
- Aggressively pin software versions for stability.
- Validate model behavior via direct inference before tackling export formats.
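Version pinning is easier to enforce if the training script refuses to start when the environment has drifted. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name and the package versions in the usage example are placeholders, not the versions the original author pinned.

```python
from importlib import metadata


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for each mismatched or missing package."""
    problems = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    # Placeholder versions for illustration; substitute your own known-good set.
    PINS = {"torch": "0.0.0", "transformers": "0.0.0", "trl": "0.0.0"}
    for package, (expected, found) in check_pins(PINS).items():
        print(f"{package}: expected {expected}, found {found}")
```

The `get_version` parameter exists only so the check can be tested without installing anything; in normal use the default lookup is enough.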
Topics
- LLM Fine-tuning
- LoRA Adaptation
- Multilingual Chat Data
- Consumer GPU Training
- Data Preprocessing
Best for: AI Engineer, Machine Learning Engineer, AI Student
Editorial summary, takeaway, and curation by AIssential. Original article published by LLM on Medium.