Gemma 4 Fine-Tuning on Single GPU | Training Gemma 4 With Hugging Face on Custom Dataset (πŸ”΄ Live)

Β· Source: Venelin Valkov Β· Field: Technology & Digital β€” Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics Β· Depth: Intermediate, extended

Summary

This content details a successful fine-tuning process for the Gemma 4 (2B parameter) model on a custom dataset, addressing issues encountered in a previous attempt. The author demonstrates replacing the `unswot` library with Hugging Face's `trl` library, specifically using `sft trainer`, and following Google's recommended pipeline. The fine-tuning was performed on a machine with 24 GB of VRAM, utilizing a custom dataset of 1,000 examples (900 for training, 100 for testing) structured as Slack messages requiring JSON output for priority and on-call page status. The process involved data preparation into a ShareGPT-like format, LoRA adapter configuration targeting all linear modules, and training for four epochs. Post-training evaluation showed a 99% accuracy on the test set, with the model successfully generating valid JSON outputs, a significant improvement over the untrained base model. The content also covers model saving, merging the LoRA adapter with the base model, and a brief discussion on quantization for inference optimization.

Key takeaway

For AI Engineers and ML Students looking to fine-tune Gemma 4 for structured output tasks, prioritize using Hugging Face's `trl` library with `sft trainer` and LoRA adapters. Ensure your custom dataset is well-formatted (e.g., ShareGPT-like) and sufficiently large (at least 1,000 examples) with balanced representation across categories. This approach can yield high accuracy and enable deployment on GPUs with 24GB VRAM, even for models with 5 billion parameters.

Key insights

Fine-tuning Gemma 4 with Hugging Face's `trl` library and LoRA adapters significantly improves structured JSON output accuracy.

Principles

Method

The method involves preparing a custom dataset into a ShareGPT-like format, configuring LoRA adapters targeting all linear modules, and training the Gemma 4 base model using Hugging Face's `sft trainer` for four epochs on a 24GB VRAM GPU.

In practice

Topics

Best for: Machine Learning Engineer, AI Engineer, AI Student

Related on AIssential

Open in AIssential β†’

Editorial summary, takeaway, and curation by AIssential. Original article published by Venelin Valkov.