Gemma 4 Fine-Tuning on Single GPU | Training Gemma 4 With Hugging Face on Custom Dataset (π΄ Live)
Summary
This content details a successful fine-tuning process for the Gemma 4 (2B parameter) model on a custom dataset, addressing issues encountered in a previous attempt. The author demonstrates replacing the `unswot` library with Hugging Face's `trl` library, specifically using `sft trainer`, and following Google's recommended pipeline. The fine-tuning was performed on a machine with 24 GB of VRAM, utilizing a custom dataset of 1,000 examples (900 for training, 100 for testing) structured as Slack messages requiring JSON output for priority and on-call page status. The process involved data preparation into a ShareGPT-like format, LoRA adapter configuration targeting all linear modules, and training for four epochs. Post-training evaluation showed a 99% accuracy on the test set, with the model successfully generating valid JSON outputs, a significant improvement over the untrained base model. The content also covers model saving, merging the LoRA adapter with the base model, and a brief discussion on quantization for inference optimization.
Key takeaway
For AI Engineers and ML Students looking to fine-tune Gemma 4 for structured output tasks, prioritize using Hugging Face's `trl` library with `sft trainer` and LoRA adapters. Ensure your custom dataset is well-formatted (e.g., ShareGPT-like) and sufficiently large (at least 1,000 examples) with balanced representation across categories. This approach can yield high accuracy and enable deployment on GPUs with 24GB VRAM, even for models with 5 billion parameters.
Key insights
Fine-tuning Gemma 4 with Hugging Face's `trl` library and LoRA adapters significantly improves structured JSON output accuracy.
Principles
- Use `trl`'s `sft trainer` for Gemma 4 fine-tuning.
- LoRA adapters enable efficient training on smaller GPUs.
- A balanced dataset improves model performance.
Method
The method involves preparing a custom dataset into a ShareGPT-like format, configuring LoRA adapters targeting all linear modules, and training the Gemma 4 base model using Hugging Face's `sft trainer` for four epochs on a 24GB VRAM GPU.
In practice
- Target all linear modules for LoRA adapters.
- Aim for at least 1,000 balanced examples for fine-tuning.
- Merge LoRA adapter with base model for deployment.
Topics
- Gemma 4 Fine-tuning
- Hugging Face TRL Library
- LoRA Adapters
- Single GPU Training
- Custom Dataset
Best for: Machine Learning Engineer, AI Engineer, AI Student
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Venelin Valkov.