Fine‑tune a Gemma model on your domain data with a simple, repeatable flow. You’ll pick a base model, choose a training method (SFT, DPO/ORPO, or GRPO), and train with full fine‑tuning, LoRA, or QLoRA depending on your resources. Vision variants additionally accept image+text inputs.
1. Prepare your dataset

Process your data into conversation format (for SFT) or preference pairs (for DPO/ORPO) using the dataset preprocessing guide.
Verify that a small sample looks correct before training.
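
For instance, a quick pass over the first rows can catch malformed records before a long run. This is a minimal sketch assuming a JSONL file (the path train.jsonl is a placeholder) with one conversation per line in the common messages format:

```python
import json

# Sanity-check the first rows of a conversation-format JSONL file.
# "train.jsonl" is a placeholder path; adjust to your dataset.
with open("train.jsonl") as f:
    for i, line in enumerate(f):
        example = json.loads(line)
        for msg in example["messages"]:
            # Roles should be system/user/assistant; content must be non-empty text.
            assert msg["role"] in {"system", "user", "assistant"}, msg
            assert isinstance(msg["content"], str) and msg["content"], msg
        if i < 3:
            print(json.dumps(example, indent=2)[:400])  # eyeball a few samples
        if i >= 99:
            break  # the first 100 rows are usually enough for a spot check
```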
2. Select base model and method

Pick a Gemma size that fits your compute budget, then choose a method: SFT (supervised fine‑tuning), DPO/ORPO (preference optimization), or GRPO (reasoning with reward signals).
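
As a sketch of what this choice looks like in code, recent versions of TRL expose a trainer per method (SFTTrainer, DPOTrainer, ORPOTrainer, GRPOTrainer). The model id and hyperparameters below are illustrative, not prescriptive:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Conversation-format data loads directly; SFTTrainer applies the chat template.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model="google/gemma-2-2b-it",  # pick the Gemma size that fits your budget
    train_dataset=dataset,
    args=SFTConfig(output_dir="gemma-sft", num_train_epochs=1),
)
# Swapping SFTTrainer for DPOTrainer/ORPOTrainer (preference pairs) or
# GRPOTrainer (reward functions) changes the method, not the overall flow.
```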
3. Enable PEFT and quantization if needed

Start with QLoRA, which gives strong results on modest hardware; reserve full fine‑tuning for when you need maximum capacity and have the memory for it.
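
Here is a minimal QLoRA sketch using transformers, peft, and bitsandbytes; the 4‑bit settings and LoRA rank/target modules are common starting points, not tuned values:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the base model in 4-bit NF4 (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it", quantization_config=bnb_config, device_map="auto"
)

# Attach low-rank adapters to the attention projections.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # confirm only a small fraction is trainable
```

Passing the same LoraConfig as peft_config to SFTTrainer achieves the same result without wrapping the model yourself.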
4. Launch and monitor

Start the job and watch training and validation loss to catch divergence or overfitting early.
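
Continuing the TRL sketch above, a held‑out split plus periodic evaluation surfaces problems while the run is still cheap to stop. The intervals are illustrative, and the eval_strategy argument is named evaluation_strategy in older transformers releases:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Carve out a small validation split so eval loss is tracked during training.
splits = load_dataset("json", data_files="train.jsonl", split="train")
splits = splits.train_test_split(test_size=0.05, seed=42)

args = SFTConfig(
    output_dir="gemma-sft",
    logging_steps=10,         # frequent loss logging catches divergence early
    eval_strategy="steps",
    eval_steps=100,
    save_steps=100,
    report_to="tensorboard",  # or "wandb" if configured
)
trainer = SFTTrainer(
    model="google/gemma-2-2b-it",
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    args=args,
)
trainer.train()  # watch for rising eval loss while train loss keeps falling
```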
5. Evaluate and iterate

Test the model on held‑out prompts, compare it to the base model, and iterate on data or settings.
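
One lightweight way to compare against the baseline is to load the tuned LoRA adapter and toggle it off for the same prompt; the adapter directory and prompt below are placeholders:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
tuned = PeftModel.from_pretrained(base, "gemma-sft")  # LoRA adapter directory

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Summarize our returns policy."}],  # placeholder
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(tuned.device)

with torch.no_grad():
    tuned_out = tuned.generate(**inputs, max_new_tokens=200)
    with tuned.disable_adapter():  # same weights minus the adapter = baseline
        base_out = tuned.generate(**inputs, max_new_tokens=200)

prompt_len = inputs["input_ids"].shape[1]
print("tuned:", tokenizer.decode(tuned_out[0][prompt_len:], skip_special_tokens=True))
print("base: ", tokenizer.decode(base_out[0][prompt_len:], skip_special_tokens=True))
```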
Next: open the Training guide for step‑by‑step configuration details.