Prepare your dataset
Process your data into the conversation or preference format described in the dataset preprocessing guide.
Verify that a small sample looks correct before training.
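A quick sanity check can be done in plain Python before launching any training job. The sketch below validates one record against each schema; the field names (`messages`/`role`/`content` for conversations, `prompt`/`chosen`/`rejected` for preferences) are assumptions based on common chat and preference-data conventions, not a format defined by this guide.

```python
# Minimal schema checks for a sample record. Field names are assumed
# conventions (e.g. as commonly used by chat templates and preference
# trainers); adjust them to match your preprocessing guide.

def is_conversation(sample):
    """Conversation format: {"messages": [{"role": ..., "content": ...}, ...]}."""
    msgs = sample.get("messages")
    return (
        isinstance(msgs, list)
        and len(msgs) > 0
        and all(
            isinstance(m, dict)
            and m.get("role") in {"system", "user", "assistant"}
            and isinstance(m.get("content"), str)
            for m in msgs
        )
    )

def is_preference(sample):
    """Preference format: a prompt plus a chosen and a rejected response."""
    return all(isinstance(sample.get(k), str) for k in ("prompt", "chosen", "rejected"))

conv = {"messages": [{"role": "user", "content": "Hi"},
                     {"role": "assistant", "content": "Hello!"}]}
pref = {"prompt": "Hi", "chosen": "Hello!", "rejected": "Go away."}

assert is_conversation(conv)
assert is_preference(pref)
```

Running checks like these over a handful of records catches missing fields and role typos far more cheaply than a failed training run.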
Select base model and method
Pick a Gemma size that fits your compute budget, then choose SFT (supervised fine-tuning), DPO/ORPO (preference optimization), or GRPO (reasoning with reward signals).
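The decision above can be sketched as a tiny helper. The rule it encodes is an assumption distilled from this step, not an official API: reward signals point to GRPO, preference pairs to DPO/ORPO, and plain labeled conversations to SFT.

```python
# Hedged sketch of the method choice described above. The mapping is an
# assumption based on what data each method consumes.

def pick_method(has_preference_pairs: bool, has_reward_signal: bool) -> str:
    """Return the post-training method suggested by the available data."""
    if has_reward_signal:
        return "GRPO"      # reasoning tasks with verifiable rewards
    if has_preference_pairs:
        return "DPO/ORPO"  # chosen/rejected preference data
    return "SFT"           # plain supervised conversation data

print(pick_method(has_preference_pairs=True, has_reward_signal=False))  # -> DPO/ORPO
```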
Enable PEFT and quantization if needed
Start with QLoRA for strong results on modest hardware; reserve full fine-tuning for when you need maximal capacity.
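A typical QLoRA setup combines 4-bit quantization of the base model with a LoRA adapter. The sketch below uses the `transformers` and `peft` libraries; the hyperparameter values and target module names are illustrative assumptions, not recommendations from this guide.

```python
# Hedged sketch of a QLoRA configuration: 4-bit NF4 quantization
# (bitsandbytes via transformers) plus a LoRA adapter (peft). All values
# below are illustrative assumptions; tune them for your model and budget.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # run compute in bf16
    bnb_4bit_use_double_quant=True,         # also quantize quantization constants
)

lora_config = LoraConfig(
    r=16,                    # adapter rank
    lora_alpha=32,           # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
# Pass bnb_config as quantization_config when loading the base model, then
# wrap the model with the adapter (e.g. peft.get_peft_model).
```

Only the small LoRA adapter is trained while the quantized base weights stay frozen, which is what keeps memory use low enough for modest hardware.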