1
Pick your data source
Upload a file or import a Hugging Face dataset by repository name.
Text and image data are both supported, so you can prepare datasets for multimodal training.
2
Map fields to roles
Map each column or key to a conversation role (system, user, or assistant), or to the chosen and rejected fields for preference data.
3
Choose processing mode
Select Language Modeling, Prompt-only, or Preference Tuning. The mode determines the format of the processed output.
4
Preview and validate
Inspect a sample of the processed dataset and fix any mapping issues.
Confirm examples look correct before running the full job.
5
Process and version
Run the job to generate a versioned, training-ready dataset you can reuse across experiments.
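
To make the steps above concrete, here is a minimal Python sketch of what mapping, mode selection, validation, and versioning could look like if done by hand. It is not the product's implementation. The column names (`instructions`, `question`, `answer`), the mode strings, the output shapes, and the helper names `to_messages`, `process`, `validate`, and `version_id` are all assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical field mapping: source column/key -> conversation role.
ROLE_MAP = {"instructions": "system", "question": "user", "answer": "assistant"}


def to_messages(row, role_map):
    """Turn one raw row into a list of {"role", "content"} messages."""
    return [
        {"role": role, "content": row[column]}
        for column, role in role_map.items()
        if row.get(column)  # skip missing or empty fields
    ]


def process(row, mode, role_map=ROLE_MAP):
    """Shape one row for the chosen mode (mode names and shapes are assumed)."""
    if mode == "preference":
        # Preference data: a shared prompt plus a chosen/rejected pair.
        return {
            "prompt": row["prompt"],
            "chosen": row["chosen"],
            "rejected": row["rejected"],
        }
    messages = to_messages(row, role_map)
    if mode == "prompt_only":
        # Drop assistant turns so only the prompt side remains.
        return {"prompt": [m for m in messages if m["role"] != "assistant"]}
    if mode == "language_modeling":
        return {"messages": messages}
    raise ValueError(f"unknown mode: {mode}")


def validate(example, mode):
    """Return a list of problems found in one processed example."""
    problems = []
    if mode == "preference":
        if example["chosen"] == example["rejected"]:
            problems.append("chosen and rejected are identical")
        return problems
    key = "prompt" if mode == "prompt_only" else "messages"
    roles = [m["role"] for m in example[key]]
    if "user" not in roles:
        problems.append("no user message")
    if mode == "language_modeling" and "assistant" not in roles:
        problems.append("no assistant message")
    return problems


def version_id(examples):
    """Derive a short content hash that identifies this processed dataset."""
    payload = json.dumps(examples, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


# Made-up sample rows; the second one has an empty answer on purpose.
rows = [
    {"instructions": "Be concise.", "question": "What is 2 + 2?", "answer": "4"},
    {"instructions": "", "question": "Name a prime.", "answer": ""},
]
processed = [process(r, "language_modeling") for r in rows]

# Preview: list any problems per example before running a full job.
for i, example in enumerate(processed):
    print(i, validate(example, "language_modeling"))
print("version:", version_id(processed))
```

In this sketch the second sample row is flagged because its answer field is empty, which is the kind of mapping issue the preview step is meant to surface. Hashing the processed output is only one possible way to identify a dataset version; how the tool itself tracks versions is not specified here.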