Introduction
What Are Datasets?
FloTorch Datasets are collections of question-answer pairs used to evaluate and improve your AI models. They store your test data (ground truth) and examples in one place, so you can reuse them across evaluation projects and experiments.
Dataset Types
FloTorch supports two dataset types:
| Type | Purpose | Used For |
|---|---|---|
| RAG_EVALUATION | Evaluate RAG systems and pipelines | Testing retrieval accuracy, answer quality, comparing vector stores and embeddings. Requires a Knowledge Base. |
| MODEL_EVALUATION | Evaluate FloTorch model performance | Testing model accuracy, comparing versions, benchmarking configurations. No Knowledge Base required. |
How to Use Datasets in the Application
Step 1: Create a Dataset
- Go to your Workspace → Datasets
- Click Create Dataset
- Enter a name (lowercase letters, numbers, hyphens only)
- Optionally add a description
- Choose the type: RAG_EVALUATION or MODEL_EVALUATION
- Click Create
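The naming rule above can be checked before you open the form. The following is a minimal sketch, assuming the rule is exactly "one or more lowercase letters, digits, or hyphens"; the application may enforce further constraints (such as a length limit) that are not documented here, and `is_valid_dataset_name` is a hypothetical helper, not part of FloTorch.

```python
import re

# Assumption: a valid name is one or more lowercase letters, digits, or hyphens.
# Any additional rules the application applies (length, leading hyphen) are not covered.
NAME_PATTERN = re.compile(r"[a-z0-9-]+")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if the whole name uses only the allowed characters."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_dataset_name("customer-support-qa")` passes, while a name containing uppercase letters, spaces, or underscores does not.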
Step 2: Add Content to Your Dataset
After creating a dataset, add question-answer pairs. Choose one of these methods:
| Method | Available For | When to Use |
|---|---|---|
| Upload | Both types | You already have data in JSON/JSONL format |
| Manual Capture | Both types | You want to build Q&A pairs one by one with a model |
| Auto Capture | Both types | You want to collect real Q&A from Gateway traffic |
| Import from HuggingFace | Both types | You want to use a public dataset (e.g., MMLU, SQuAD) |
| Generate Synthetic | RAG_EVALUATION only | You have a PDF and want to generate Q&A from it |
Upload
- Open your dataset → Add Content → Upload
- Select your ground truth file (required) and optionally an examples file
- Files must be JSON or JSONL, max 10 MB each
- Upload and confirm
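Checking a file locally before uploading can catch format problems early. The sketch below validates a JSONL file against the size limit and confirms that each line parses as a JSON object. It rests on assumptions: "10 MB" is read as 10 MiB, and the record schema is not specified on this page, so any keys passed as `required_keys` (for example `question` and `answer`) are hypothetical. `check_jsonl_file` is an illustrative helper, not a FloTorch function.

```python
import json
import os

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" means 10 MiB


def check_jsonl_file(path, required_keys=()):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if os.path.getsize(path) > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 10 MB")
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as error:
                problems.append(f"line {line_number}: invalid JSON ({error.msg})")
                continue
            if not isinstance(record, dict):
                problems.append(f"line {line_number}: expected a JSON object")
                continue
            # required_keys is caller-supplied; the real schema is an assumption.
            for key in required_keys:
                if key not in record:
                    problems.append(f"line {line_number}: missing key '{key}'")
    return problems
```

Running it on a small sample first fits the advice to start with a small dataset and verify the format before adding more.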
Manual Capture
- Open your dataset → Add Content → Manual Capture
- Select a FloTorch Chat model and version
- Enter a question and click Get answer
- Edit the answer if needed, then add to the collection
- Repeat until you have enough pairs
- Click Save to upload to the dataset
Auto Capture
- Open your dataset → Add Content → Auto Capture
- Select one or more FloTorch Chat models to watch
- Set the target number of Q&A pairs (10–1000)
- Click Start — capture runs in the background
- When your Gateway receives traffic for those models, pairs are captured automatically
- You can stop capture early or wait until the target is reached
Import from HuggingFace
- Open your dataset → Add Content → Import from HuggingFace
- Enter the HuggingFace repository ID (e.g., `allenai/mmlu`)
- Specify the source file and map columns to question and answer
- Optionally add an examples file and mappings
- Click Import — the job runs in the background
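Column mapping, as used in the import steps above, amounts to renaming two source columns and dropping the rest. The sketch below illustrates the idea on plain dictionaries. The source column names (`prompt`, `target`) are hypothetical, and this is not how FloTorch performs the import internally.

```python
def map_columns(rows, question_column, answer_column):
    """Keep only the two mapped columns, renamed to 'question' and 'answer'."""
    return [
        {"question": row[question_column], "answer": row[answer_column]}
        for row in rows
    ]


# Hypothetical source rows; a real dataset will have its own column names.
source_rows = [
    {"prompt": "What is 2 + 2?", "target": "4", "split": "test"},
]
mapped = map_columns(source_rows, question_column="prompt", answer_column="target")
```

Here `mapped` contains one record with only the `question` and `answer` keys; the unmapped `split` column is discarded.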
Generate Synthetic (RAG_EVALUATION only)
- Open your dataset → Add Content → Generate Synthetic
- Upload a PDF file (max 50 MB)
- Select a FloTorch Chat model
- Enter how many Q&A pairs to generate (1–10,000)
- Click Generate — the job runs in the background
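Auto Capture and Generate Synthetic each accept a pair count within a fixed range. The sketch below encodes both ranges so a script can validate a value before you enter it. It assumes the bounds are inclusive; the method keys and `check_pair_count` are hypothetical names chosen for illustration.

```python
# Ranges taken from the steps above; inclusive bounds are an assumption.
PAIR_COUNT_LIMITS = {
    "auto_capture": (10, 1000),
    "generate_synthetic": (1, 10_000),
}


def check_pair_count(method, count):
    """Return count unchanged if it is within range, otherwise raise ValueError."""
    low, high = PAIR_COUNT_LIMITS[method]
    if not low <= count <= high:
        raise ValueError(
            f"{method}: pair count must be between {low} and {high}, got {count}"
        )
    return count
```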
Step 3: Use Your Dataset in Evaluation Projects
- Go to Evaluations → Projects → Create Project
- Select your dataset from the list
- For RAG_EVALUATION: also select a Knowledge Base
- For MODEL_EVALUATION: no Knowledge Base needed
- Configure and run experiments
Your dataset provides the ground truth used to score model responses.
Ways to Add Content (Summary)
| Method | RAG_EVALUATION | MODEL_EVALUATION |
|---|---|---|
| Upload files | ✓ | ✓ |
| Manual Capture | ✓ | ✓ |
| Auto Capture (Gateway) | ✓ | ✓ |
| Import from HuggingFace | ✓ | ✓ |
| Generate Synthetic (PDF) | ✓ | — |
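The same availability matrix can be expressed as a lookup, which is convenient if you script dataset preparation. This is a sketch only: the dataset type names come from this page, while the method identifiers and `is_method_available` are hypothetical and do not correspond to a documented FloTorch API.

```python
# Methods available to both dataset types (identifiers are illustrative).
COMMON_METHODS = {"upload", "manual_capture", "auto_capture", "huggingface_import"}

AVAILABLE_METHODS = {
    "RAG_EVALUATION": COMMON_METHODS | {"generate_synthetic"},
    "MODEL_EVALUATION": set(COMMON_METHODS),
}


def is_method_available(dataset_type, method):
    """Return True if the content method is listed for the dataset type."""
    return method in AVAILABLE_METHODS.get(dataset_type, set())
```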
- Use clear dataset names that describe the purpose (e.g., `customer-support-qa`)
- Add a description so your team understands the contents
- Start with a small dataset to verify format, then add more
- Replace files to update datasets — uploading a new file of the same type overwrites the previous one
- Dataset names cannot be changed after creation
- Datasets cannot be deleted
- Max 10 MB per uploaded file; 50 MB for synthetic PDF source