Dataset Management

This guide covers how to manage datasets in FloTorch, from creation to viewing and downloading.

  1. Go to your Workspace → Datasets
  2. Click Create Dataset (or New Dataset)
  3. Enter the dataset information:
    • Name: Unique identifier (lowercase letters, numbers, hyphens only)
    • Description: Optional summary of the dataset’s purpose
    • Type: Select RAG_EVALUATION or MODEL_EVALUATION
  4. Click Create
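
Because names cannot be changed later, it can help to check a candidate name before creating the dataset. The following is a minimal sketch, assuming the rule is exactly what is stated above (lowercase letters, numbers, hyphens). The function names are hypothetical, and FloTorch may enforce additional constraints, such as length limits, that are not covered here.

```python
import re

# Assumed pattern, derived only from the naming rule stated in this guide.
NAME_PATTERN = re.compile(r"[a-z0-9-]+")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if name uses only lowercase letters, digits, and hyphens."""
    return NAME_PATTERN.fullmatch(name) is not None


def suggest_dataset_name(raw: str) -> str:
    """Turn free text into a candidate name by lowercasing and hyphenating."""
    slug = re.sub(r"[^a-z0-9]+", "-", raw.lower())
    return slug.strip("-")
```

For example, `suggest_dataset_name("Support FAQ v2")` yields `support-faq-v2`, which passes `is_valid_dataset_name`.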

After creation, add content using one of the methods in Datasets Introduction: Upload, Manual Capture, Auto Capture, Import from HuggingFace, or Generate Synthetic.

  1. Open your dataset → Add Content → Upload
  2. Use drag-and-drop or click to browse
  3. Select your ground truth file (required) and optionally an examples file
  4. Files must be JSON or JSONL, max 10 MB each
  5. Upload and confirm

The system validates format, size, and required fields.
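
The extension and size rules can be checked locally before uploading. This is a rough sketch rather than FloTorch's own validation: the helper name is hypothetical, and it assumes "10 MB" means decimal megabytes, which is the stricter reading.

```python
from pathlib import Path
from typing import List

# Conservative reading of "10 MB" as decimal megabytes; whether FloTorch
# counts in binary or decimal units is an assumption.
MAX_UPLOAD_BYTES = 10 * 1000 * 1000
ALLOWED_SUFFIXES = {".json", ".jsonl"}


def precheck_upload(path: str) -> List[str]:
    """Return a list of problems found; an empty list means the file passes."""
    file = Path(path)
    problems = []
    if file.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"{file.name}: expected a .json or .jsonl extension")
    size = file.stat().st_size
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"{file.name}: {size} bytes exceeds the 10 MB limit")
    return problems
```

This covers only the file name and size. Required fields and JSON structure need a separate content check.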


The Datasets section lists all datasets in your workspace with:

  • Dataset name
  • Description
  • Type (RAG_EVALUATION or MODEL_EVALUATION)
  • Creation and update timestamps

You can narrow the list with:

  • Search: Find datasets by name or description
  • Type filter: Show only RAG_EVALUATION or MODEL_EVALUATION datasets
  • Pagination: Navigate through large lists

To download files for backup or review:

  1. Open the dataset
  2. In the files section, find the file you want
  3. Click Download next to the file

You can use downloads to back up data, share with others, or review offline.


Dataset deletion is not supported. This prevents accidental data loss and protects evaluation projects that reference a dataset.

If you no longer need a dataset:

  • Add “DEPRECATED” or a similar note to the description
  • Create a new dataset with updated data instead

“Invalid file type”
Ensure the file has a .json or .jsonl extension and contains valid JSON.

“File too large”
Reduce file size to under 10 MB. Split into multiple datasets or remove unnecessary fields if needed.

“Invalid file content”
Check that required fields are present and values match expected types.

“Question cannot be empty”
All question fields must contain non-empty strings.

“Invalid JSON on line X”
For JSONL files, ensure each line is a valid JSON object.

“File must contain a JSON array”
For JSON format, wrap your data in square brackets.
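
Several of the errors above can be reproduced locally before uploading. The sketch below mirrors the messages listed in this section but is not FloTorch's validator. The function names are hypothetical, the field name `question` is inferred from the “Question cannot be empty” message, and skipping blank JSONL lines is an assumption.

```python
import json
from typing import Any, List, Tuple


def load_records(text: str, is_jsonl: bool) -> Tuple[List[Any], List[str]]:
    """Parse file contents and return (records, errors)."""
    records: List[Any] = []
    errors: List[str] = []
    if is_jsonl:
        for number, line in enumerate(text.splitlines(), start=1):
            if not line.strip():
                continue  # assumption: blank lines are ignored
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                errors.append(f"Invalid JSON on line {number}")
                continue
            if not isinstance(record, dict):
                errors.append(f"Line {number} is not a JSON object")
                continue
            records.append(record)
        return records, errors
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return records, [f"Invalid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return records, ["File must contain a JSON array"]
    return data, errors


def check_questions(records: List[Any], field: str = "question") -> List[str]:
    """Flag records whose question field is missing, non-string, or blank."""
    errors = []
    for index, record in enumerate(records):
        value = record.get(field) if isinstance(record, dict) else None
        if not isinstance(value, str) or not value.strip():
            errors.append(f"Record {index}: {field} cannot be empty")
    return errors
```

Running `load_records` first and `check_questions` on its result gives a quick indication of which of the listed errors an upload is likely to trigger.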


  • Use clear, consistent naming for datasets
  • Add version info to descriptions (e.g., “v2 - updated March 2024”)
  • Download and back up important datasets regularly
  • Validate files locally before uploading
  • Start with a small sample to verify format
  • Add descriptions so your team understands each dataset’s purpose
  • Write clear descriptions for shared datasets
  • Use consistent naming conventions across the team
  • Notify team members when adding or changing content
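
One way to start with a small sample, as suggested above, is to cut the first few records out of a JSONL file and upload those first. A minimal sketch, with a hypothetical helper name:

```python
from typing import List


def head_sample(jsonl_text: str, count: int = 5) -> str:
    """Return the first `count` non-blank lines of a JSONL document."""
    lines: List[str] = [line for line in jsonl_text.splitlines() if line.strip()]
    # Re-terminate each kept line so the result is itself valid JSONL.
    return "".join(line + "\n" for line in lines[:count])
```

Write the returned text to a new `.jsonl` file and upload it to confirm the format is accepted before uploading the full file.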

  1. Go to Evaluations → Projects → Create Project
  2. Select your dataset during project setup
  3. For RAG_EVALUATION: select a Knowledge Base
  4. For MODEL_EVALUATION: no Knowledge Base needed
  5. Configure and run experiments

The evaluation system uses questions from your ground truth as inputs and compares model outputs to expected answers to generate metrics.
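
To illustrate the idea of comparing outputs to expected answers, here is a toy exact-match score. It is only an illustration: this guide does not describe the metrics FloTorch computes, the field names `question` and `answer` are assumptions, and `model` stands in for any callable that maps a question to an answer.

```python
from typing import Callable, Dict, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_rate(
    ground_truth: List[Dict[str, str]],
    model: Callable[[str], str],
) -> float:
    """Fraction of questions whose model output equals the expected answer."""
    if not ground_truth:
        return 0.0
    hits = sum(
        normalize(model(item["question"])) == normalize(item["answer"])
        for item in ground_truth
    )
    return hits / len(ground_truth)
```

Real evaluation metrics are usually more forgiving than exact matching, so treat this only as a picture of the input and expected-answer pairing.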


  • Dataset names cannot be changed after creation
  • Datasets cannot be deleted
  • Max 10 MB per uploaded file; 50 MB for a source PDF used with Generate Synthetic