Introduction
FloTorch Datasets are structured collections of data that enable you to evaluate, test, and improve your AI models and applications. Datasets provide a centralized way to manage test data, ground truth information, and evaluation examples that can be reused across multiple evaluation projects and experiments.
Dataset Types
FloTorch supports three types of datasets, each designed for specific evaluation scenarios:
RAG Evaluation
Purpose: Evaluate Retrieval-Augmented Generation (RAG) systems and pipelines.
Use Cases:
- Testing RAG retrieval accuracy
- Evaluating answer quality against ground truth
- Benchmarking different RAG configurations
- Comparing vector storage and embedding models
Required Files:
- Ground Truth (required): Questions and expected answers for evaluation
- Examples (optional): Few-shot examples to improve prompt quality
Model Evaluation
Purpose: Evaluate the performance and accuracy of FloTorch models.
Use Cases:
- Testing model accuracy against known answers
- Comparing different model versions
- Benchmarking model configurations
- Evaluating model behavior with different parameters
Required Files:
- Ground Truth (required): Test questions and expected responses
- Examples (optional): Few-shot examples for in-context learning
Chat
Purpose: Provide conversational data for testing chat-based applications.
Use Cases:
- Testing chatbot behavior
- Evaluating conversation flows
- Training or fine-tuning chat models
- Testing multi-turn conversations
Required Files:
- Messages (required): Conversation data with role-based messages
File Format Requirements
All dataset files must be in JSON or JSONL (newline-delimited JSON) format with a maximum file size of 10 MB.
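If you want to sanity-check a file locally before uploading, a minimal sketch like the one below (plain Python, not part of FloTorch) can confirm that the file stays under the 10 MB limit and parses as either JSON or JSONL; the file name is a placeholder.

import json
import os

MAX_BYTES = 10 * 1024 * 1024  # 10 MB upload limit

def load_dataset_file(path: str) -> list:
    """Parse a dataset file as JSON first, then fall back to JSONL."""
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError(f"{path} exceeds the 10 MB limit")
    with open(path, encoding="utf-8") as f:
        text = f.read()
    try:
        data = json.loads(text)          # JSON array or object wrapper
        return data if isinstance(data, list) else [data]
    except json.JSONDecodeError:
        # JSONL: one JSON object per non-empty line
        return [json.loads(line) for line in text.splitlines() if line.strip()]

records = load_dataset_file("ground-truth.json")   # placeholder file name
print(f"Parsed {len(records)} record(s)")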
Supported Formats
- JSON Array Format:

  [
    { "question": "What is the capital of France?", "answer": "Paris" },
    { "question": "What is 2+2?", "answer": "4" }
  ]

- JSONL Format (one JSON object per line):

  {"question": "What is the capital of France?", "answer": "Paris"}
  {"question": "What is 2+2?", "answer": "4"}

- JSON Object Wrapper (for examples files only):

  {"examples": [{ "question": "Example question?", "answer": "Example answer" }]}
File Type Specifications
Ground Truth File
Used for RAG Evaluation and Model Evaluation datasets.
Required Fields:
- question (string): The input question or prompt
- answer (string): The expected answer or response
Example:
[ { "question": "What is the capital of France?", "answer": "Paris" }, { "question": "What is the largest planet in our solar system?", "answer": "Jupiter" }]Additional fields: You can include additional fields for metadata, which will be preserved but not validated.
Examples File
Used for RAG Evaluation and Model Evaluation datasets (optional).
Required Fields:
- question (string): Example question
- answer (string): Example answer
Example:
{ "examples": [ { "question": "What is machine learning?", "answer": "Machine learning is a subset of artificial intelligence..." } ]}Messages File
Messages File
Used for Chat datasets.
Required Fields:
- role (string): Must be one of: user, assistant, or system
- content (string): The message content
Example:
[ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": "Hello, how are you?" }, { "role": "assistant", "content": "I'm doing well, thank you for asking!" }]Key Features
Key Features
Centralized Data Management
Section titled “Centralized Data Management”- Store evaluation data in one place
- Reuse datasets across multiple evaluation projects
- Version control through file replacement
- Easy download and sharing of dataset files
File Validation
FloTorch automatically validates all uploaded files to ensure:
- Correct JSON/JSONL format
- Required fields are present
- File size is within limits
- Schema compliance for each dataset type
Workspace-Level Organization
Section titled “Workspace-Level Organization”- Datasets are scoped to workspaces
- Share datasets with team members
- Unique naming within each workspace
- Role-based access control
Naming Requirements
Dataset names must follow these rules:
- Alphanumeric characters and dashes only (a-z, A-Z, 0-9, -)
- Must be unique within the workspace
- Cannot be changed after creation (to maintain referential integrity)
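A quick local check of these rules might look like the following; the regex is our reading of "alphanumeric characters and dashes only," not an official FloTorch pattern:

import re

# Our reading of the rule: letters, digits, and dashes only (not an official pattern)
NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")

def is_valid_dataset_name(name: str) -> bool:
    return bool(NAME_PATTERN.fullmatch(name))

print(is_valid_dataset_name("customer-support-qa"))   # True
print(is_valid_dataset_name("customer support qa"))   # False: spaces are not allowed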
Access Control
Dataset operations require specific workspace roles:
- List Datasets: Workspace Member or higher
- Create Dataset: Workspace Developer, Admin, or Org Admin
- Update Dataset: Workspace Developer, Admin, or Org Admin
- Upload Files: Workspace Developer, Admin, or Org Admin
- Download Files: Workspace Member or higher
Use in Evaluation Projects
Once created, datasets can be used in evaluation projects for:
- RAG Benchmarking: Test and compare RAG pipeline configurations
- Model Testing: Evaluate FloTorch model performance
- A/B Testing: Compare different model versions or configurations
- Quality Assurance: Ensure consistent model behavior
Datasets are referenced when creating evaluation projects, and the data is used to run experiments and generate evaluation metrics.
Best Practices
- Use descriptive names: Choose names that clearly indicate the dataset’s purpose (e.g., customer-support-qa, product-faq-v1)
- Include metadata: Add description fields to explain the dataset’s purpose and contents
- Keep files focused: Create separate datasets for different evaluation scenarios
- Version your data: Use naming conventions like -v1, -v2 to track dataset versions
- Validate locally first: Test your JSON/JSONL files with a validator before uploading
- Start small: Begin with a small dataset to ensure correct format, then scale up
- Document your schema: If using additional fields, document them for team members
- Regular updates: Keep datasets current by replacing files with updated versions
Limitations
- File Size: Maximum 10 MB per file
- File Format: Only JSON and JSONL formats supported
- Name Immutability: Dataset names cannot be changed after creation
- No Deletion: Datasets cannot be deleted (to prevent accidental data loss)
- File Replacement: Uploading a new file of the same type replaces the previous file