# Embeddings
Model Registry supports Embedding models for generating vector embeddings. Use them with the gateway’s POST /openai/v1/embeddings endpoint or when configuring vector storage (e.g., Knowledge Base, RAG).
## Creating an Embedding Model

- In the FloTorch Console, go to Model Registry and click Create FloTorch Model (top right).
- In the form, set Type to Embedding.
- Enter a Name and an optional Description. The name must be unique.
- Choose an embedding provider (e.g., OpenAI, Azure, Google, Cohere, Amazon Bedrock, OpenRouter, or OpenAI Compatible).
- Select the embedding model from that provider (e.g., text-embedding-3-small).
- Optionally add embedding model parameters (key/value pairs) if the provider supports them.
- Submit the form.
The embedding model is created with a default version and is ready to use immediately; no version configuration canvas or publish step is required. It appears in the Model Registry table with the type Embedding.
## Using Embedding Models

- Embeddings API – Call POST /openai/v1/embeddings with model: "flotorch/<your-embedding-model-name>" to generate embeddings.
- Vector storage – When creating or editing a vector storage (e.g., ChromaDB, Pinecone, PgVector, LanceDB), select your FloTorch embedding model so the gateway uses it to embed documents and queries.
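As a minimal sketch, the embeddings call above can be made with any HTTP client, since the endpoint is OpenAI-compatible. The gateway URL, API key, and model name below are placeholders for illustration — substitute your own values.

```python
# Sketch: calling the gateway's OpenAI-compatible embeddings endpoint.
import json
import urllib.request

GATEWAY_URL = "https://your-gateway.example.com"  # placeholder: your gateway host
API_KEY = "YOUR_API_KEY"                          # placeholder: your gateway credential


def build_embeddings_request(model_name: str, texts: list[str]) -> urllib.request.Request:
    """Build a POST /openai/v1/embeddings request for a FloTorch embedding model."""
    body = json.dumps({
        # FloTorch embedding models are referenced as "flotorch/<model-name>"
        "model": f"flotorch/{model_name}",
        "input": texts,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{GATEWAY_URL}/openai/v1/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


# To send the request (requires a reachable gateway):
# with urllib.request.urlopen(build_embeddings_request("my-embedder", ["hello world"])) as resp:
#     vectors = [item["embedding"] for item in json.load(resp)["data"]]
```

The response follows the OpenAI embeddings format, so each vector is found under data[i].embedding.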
## Embedding vs Chat Models

| Aspect | Chat | Embedding |
|---|---|---|
| Purpose | Chat completions | Vector embeddings |
| Version canvas | Yes (router, cache, guardrails) | No |
| Publish step | Required to use | Not required; usable after create |
| Code snippet in Console | Available for published chat models | Not shown (embedding models use the embeddings API) |