SDK: LLM
Overview
The LLM client provides a typed interface to FloTorch’s chat completion API with built-in logging and response parsing.
Exports:
- `FlotorchLLM`: synchronous and asynchronous inference (`invoke`, `ainvoke`)
- `LLMResponse`: structured result with `content` and `metadata`

Endpoint: `/api/openai/v1/chat/completions`
```python
from flotorch.sdk.llm import FlotorchLLM

API_KEY = "<your_api_key>"
BASE_URL = "https://gateway.flotorch.cloud"
MODEL_ID = "<your_flotorch_model_id>"  # e.g., a model from the FloTorch Model Registry

llm = FlotorchLLM(model_id=MODEL_ID, api_key=API_KEY, base_url=BASE_URL)
```

FlotorchLLM

```python
FlotorchLLM(
    model_id: str,
    api_key: str,
    base_url: str,
)
```

Creates an LLM client bound to a model and endpoint.
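In practice, credentials usually come from the environment rather than being hard-coded. A minimal sketch, assuming illustrative variable names (`FLOTORCH_MODEL_ID`, `FLOTORCH_API_KEY`, and `FLOTORCH_BASE_URL` are not an SDK convention):

```python
import os

from flotorch.sdk.llm import FlotorchLLM

# Variable names are illustrative; use whatever your deployment defines.
llm = FlotorchLLM(
    model_id=os.environ["FLOTORCH_MODEL_ID"],
    api_key=os.environ["FLOTORCH_API_KEY"],
    base_url=os.environ.get("FLOTORCH_BASE_URL", "https://gateway.flotorch.cloud"),
)
```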
invoke(messages, tools=None, response_format=None, extra_body=None, **kwargs) -> LLMResponse
Sends a chat completion request. Arguments map to OpenAI-style parameters plus FloTorch extensions:
- `messages`: list of `{role, content}` dicts
- `tools` (optional): OpenAI tool definitions (see the sketch below)
- `response_format` (optional): supports JSON schema via `convert_pydantic_to_custom_json_schema`
- `extra_body` (optional): merged into the request’s `extra_body` field
- `**kwargs`: additional parameters like `temperature`, `max_tokens`, `top_p`
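For example, `tools` accepts standard OpenAI function-tool definitions. A minimal sketch; the `get_weather` tool and its schema are hypothetical:

```python
# A hypothetical tool definition in the standard OpenAI format.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

resp = llm.invoke(
    [{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
# If the model chose a tool, resp.content is empty; the tool-call details
# live in the raw API response (see "Response shape" below).
```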
Returns `LLMResponse`:
```python
resp = llm.invoke(
    [
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Summarize FloTorch in one line."},
    ],
    temperature=0.3,
)

print(resp.content)
print(resp.metadata["totalTokens"])  # includes prompt/completion/total
```
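For structured output, `response_format` takes an OpenAI-style JSON-schema payload; you can hand-write the dict as below or generate it from a Pydantic model via `convert_pydantic_to_custom_json_schema`. The `summary` schema here is illustrative:

```python
import json

# Illustrative JSON schema in the OpenAI response_format convention.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "summary",
        "schema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
}

resp = llm.invoke(
    [{"role": "user", "content": "Summarize FloTorch as JSON."}],
    response_format=response_format,
)
data = json.loads(resp.content)  # content should now parse as JSON
print(data["text"])
```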
ainvoke(messages, tools=None, response_format=None, extra_body=None, **kwargs) -> LLMResponse

Asynchronous version of `invoke`.
```python
import asyncio

async def main():
    r = await llm.ainvoke([
        {"role": "user", "content": "A haiku about the sea."}
    ])
    print(r.content)

asyncio.run(main())
```
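Because `ainvoke` is a coroutine, several requests can be fanned out concurrently with standard `asyncio` tooling:

```python
import asyncio

async def batch():
    prompts = ["A haiku about the sea.", "A haiku about mountains."]
    # Issue all requests concurrently and collect the responses in order.
    results = await asyncio.gather(
        *(llm.ainvoke([{"role": "user", "content": p}]) for p in prompts)
    )
    for r in results:
        print(r.content)

asyncio.run(batch())
```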
Response shape

`LLMResponse` contains:
- `content: str` – final text content (empty if the model returned only `tool_calls`)
- `metadata: Dict[str, Any]` – includes `inputTokens`, `outputTokens`, `totalTokens`, and the raw API response under `raw_response`
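The token counts make simple usage accounting straightforward. A minimal sketch; the per-token prices are placeholders, not FloTorch rates:

```python
# Placeholder prices per 1K tokens; substitute your model's actual rates.
PRICE_IN_PER_1K = 0.0005
PRICE_OUT_PER_1K = 0.0015

resp = llm.invoke([{"role": "user", "content": "Summarize FloTorch in one line."}])
cost = (
    resp.metadata["inputTokens"] / 1000 * PRICE_IN_PER_1K
    + resp.metadata["outputTokens"] / 1000 * PRICE_OUT_PER_1K
)
print(f"{resp.metadata['totalTokens']} tokens, ~${cost:.6f}")
```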