Models

List LLM Models
models.list(**kwargs: ModelListParams) -> ModelListResponse
GET /v1/models/
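
A minimal usage sketch. The package name letta_client, the zero-argument constructor, and attribute access on the response items are assumptions; adjust to your installed SDK:

```python
from letta_client import Letta  # package name assumed

# Assumes the API key is picked up from the environment
# (e.g. LETTA_API_KEY); pass it explicitly if your SDK requires it.
client = Letta()

# GET /v1/models/ -- list the LLM models available to this account/server.
models = client.models.list()

# Response items are assumed to be Model objects as documented below.
for m in models:
    print(m.handle, m.provider_type, m.max_context_window)
```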
Models
class EmbeddingConfig:

Configuration for embedding model connection and processing parameters.

embedding_dim: int

The dimension of the embedding.

embedding_endpoint_type: Literal["openai", "anthropic", "bedrock", ...]

The endpoint type for the model.

Accepts one of the following:
"openai"
"anthropic"
"bedrock"
"google_ai"
"google_vertex"
"azure"
"groq"
"ollama"
"webui"
"webui-legacy"
"lmstudio"
"lmstudio-legacy"
"llamacpp"
"koboldcpp"
"vllm"
"hugging-face"
"mistral"
"together"
"pinecone"
embedding_model: str

The model for the embedding.

azure_deployment: Optional[str]

The Azure deployment for the model.

azure_endpoint: Optional[str]

The Azure endpoint for the model.

azure_version: Optional[str]

The Azure version for the model.

batch_size: Optional[int]

The maximum batch size for processing embeddings.

embedding_chunk_size: Optional[int]

The chunk size of the embedding.

embedding_endpoint: Optional[str]

The endpoint for the model (None if local).

handle: Optional[str]

The handle for this config, in the format provider/model-name.
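
For illustration, an EmbeddingConfig for a hosted OpenAI embedding model might look like the following. All values are examples, not defaults:

```python
# Illustrative EmbeddingConfig payload; field names follow the schema above.
embedding_config = {
    "embedding_endpoint_type": "openai",
    "embedding_model": "text-embedding-3-small",  # example model name
    "embedding_dim": 1536,
    "embedding_endpoint": "https://api.openai.com/v1",  # None if local
    "embedding_chunk_size": 300,  # example chunk size
    "batch_size": 128,            # example max batch size
    "handle": "openai/text-embedding-3-small",  # provider/model-name format
}
```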

class EmbeddingModel:
display_name: str

Display name for the model shown in the UI.

embedding_dim: int

The dimension of the embedding.

Deprecated embedding_endpoint_type: Literal["openai", "anthropic", "bedrock", ...]

Deprecated: Use 'provider_type' field instead. The endpoint type for the embedding model.

Accepts one of the following:
"openai"
"anthropic"
"bedrock"
"google_ai"
"google_vertex"
"azure"
"groq"
"ollama"
"webui"
"webui-legacy"
"lmstudio"
"lmstudio-legacy"
"llamacpp"
"koboldcpp"
"vllm"
"hugging-face"
"mistral"
"together"
"pinecone"
Deprecated embedding_model: str

Deprecated: Use 'name' field instead. Embedding model name.

name: str

The actual model name used by the provider.

provider_name: str

The name of the provider.

provider_type: ProviderType

The type of the provider.

Accepts one of the following:
"anthropic"
"azure"
"bedrock"
"cerebras"
"deepseek"
"google_ai"
"google_vertex"
"groq"
"hugging-face"
"letta"
"lmstudio_openai"
"mistral"
"ollama"
"openai"
"together"
"vllm"
"xai"
Deprecated azure_deployment: Optional[str]

Deprecated: The Azure deployment for the model.

Deprecated azure_endpoint: Optional[str]

Deprecated: The Azure endpoint for the model.

Deprecated azure_version: Optional[str]

Deprecated: The Azure version for the model.

Deprecated batch_size: Optional[int]

Deprecated: The maximum batch size for processing embeddings.

Deprecated embedding_chunk_size: Optional[int]

Deprecated: The chunk size of the embedding.

Deprecated embedding_endpoint: Optional[str]

Deprecated: The endpoint for the model.

handle: Optional[str]

The handle for this config, in the format provider/model-name.

model_type: Optional[Literal["embedding"]]

Type of model (llm or embedding).

Accepts one of the following:
"embedding"
class LlmConfig:

Configuration for Language Model (LLM) connection and generation parameters.

context_window: int

The context window size for the model.

model: str

LLM model name.

model_endpoint_type: Literal["openai", "anthropic", "google_ai", ...]

The endpoint type for the model.

Accepts one of the following:
"openai"
"anthropic"
"google_ai"
"google_vertex"
"azure"
"groq"
"ollama"
"webui"
"webui-legacy"
"lmstudio"
"lmstudio-legacy"
"lmstudio-chatcompletions"
"llamacpp"
"koboldcpp"
"vllm"
"hugging-face"
"mistral"
"together"
"bedrock"
"deepseek"
"xai"
compatibility_type: Optional[Literal["gguf", "mlx"]]

The framework compatibility type for the model.

Accepts one of the following:
"gguf"
"mlx"
display_name: Optional[str]

A human-friendly display name for the model.

enable_reasoner: Optional[bool]

Whether the model should use extended thinking if it is a 'reasoning'-style model.

frequency_penalty: Optional[float]

Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. From OpenAI: Number between -2.0 and 2.0.

handle: Optional[str]

The handle for this config, in the format provider/model-name.

max_reasoning_tokens: Optional[int]

Configurable thinking budget for extended thinking. Used for enable_reasoner and also for Google Vertex models like Gemini 2.5 Flash. Minimum value is 1024 when used with enable_reasoner.

max_tokens: Optional[int]

The maximum number of tokens to generate. If not set, the model will use its default value.

model_endpoint: Optional[str]

The endpoint for the model.

model_wrapper: Optional[str]

The wrapper for the model.

parallel_tool_calls: Optional[bool]

If set to True, enables parallel tool calling. Defaults to False.

provider_category: Optional[ProviderCategory]

The provider category for the model.

Accepts one of the following:
"base"
"byok"
provider_name: Optional[str]

The provider name for the model.

put_inner_thoughts_in_kwargs: Optional[bool]

If set to True, puts 'inner_thoughts' as a kwarg in the function call. This helps with function-calling performance and with generating inner thoughts.

reasoning_effort: Optional[Literal["minimal", "low", "medium", "high"]]

The reasoning effort to use when generating text with reasoning models.

Accepts one of the following:
"minimal"
"low"
"medium"
"high"
temperature: Optional[float]

The temperature to use when generating text with the model. A higher temperature will result in more random text.

tier: Optional[str]

The cost tier for the model (cloud only).

verbosity: Optional[Literal["low", "medium", "high"]]

Soft control for how verbose model output should be, used for GPT-5 models.

Accepts one of the following:
"low"
"medium"
"high"
class Model:
Deprecated context_window: int

Deprecated: Use 'max_context_window' field instead. The context window size for the model.

max_context_window: int

The maximum context window for the model.

Deprecated model: str

Deprecated: Use 'name' field instead. LLM model name.

Deprecated model_endpoint_type: Literal["openai", "anthropic", "google_ai", ...]

Deprecated: Use 'provider_type' field instead. The endpoint type for the model.

Accepts one of the following:
"openai"
"anthropic"
"google_ai"
"google_vertex"
"azure"
"groq"
"ollama"
"webui"
"webui-legacy"
"lmstudio"
"lmstudio-legacy"
"lmstudio-chatcompletions"
"llamacpp"
"koboldcpp"
"vllm"
"hugging-face"
"mistral"
"together"
"bedrock"
"deepseek"
"xai"
name: str

The actual model name used by the provider.

provider_type: ProviderType

The type of the provider.

Accepts one of the following:
"anthropic"
"azure"
"bedrock"
"cerebras"
"deepseek"
"google_ai"
"google_vertex"
"groq"
"hugging-face"
"letta"
"lmstudio_openai"
"mistral"
"ollama"
"openai"
"together"
"vllm"
"xai"
Deprecated compatibility_type: Optional[Literal["gguf", "mlx"]]

Deprecated: The framework compatibility type for the model.

Accepts one of the following:
"gguf"
"mlx"
display_name: Optional[str]

A human-friendly display name for the model.

Deprecated enable_reasoner: Optional[bool]

Deprecated: Whether the model should use extended thinking if it is a 'reasoning'-style model.

Deprecated frequency_penalty: Optional[float]

Deprecated: Positive values penalize new tokens based on their existing frequency in the text so far.

handle: Optional[str]

The handle for this config, in the format provider/model-name.

Deprecated max_reasoning_tokens: Optional[int]

Deprecated: Configurable thinking budget for extended thinking.

Deprecated max_tokens: Optional[int]

Deprecated: The maximum number of tokens to generate.

Deprecated model_endpoint: Optional[str]

Deprecated: The endpoint for the model.

model_type: Optional[Literal["llm"]]

Type of model (llm or embedding).

Accepts one of the following:
"llm"
Deprecated model_wrapper: Optional[str]

Deprecated: The wrapper for the model.

Deprecated parallel_tool_calls: Optional[bool]

Deprecated: If set to True, enables parallel tool calling.

Deprecated provider_category: Optional[ProviderCategory]

Deprecated: The provider category for the model.

Accepts one of the following:
"base"
"byok"
provider_name: Optional[str]

The provider name for the model.

Deprecated put_inner_thoughts_in_kwargs: Optional[bool]

Deprecated: Puts 'inner_thoughts' as a kwarg in the function call.

Deprecated reasoning_effort: Optional[Literal["minimal", "low", "medium", "high"]]

Deprecated: The reasoning effort to use when generating text with reasoning models.

Accepts one of the following:
"minimal"
"low"
"medium"
"high"
Deprecated temperature: Optional[float]

Deprecated: The temperature to use when generating text with the model.

Deprecated tier: Optional[str]

Deprecated: The cost tier for the model (cloud only).

Deprecated verbosity: Optional[Literal["low", "medium", "high"]]

Deprecated: Soft control for how verbose model output should be.

Accepts one of the following:
"low"
"medium"
"high"
ProviderCategory = Literal["base", "byok"]
Accepts one of the following:
"base"
"byok"
ProviderType = Literal["anthropic", "azure", "bedrock", ...]
Accepts one of the following:
"anthropic"
"azure"
"bedrock"
"cerebras"
"deepseek"
"google_ai"
"google_vertex"
"groq"
"hugging-face"
"letta"
"lmstudio_openai"
"mistral"
"ollama"
"openai"
"together"
"vllm"
"xai"

Embeddings

List Embedding Models
models.embeddings.list() -> EmbeddingListResponse
GET /v1/models/embedding
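
A minimal usage sketch, reusing the client from the example above (the response is assumed to be a sequence of EmbeddingModel objects):

```python
# GET /v1/models/embedding -- list the available embedding models.
embedding_models = client.models.embeddings.list()
for em in embedding_models:
    print(em.handle, em.embedding_dim)
```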