List LLM Models

models.list(**kwargs: ModelListParams) -> ModelListResponse
GET /v1/models/

List available LLM models using the asynchronous implementation for improved performance.

Returns Model format which extends LLMConfig with additional metadata fields. Legacy LLMConfig fields are marked as deprecated but still available for backward compatibility.
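Since both the new fields and their deprecated LLMConfig counterparts appear in responses, client code migrating off the legacy fields can prefer the new names and fall back to the old ones. A minimal sketch (the field shape is taken from the example response on this page; the fallback behavior is an assumption, not a guaranteed SDK contract):

```python
# Sketch: prefer the new Model fields over their deprecated LLMConfig
# counterparts. Field names come from the response schema on this page;
# the sample values below are illustrative.

def context_window_of(model: dict) -> int:
    # "max_context_window" supersedes the deprecated "context_window".
    return model.get("max_context_window") or model["context_window"]

def name_of(model: dict) -> str:
    # "name" supersedes the deprecated "model" field.
    return model.get("name") or model["model"]

sample = {
    "context_window": 8192,        # deprecated
    "max_context_window": 200000,  # preferred
    "model": "example-model",      # deprecated
    "name": "example-model",       # preferred
}

print(context_window_of(sample))
print(name_of(sample))
```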

Parameters
provider_category: Optional[List[ProviderCategory]]
Accepts one of the following:
"base"
"byok"
provider_name: Optional[str]
provider_type: Optional[ProviderType]
Accepts one of the following:
"anthropic"
"azure"
"bedrock"
"cerebras"
"deepseek"
"google_ai"
"google_vertex"
"groq"
"hugging-face"
"letta"
"lmstudio_openai"
"mistral"
"ollama"
"openai"
"together"
"vllm"
"xai"
Returns
ModelListResponse = List[Model]
context_window: int (Deprecated)

Deprecated: Use 'max_context_window' field instead. The context window size for the model.

max_context_window: int

The maximum context window for the model

model: str (Deprecated)

Deprecated: Use 'name' field instead. LLM model name.

model_endpoint_type: Literal["openai", "anthropic", "google_ai", ...] (Deprecated)

Deprecated: Use 'provider_type' field instead. The endpoint type for the model.

Accepts one of the following:
"openai"
"anthropic"
"google_ai"
"google_vertex"
"azure"
"groq"
"ollama"
"webui"
"webui-legacy"
"lmstudio"
"lmstudio-legacy"
"lmstudio-chatcompletions"
"llamacpp"
"koboldcpp"
"vllm"
"hugging-face"
"mistral"
"together"
"bedrock"
"deepseek"
"xai"
name: str

The actual model name used by the provider

provider_type: ProviderType

The type of the provider

Accepts one of the following:
"anthropic"
"azure"
"bedrock"
"cerebras"
"deepseek"
"google_ai"
"google_vertex"
"groq"
"hugging-face"
"letta"
"lmstudio_openai"
"mistral"
"ollama"
"openai"
"together"
"vllm"
"xai"
compatibility_type: Optional[Literal["gguf", "mlx"]] (Deprecated)

Deprecated: The framework compatibility type for the model.

Accepts one of the following:
"gguf"
"mlx"
display_name: Optional[str]

A human-friendly display name for the model.

enable_reasoner: Optional[bool] (Deprecated)

Deprecated: Whether or not the model should use extended thinking if it is a 'reasoning' style model.

frequency_penalty: Optional[float] (Deprecated)

Deprecated: Positive values penalize new tokens based on their existing frequency in the text so far.

handle: Optional[str]

The handle for this config, in the format provider/model-name.
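Because the handle is documented as `provider/model-name`, splitting on the first `/` recovers both parts. A small sketch (the handle value is illustrative):

```python
# Sketch: split a handle of the documented form "provider/model-name".
# The example handle below is illustrative, not taken from this page.
handle = "openai/gpt-4o-mini"
provider, _, model_name = handle.partition("/")
print(provider)
print(model_name)
```

`str.partition` splits on the first separator only, so model names that themselves contain `/` stay intact.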

max_reasoning_tokens: Optional[int] (Deprecated)

Deprecated: Configurable thinking budget for extended thinking.

max_tokens: Optional[int] (Deprecated)

Deprecated: The maximum number of tokens to generate.

model_endpoint: Optional[str] (Deprecated)

Deprecated: The endpoint for the model.

model_type: Optional[Literal["llm"]]

Type of model (llm or embedding)

Accepts one of the following:
"llm"
model_wrapper: Optional[str] (Deprecated)

Deprecated: The wrapper for the model.

parallel_tool_calls: Optional[bool] (Deprecated)

Deprecated: If set to True, enables parallel tool calling.

provider_category: Optional[ProviderCategory] (Deprecated)

Deprecated: The provider category for the model.

Accepts one of the following:
"base"
"byok"
provider_name: Optional[str]

The provider name for the model.

put_inner_thoughts_in_kwargs: Optional[bool] (Deprecated)

Deprecated: Puts 'inner_thoughts' as a kwarg in the function call.

reasoning_effort: Optional[Literal["minimal", "low", "medium", "high"]] (Deprecated)

Deprecated: The reasoning effort to use when generating text with reasoning models.

Accepts one of the following:
"minimal"
"low"
"medium"
"high"
temperature: Optional[float] (Deprecated)

Deprecated: The temperature to use when generating text with the model.

tier: Optional[str] (Deprecated)

Deprecated: The cost tier for the model (cloud only).

verbosity: Optional[Literal["low", "medium", "high"]] (Deprecated)

Deprecated: Soft control for how verbose model output should be.

Accepts one of the following:
"low"
"medium"
"high"
List LLM Models
from letta_client import Letta

client = Letta(
    api_key="My API Key",
)
models = client.models.list()
print(models)
[
  {
    "context_window": 0,
    "max_context_window": 0,
    "model": "model",
    "model_endpoint_type": "openai",
    "name": "name",
    "provider_type": "anthropic",
    "compatibility_type": "gguf",
    "display_name": "display_name",
    "enable_reasoner": true,
    "frequency_penalty": 0,
    "handle": "handle",
    "max_reasoning_tokens": 0,
    "max_tokens": 0,
    "model_endpoint": "model_endpoint",
    "model_type": "llm",
    "model_wrapper": "model_wrapper",
    "parallel_tool_calls": true,
    "provider_category": "base",
    "provider_name": "provider_name",
    "put_inner_thoughts_in_kwargs": true,
    "reasoning_effort": "minimal",
    "temperature": 0,
    "tier": "tier",
    "verbosity": "low"
  }
]