Azure OpenAI

You can use Letta with Azure OpenAI if you have an Azure OpenAI account and API key. Once you have set AZURE_API_KEY and AZURE_BASE_URL in your environment variables, you can select which model to use and configure the context window size.

Currently, Letta supports the following OpenAI models:

gpt-4 (recommended for advanced reasoning)
gpt-4o-mini (recommended for low latency and cost)
gpt-4o
gpt-4-turbo (not recommended; use gpt-4o-mini instead)
gpt-3.5-turbo (not recommended; use gpt-4o-mini instead)

Enabling Azure OpenAI with Docker

To enable Azure OpenAI models when running the Letta server with Docker, set your AZURE_API_KEY and AZURE_BASE_URL as environment variables:

# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e AZURE_API_KEY="your_azure_api_key" \
  -e AZURE_BASE_URL="your_azure_base_url" \
  -e AZURE_API_VERSION="your_azure_api_version" \
  letta/letta:latest

AZURE_API_VERSION is optional. If you omit it, the default is 2024-09-01-preview. To set the API version explicitly, pass:

-e AZURE_API_VERSION="2024-09-01-preview"

See the self-hosting guide for more information on running Letta with Docker.
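
As a minimal sketch, the Docker invocation above could be assembled from Python so that a missing variable fails before the container starts. The helper name `build_docker_command` and its `env` and `data_dir` parameters are hypothetical and not part of Letta; the flags simply mirror the command above, and the fallback API version is the default stated above.

```python
import os

# Variables this page says must be set; AZURE_API_VERSION is optional.
REQUIRED_VARS = ("AZURE_API_KEY", "AZURE_BASE_URL")
DEFAULT_API_VERSION = "2024-09-01-preview"  # default given on this page


def build_docker_command(env, data_dir="~/.letta/.persist/pgdata"):
    """Return the `docker run` argument list (hypothetical helper).

    Raises ValueError if a required Azure variable is missing or empty.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))

    # Fall back to the documented default when no version is given.
    api_version = env.get("AZURE_API_VERSION") or DEFAULT_API_VERSION
    return [
        "docker", "run",
        "-v", f"{os.path.expanduser(data_dir)}:/var/lib/postgresql/data",
        "-p", "8283:8283",
        "-e", f"AZURE_API_KEY={env['AZURE_API_KEY']}",
        "-e", f"AZURE_BASE_URL={env['AZURE_BASE_URL']}",
        "-e", f"AZURE_API_VERSION={api_version}",
        "letta/letta:latest",
    ]
```

To start the server, the returned list can be passed to `subprocess.run(build_docker_command(os.environ), check=True)`. Building an argument list rather than a shell string avoids quoting problems if a key or URL contains special characters.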

Specifying agent models

When creating agents on your self-hosted server, you must specify both the LLM and the embedding model to use, each via a handle. You can additionally specify a context window limit, which must be less than or equal to the model's maximum context window size.

from letta_client import Letta

# Connect to your self-hosted server
client = Letta(base_url="http://localhost:8283")

azure_agent = client.agents.create(
    model="azure/gpt-4o-mini",
    embedding="azure/text-embedding-3-small",  # An embedding model is required for self-hosted
    # optional configuration
    context_window_limit=16000
)
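
The constraints described above (both handles are required, and the context window limit must not exceed the maximum) can be checked locally before calling the server. The following is only a sketch: `agent_create_kwargs` is a hypothetical helper, the `provider/model-name` handle shape is inferred from the `azure/gpt-4o-mini` example, and `max_context_window` is a value the caller supplies rather than something looked up from Letta.

```python
def agent_create_kwargs(model, embedding, context_window_limit=None,
                        max_context_window=None):
    """Validate arguments and return kwargs for client.agents.create (hypothetical helper)."""
    # Assumption: handles look like "provider/model-name", as in "azure/gpt-4o-mini".
    for label, handle in (("model", model), ("embedding", embedding)):
        provider, sep, name = handle.partition("/")
        if not (provider and sep and name):
            raise ValueError(
                f"{label} handle must look like 'provider/model-name', got {handle!r}"
            )

    kwargs = {"model": model, "embedding": embedding}

    if context_window_limit is not None:
        if context_window_limit <= 0:
            raise ValueError("context_window_limit must be positive")
        # The maximum is caller-supplied; this helper does not know model limits.
        if max_context_window is not None and context_window_limit > max_context_window:
            raise ValueError(
                f"context_window_limit {context_window_limit} exceeds "
                f"maximum {max_context_window}"
            )
        kwargs["context_window_limit"] = context_window_limit

    return kwargs
```

With a connected client, the result can be unpacked into the call shown above, for example `client.agents.create(**agent_create_kwargs("azure/gpt-4o-mini", "azure/text-embedding-3-small", context_window_limit=16000))`.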