Self-hosting Letta

Running the Letta Server

To run the server with Docker, run the command:

# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e OPENAI_API_KEY="your_openai_api_key" \
  letta/letta:latest

This will run the Letta server with the OpenAI provider enabled, and store all data in the folder ~/.letta/.persist/pgdata.

If you have several LLM API keys, you can put them in a .env file and pass that file to docker run instead:

# using a .env file instead of passing environment variables
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  --env-file .env \
  letta/letta:latest
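
A .env file for docker run --env-file is a plain list of KEY=VALUE lines. The Python sketch below is a hypothetical pre-flight check (not part of Letta) that parses such a file and reports which of the provider variables named on this page have a value. It assumes Docker's env-file conventions, where lines starting with # are comments and values are taken literally, so surrounding quotes are not stripped.

```python
from pathlib import Path

# Provider variables mentioned on this page; extend the tuple as needed.
KNOWN_PROVIDER_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "OLLAMA_BASE_URL")


def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value  # kept literally, quotes included
    return values


def configured_providers(env: dict) -> list:
    """Return the known provider variables that have a non-empty value."""
    return [name for name in KNOWN_PROVIDER_VARS if env.get(name)]


def check_env_file(path: str = ".env") -> list:
    """Read an env file from disk and list the provider variables it sets."""
    return configured_providers(parse_env_text(Path(path).read_text()))
```

Calling check_env_file(".env") before starting the container is a cheap way to catch an empty or misspelled key name.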

Once the Letta server is running, you can access it via port 8283 (e.g. sending REST API requests to http://localhost:8283/v1). You can also connect your server to the Letta ADE to access and manage your agents in a web interface.
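
When scripting against a freshly started container, it can help to wait until port 8283 accepts connections before sending requests. The sketch below uses only the Python standard library; wait_for_port is a hypothetical helper, and an open port only shows that something is listening, not that the server has finished initializing.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 8283, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the port is accepting connections.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not ready yet; retry shortly
    return False
```

A script can call wait_for_port() right after docker run and abort with a clear message if it returns False.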

Enabling model providers

The Letta server can be connected to various LLM API backends (OpenAI, Anthropic, vLLM, Ollama, etc.). To enable access to these LLM API providers, set the appropriate environment variables when you use docker run:

# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e OPENAI_API_KEY="your_openai_api_key" \
  -e ANTHROPIC_API_KEY="your_anthropic_api_key" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434" \
  letta/letta:latest

The example above will make all compatible models running on OpenAI, Anthropic, and Ollama available to your Letta server.
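
If you launch the container from a script, the command can be assembled so that only providers you have actually configured are passed through. The sketch below is one way to do that in Python; build_docker_run is a hypothetical helper, and it relies on Docker's behavior of forwarding a variable from the calling environment when -e is given a name without a value, which keeps keys out of the command line.

```python
import os
import shlex

# Provider variables used in the example above.
PROVIDER_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "OLLAMA_BASE_URL")


def build_docker_run(env: dict, data_dir: str = "~/.letta/.persist/pgdata") -> list:
    """Build the docker run argument list, forwarding only provider variables that are set."""
    args = [
        "docker", "run",
        "-v", f"{os.path.expanduser(data_dir)}:/var/lib/postgresql/data",
        "-p", "8283:8283",
    ]
    for name in PROVIDER_VARS:
        if env.get(name):
            # "-e NAME" (no value) tells Docker to copy NAME from the caller's environment.
            args += ["-e", name]
    args.append("letta/letta:latest")
    return args


def as_shell(args: list) -> str:
    """Render the argument list as a single shell-quoted command string."""
    return shlex.join(args)
```

Passing build_docker_run(dict(os.environ)) to subprocess.run starts the server with whichever providers are set in the current shell.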

Configuring embedding models

When self-hosting, you must specify an embedding model when creating agents. Letta uses embeddings for archival memory search and retrieval.

Supported embedding providers

When creating agents on your self-hosted server, specify the embedding parameter:

Python:

from letta_client import Letta

# Connect to your self-hosted server
client = Letta(base_url="http://localhost:8283")

# Create agent with explicit embedding configuration
agent = client.agents.create(
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small",  # Required for self-hosted
    memory_blocks=[
        {"label": "persona", "value": "I am a helpful assistant."}
    ],
)

TypeScript:

import { LettaClient } from "@letta-ai/letta-client";

// Connect to your self-hosted server
const client = new LettaClient({
  baseUrl: "http://localhost:8283",
});

// Create agent with explicit embedding configuration
const agent = await client.agents.create({
  model: "openai/gpt-4o-mini",
  embedding: "openai/text-embedding-3-small", // Required for self-hosted
  memoryBlocks: [{ label: "persona", value: "I am a helpful assistant." }],
});

Available embedding models

The embedding model you can use depends on which provider you’ve configured:

OpenAI (requires OPENAI_API_KEY):

openai/text-embedding-3-small (recommended)
openai/text-embedding-3-large
openai/text-embedding-ada-002

Azure OpenAI (requires Azure configuration):

azure/text-embedding-3-small
azure/text-embedding-ada-002

Ollama (requires OLLAMA_BASE_URL):

ollama/mxbai-embed-large
ollama/nomic-embed-text
Any embedding model available in your Ollama instance
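
The mapping from configured provider to embedding handle can be captured in a small helper. This is a sketch only: pick_embedding is hypothetical, the default chosen for each provider is an assumption drawn from the lists above, and Azure is left out because this page does not name its configuration variables.

```python
# Provider variable -> embedding handle; the defaults are an assumption, not a Letta rule.
DEFAULT_EMBEDDINGS = {
    "OPENAI_API_KEY": "openai/text-embedding-3-small",
    "OLLAMA_BASE_URL": "ollama/nomic-embed-text",
}


def pick_embedding(env: dict) -> str:
    """Return an embedding handle for the first configured provider."""
    for variable, handle in DEFAULT_EMBEDDINGS.items():
        if env.get(variable):
            return handle
    raise ValueError(
        "No embedding provider configured; set one of: "
        + ", ".join(DEFAULT_EMBEDDINGS)
    )
```

The returned string can be passed as the embedding argument when creating an agent. For Ollama, the model still has to be available in your Ollama instance.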

Optional: Telemetry with ClickHouse

Letta supports optional telemetry using ClickHouse. Telemetry provides observability features like traces, LLM request logging, and performance metrics. See the telemetry guide for setup instructions.

Password protection

To password protect your server, include SECURE=true and LETTA_SERVER_PASSWORD=yourpassword in your docker run command:

# If LETTA_SERVER_PASSWORD isn't set, the server will autogenerate a password
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  --env-file .env \
  -e SECURE=true \
  -e LETTA_SERVER_PASSWORD=yourpassword \
  letta/letta:latest
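
Rather than a hand-picked placeholder such as yourpassword, you may prefer a randomly generated value. The sketch below uses Python's standard secrets module; generate_server_password and env_line are hypothetical helpers, and the 32-byte length is an arbitrary choice.

```python
import secrets


def generate_server_password(num_bytes: int = 32) -> str:
    """Return a URL-safe random string suitable for use as a server password."""
    return secrets.token_urlsafe(num_bytes)


def env_line(password: str) -> str:
    """Format the password as a line that can be appended to a .env file."""
    return f"LETTA_SERVER_PASSWORD={password}"
```

Appending env_line(generate_server_password()) to your .env file lets you drop the -e LETTA_SERVER_PASSWORD flag from the command above while keeping the password out of your shell history.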

With password protection enabled, every API request must include your password as a bearer token in the Authorization header:

TypeScript:

// install letta-client with `npm install @letta-ai/letta-client`
import { LettaClient } from "@letta-ai/letta-client";

// create the client with the token set to your password
const client = new LettaClient({
  baseUrl: "http://localhost:8283",
  token: "yourpassword",
});

Python:

# install letta_client with `pip install letta-client`
from letta_client import Letta

# create the client with the token set to your password
client = Letta(
    base_url="http://localhost:8283",
    token="yourpassword",
)

curl:

curl --request POST \
  --url http://localhost:8283/v1/agents/$AGENT_ID/messages \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer yourpassword' \
  --data '{
  "messages": [
    {
      "role": "user",
      "content": "hows it going????"
    }
  ]
}'
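
If you are not using the Letta client libraries, the same Authorization header can be attached with the Python standard library. The sketch below only builds the request and does not send it; build_request is a hypothetical helper, and the path and payload are whatever your API call requires.

```python
import json
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str, password: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request carrying the password as a bearer token."""
    headers = {"Authorization": f"Bearer {password}"}
    data = None
    if payload is not None:
        # JSON bodies are sent as POST with an explicit content type.
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        headers=headers,
        method="POST" if payload is not None else "GET",
    )
```

Sending it is then a matter of calling urllib.request.urlopen on the returned object, with the payload set to the same JSON body used in the curl example.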

Tool sandboxing

To enable tool sandboxing, set the E2B_API_KEY and E2B_SANDBOX_TEMPLATE_ID environment variables (both values come from E2B) when you use docker run. When sandboxing is enabled, all custom tools (created by users from source code) will be executed in a sandboxed environment.

This does not include MCP tools, which are executed outside of the Letta server (on the MCP server itself), or built-in tools (like memory_insert), whose code cannot be modified after server startup.
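
The rules above can be summarized as a small decision function. This is an illustrative sketch only: the tool kind labels ("custom", "mcp", "builtin") are hypothetical names introduced here, not Letta identifiers, and the code assumes that both E2B variables must be set for sandboxing to be active.

```python
# Both variables named in this section; assumed to be required together.
SANDBOX_VARS = ("E2B_API_KEY", "E2B_SANDBOX_TEMPLATE_ID")

# Hypothetical labels for the three kinds of tools described above.
TOOL_KINDS = {"custom", "mcp", "builtin"}


def sandboxing_configured(env: dict) -> bool:
    """True when every E2B variable has a non-empty value."""
    return all(env.get(name) for name in SANDBOX_VARS)


def runs_in_sandbox(tool_kind: str, env: dict) -> bool:
    """Only custom tools are sandboxed, and only when E2B is configured."""
    if tool_kind not in TOOL_KINDS:
        raise ValueError(f"unknown tool kind: {tool_kind!r}")
    return tool_kind == "custom" and sandboxing_configured(env)
```

With both variables set, runs_in_sandbox("custom", env) is True, while MCP and built-in tools return False regardless of configuration.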