Self-hosting Letta

To run the Letta server with Docker, use the following command:

# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
-v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
-p 8283:8283 \
-e OPENAI_API_KEY="your_openai_api_key" \
letta/letta:latest

This will run the Letta server with the OpenAI provider enabled, and store all data in the folder ~/.letta/.persist/pgdata.

If you have several LLM API keys, you can put them in a .env file and pass that file to docker run instead:

# using a .env file instead of passing environment variables
docker run \
-v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
-p 8283:8283 \
--env-file .env \
letta/letta:latest
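A minimal sketch of generating such a .env file from Python, assuming the same variable names used on this page. `render_env_file` is a hypothetical helper (not part of Letta), and the key values are placeholders.

```python
def render_env_file(variables: dict[str, str]) -> str:
    """Render KEY=value lines, one per variable, for `docker run --env-file`."""
    lines = []
    for key, value in variables.items():
        if "\n" in value:
            raise ValueError(f"{key}: value must fit on a single line")
        # No quoting here: Docker reads the text after the first `=` verbatim.
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values; redirect the output to .env to use it with docker run.
    print(render_env_file({
        "OPENAI_API_KEY": "your_openai_api_key",
        "ANTHROPIC_API_KEY": "your_anthropic_api_key",
    }), end="")
```

Values are written without surrounding quotes because `docker run --env-file` does not strip them the way a shell would.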

Once the Letta server is running, you can access it via port 8283 (e.g. sending REST API requests to http://localhost:8283/v1). You can also connect your server to the Letta ADE to access and manage your agents in a web interface.
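Before sending API requests, it can help to confirm that something is listening on the published port. The sketch below only checks that a TCP connection to port 8283 succeeds. It does not verify that the listener is actually Letta, and `server_is_listening` is a hypothetical helper name.

```python
import socket


def server_is_listening(host: str = "localhost", port: int = 8283,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("Port 8283 reachable:", server_is_listening())
```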

The Letta server can be connected to various LLM API backends (OpenAI, Anthropic, vLLM, Ollama, etc.). To enable access to these LLM API providers, set the appropriate environment variables when you use docker run:

# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
-v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
-p 8283:8283 \
-e OPENAI_API_KEY="your_openai_api_key" \
-e ANTHROPIC_API_KEY="your_anthropic_api_key" \
-e OLLAMA_BASE_URL="http://host.docker.internal:11434" \
letta/letta:latest

The example above will make all compatible models running on OpenAI, Anthropic, and Ollama available to your Letta server.
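If you launch the container from a script, the same command can be assembled programmatically. This is a sketch under the assumption that the flags are exactly those shown above. `build_docker_run` is a hypothetical helper, and it only builds the argument list without running Docker.

```python
import os
import shlex


def build_docker_run(data_dir: str, env: dict[str, str],
                     image: str = "letta/letta:latest") -> list[str]:
    """Build the `docker run` argument list used in the examples above."""
    args = [
        "docker", "run",
        "-v", f"{data_dir}:/var/lib/postgresql/data",
        "-p", "8283:8283",
    ]
    # One -e flag per provider variable, in the order given.
    for key, value in env.items():
        args += ["-e", f"{key}={value}"]
    args.append(image)
    return args


if __name__ == "__main__":
    cmd = build_docker_run(
        # Expand ~ here, since no shell will do it for an argument list.
        os.path.expanduser("~/.letta/.persist/pgdata"),
        {
            "OPENAI_API_KEY": "your_openai_api_key",
            "OLLAMA_BASE_URL": "http://host.docker.internal:11434",
        },
    )
    print(shlex.join(cmd))  # pass `cmd` to subprocess.run to actually start it
```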

When self-hosting, you must specify an embedding model when creating agents. Letta uses embeddings for archival memory search and retrieval.

To do so, pass the embedding parameter when you create an agent:

from letta_client import Letta

# Connect to your self-hosted server
client = Letta(base_url="http://localhost:8283")

# Create agent with explicit embedding configuration
agent = client.agents.create(
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small",  # Required for self-hosted
    memory_blocks=[
        {"label": "persona", "value": "I am a helpful assistant."}
    ],
)

Which embedding models you can use depends on the providers you have configured:

OpenAI (requires OPENAI_API_KEY):

  • openai/text-embedding-3-small (recommended)
  • openai/text-embedding-3-large
  • openai/text-embedding-ada-002

Azure OpenAI (requires Azure configuration):

  • azure/text-embedding-3-small
  • azure/text-embedding-ada-002

Ollama (requires OLLAMA_BASE_URL):

  • ollama/mxbai-embed-large
  • ollama/nomic-embed-text
  • Any embedding model available in your Ollama instance
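The mapping above can be expressed as a small lookup that reports which of the listed embedding handles are plausible for a given environment. This is an illustrative sketch, not Letta's own logic. `candidate_embeddings` is a hypothetical helper, and Azure is omitted because this page does not name its configuration variables.

```python
import os

# Provider variable -> embedding handles listed on this page.
EMBEDDINGS_BY_ENV_VAR = {
    "OPENAI_API_KEY": [
        "openai/text-embedding-3-small",
        "openai/text-embedding-3-large",
        "openai/text-embedding-ada-002",
    ],
    "OLLAMA_BASE_URL": [
        "ollama/mxbai-embed-large",
        "ollama/nomic-embed-text",
    ],
}


def candidate_embeddings(env: dict[str, str]) -> list[str]:
    """Return the listed handles whose provider variable is set and non-empty."""
    handles = []
    for var, models in EMBEDDINGS_BY_ENV_VAR.items():
        if env.get(var):
            handles.extend(models)
    return handles


if __name__ == "__main__":
    print(candidate_embeddings(dict(os.environ)))
```

For Ollama, the result is only a starting point: any embedding model pulled into your Ollama instance can be used as well.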

Letta supports optional telemetry using ClickHouse. Telemetry provides observability features like traces, LLM request logging, and performance metrics. See the telemetry guide for setup instructions.

To password protect your server, include SECURE=true and LETTA_SERVER_PASSWORD=yourpassword in your docker run command:

# If LETTA_SERVER_PASSWORD isn't set, the server will autogenerate a password
docker run \
-v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
-p 8283:8283 \
--env-file .env \
-e SECURE=true \
-e LETTA_SERVER_PASSWORD=yourpassword \
letta/letta:latest

With password protection enabled, you must send your password as a bearer token with every API request:

// install letta-client with `npm install @letta-ai/letta-client`
import { LettaClient } from "@letta-ai/letta-client";

// create the client with the token set to your password
const client = new LettaClient({
  baseUrl: "http://localhost:8283",
  token: "yourpassword",
});
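If you call the REST API without an SDK, the password goes into the request yourself. The sketch below assumes the conventional `Authorization: Bearer <token>` header form, and the `/v1/agents` path is used only as an example URL. `authorized_request` is a hypothetical helper that builds the request without sending it.

```python
import urllib.request


def authorized_request(url: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the password as a bearer token."""
    request = urllib.request.Request(url)
    # Assumption: the server expects the standard bearer scheme.
    request.add_header("Authorization", f"Bearer {password}")
    return request


if __name__ == "__main__":
    req = authorized_request("http://localhost:8283/v1/agents", "yourpassword")
    print(req.full_url, req.get_header("Authorization"))
    # To send it: urllib.request.urlopen(req)
```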

To enable tool sandboxing, set the E2B_API_KEY and E2B_SANDBOX_TEMPLATE_ID environment variables (both obtained from E2B) when you use docker run. With sandboxing enabled, all custom tools (those created by users from source code) are executed in a sandboxed environment.

This does not include MCP tools, which are executed outside of the Letta server (on the MCP server itself), or built-in tools (like memory_insert), whose code cannot be modified after server startup.
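The rule described above can be summarized as a small decision function. This is an illustration of the stated behavior only, not Letta's implementation. `ToolKind`, `sandboxing_enabled`, and `runs_in_sandbox` are hypothetical names.

```python
from enum import Enum


class ToolKind(Enum):
    CUSTOM = "custom"      # created by users from source code
    MCP = "mcp"            # executed on the MCP server itself
    BUILT_IN = "built_in"  # e.g. memory_insert


REQUIRED_VARS = ("E2B_API_KEY", "E2B_SANDBOX_TEMPLATE_ID")


def sandboxing_enabled(env: dict[str, str]) -> bool:
    """Sandboxing is on only when both E2B variables are set and non-empty."""
    return all(env.get(name) for name in REQUIRED_VARS)


def runs_in_sandbox(kind: ToolKind, env: dict[str, str]) -> bool:
    """Only custom tools are sandboxed, and only when sandboxing is enabled."""
    return kind is ToolKind.CUSTOM and sandboxing_enabled(env)


if __name__ == "__main__":
    env = {"E2B_API_KEY": "key", "E2B_SANDBOX_TEMPLATE_ID": "template"}
    for kind in ToolKind:
        print(kind.value, runs_in_sandbox(kind, env))
```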