# Self-hosting Letta
## Running the Letta Server

To run the server with Docker, run the command:

```sh
# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e OPENAI_API_KEY="your_openai_api_key" \
  letta/letta:latest
```

This will run the Letta server with the OpenAI provider enabled, and store all data in the folder `~/.letta/.persist/pgdata`.
If you have many different LLM API keys, you can also set up a `.env` file instead and pass it to `docker run`:

```sh
# using a .env file instead of passing environment variables
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  --env-file .env \
  letta/letta:latest
```

Once the Letta server is running, you can access it via port 8283 (e.g. sending REST API requests to `http://localhost:8283/v1`). You can also connect your server to the Letta ADE to access and manage your agents in a web interface.
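For example, a `.env` file holding several provider credentials might look like the following (the variable names are the ones used elsewhere in this guide; the values are placeholders):

```sh
# .env — example provider configuration (placeholder values)
OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
OLLAMA_BASE_URL=http://host.docker.internal:11434
```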
## Enabling model providers

The Letta server can be connected to various LLM API backends (OpenAI, Anthropic, vLLM, Ollama, etc.). To enable access to these LLM API providers, set the appropriate environment variables when you use `docker run`:

```sh
# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e OPENAI_API_KEY="your_openai_api_key" \
  -e ANTHROPIC_API_KEY="your_anthropic_api_key" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434" \
  letta/letta:latest
```

The example above will make all compatible models running on OpenAI, Anthropic, and Ollama available to your Letta server.
## Configuring embedding models

When self-hosting, you must specify an embedding model when creating agents. Letta uses embeddings for archival memory search and retrieval.

### Supported embedding providers

When creating agents on your self-hosted server, specify the embedding parameter:
```python
from letta_client import Letta

# Connect to your self-hosted server
client = Letta(base_url="http://localhost:8283")

# Create agent with explicit embedding configuration
agent = client.agents.create(
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small",  # Required for self-hosted
    memory_blocks=[{"label": "persona", "value": "I am a helpful assistant."}],
)
```

```typescript
import { LettaClient } from "@letta-ai/letta-client";

// Connect to your self-hosted server
const client = new LettaClient({
  baseUrl: "http://localhost:8283",
});

// Create agent with explicit embedding configuration
const agent = await client.agents.create({
  model: "openai/gpt-4o-mini",
  embedding: "openai/text-embedding-3-small", // Required for self-hosted
  memoryBlocks: [{ label: "persona", value: "I am a helpful assistant." }],
});
```

### Available embedding models

The embedding models you can use depend on which provider you've configured:
**OpenAI** (requires `OPENAI_API_KEY`):

- `openai/text-embedding-3-small` (recommended)
- `openai/text-embedding-3-large`
- `openai/text-embedding-ada-002`

**Azure OpenAI** (requires Azure configuration):

- `azure/text-embedding-3-small`
- `azure/text-embedding-ada-002`

**Ollama** (requires `OLLAMA_BASE_URL`):

- `ollama/mxbai-embed-large`
- `ollama/nomic-embed-text`
- Any embedding model available in your Ollama instance
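Model and embedding handles above all follow a `provider/model-name` pattern. As a minimal sketch (plain Python, no Letta dependency — the helper name is illustrative, not part of the `letta_client` API), a handle splits into its provider and model parts like this:

```python
# Split a Letta-style handle of the form "provider/model-name".
# Illustrative helper only — not part of the letta_client API.
def split_handle(handle: str) -> tuple[str, str]:
    provider, _, model = handle.partition("/")
    if not provider or not model:
        raise ValueError(f"expected 'provider/model' format, got: {handle!r}")
    return provider, model

print(split_handle("openai/text-embedding-3-small"))
# → ('openai', 'text-embedding-3-small')
```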
## Optional: Telemetry with ClickHouse

Letta supports optional telemetry using ClickHouse. Telemetry provides observability features like traces, LLM request logging, and performance metrics. See the telemetry guide for setup instructions.
## Password protection

To password protect your server, include `SECURE=true` and `LETTA_SERVER_PASSWORD=yourpassword` in your `docker run` command:

```sh
# If LETTA_SERVER_PASSWORD isn't set, the server will autogenerate a password
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  --env-file .env \
  -e SECURE=true \
  -e LETTA_SERVER_PASSWORD=yourpassword \
  letta/letta:latest
```

With password protection enabled, you will have to provide your password as the bearer token in your API requests:
```typescript
// install letta-client with `npm install @letta-ai/letta-client`
import { LettaClient } from "@letta-ai/letta-client";

// create the client with the token set to your password
const client = new LettaClient({
  baseUrl: "http://localhost:8283",
  token: "yourpassword",
});
```

```python
# install letta_client with `pip install letta-client`
from letta_client import Letta

# create the client with the token set to your password
client = Letta(
    base_url="http://localhost:8283",
    token="yourpassword",
)
```

```sh
curl --request POST \
  --url http://localhost:8283/v1/agents/$AGENT_ID/messages \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer yourpassword' \
  --data '{
    "messages": [
      {
        "role": "user",
        "text": "hows it going????"
      }
    ]
  }'
```

## Tool sandboxing
To enable tool sandboxing, set the `E2B_API_KEY` and `E2B_SANDBOX_TEMPLATE_ID` environment variables (via E2B) when you use `docker run`.

When sandboxing is enabled, all custom tools (created by users from source code) are executed in a sandboxed environment. This does not include MCP tools, which are executed outside of the Letta server (on the MCP server itself), or built-in tools (like `memory_insert`), whose code cannot be modified after server startup.
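For context, a custom tool is ordinary Python source: a function with a docstring that you register with your server. The sketch below is a hypothetical example of such user-supplied code (the function itself is an assumption, not from this guide); with E2B sandboxing enabled, source like this runs inside the sandbox rather than in the Letta server process.

```python
# Hypothetical custom tool: a plain Python function with a docstring.
# With E2B sandboxing enabled, user-supplied source like this executes
# in the sandbox rather than in the Letta server process.
def word_count(text: str) -> int:
    """Count the number of whitespace-separated words in `text`.

    Args:
        text: The input string to count words in.

    Returns:
        The number of words.
    """
    return len(text.split())

print(word_count("hows it going????"))
# → 3
```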