Ollama
Setup Ollama
- Download and install Ollama
- Download a model to test with by running `ollama pull <MODEL_NAME>` in the terminal (check the Ollama model library for available models)
For example, if we want to use Dolphin 2.2.1 Mistral, we can download it by running:
```sh
# Let's use the q6_K variant
ollama pull dolphin2.2-mistral:7b-q6_K
pulling manifest
pulling d8a5ee4aba09... 100% |████████████████████████████████████████| (4.1/4.1 GB, 20 MB/s)
pulling a47b02e00552... 100% |████████████████████████████████████████| (106/106 B, 77 B/s)
pulling 9640c2212a51... 100% |████████████████████████████████████████| (41/41 B, 22 B/s)
pulling de6bcd73f9b4... 100% |████████████████████████████████████████| (58/58 B, 28 B/s)
pulling 95c3d8d4429f... 100% |████████████████████████████████████████| (455/455 B, 330 B/s)
verifying sha256 digest
writing manifest
removing any unused layers
success
```
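Before pointing Letta at the model, it can help to confirm the pull succeeded and that the local Ollama server responds. A quick check from the terminal (the prompt text below is just an example):

```sh
# list the models Ollama has available locally; the tag you pulled should appear
ollama list

# send a one-off prompt through Ollama's generate endpoint (served on port 11434 by default)
curl http://localhost:11434/api/generate -d '{
  "model": "dolphin2.2-mistral:7b-q6_K",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'
```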
Enabling Ollama with Docker
To enable Ollama models when running the Letta server with Docker, set the `OLLAMA_BASE_URL` environment variable.
macOS/Windows:
Since Ollama runs on the host machine rather than inside the Letta container, use `host.docker.internal` instead of `localhost` to reach the Ollama server from the container.
```sh
# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434" \
  letta/letta:latest
```

Linux:
Use `--network host` and `localhost`:
```sh
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  --network host \
  -e OLLAMA_BASE_URL="http://localhost:11434" \
  letta/letta:latest
```

See the self-hosting guide for more information on running Letta with Docker.
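If the server starts but no Ollama models show up for your agents, two quick things to check (a general troubleshooting suggestion, not Letta-specific tooling) are whether Ollama is actually answering on the URL you passed in, and whether the container logged any connection errors:

```sh
# confirm Ollama is serving on the URL passed via OLLAMA_BASE_URL
# (on macOS/Windows the container reaches this same server as host.docker.internal)
curl http://localhost:11434/api/tags

# follow the Letta container logs and look for connection errors
# (replace <container-id> with the ID shown by `docker ps`)
docker logs -f <container-id>
```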
Specifying agent models
When creating agents on your self-hosted server, you must specify both the LLM and embedding models to use via a handle. You can additionally specify a context window limit (which must be less than or equal to the model's maximum context window size).
```python
from letta_client import Letta
import os

# Connect to your self-hosted server
client = Letta(base_url="http://localhost:8283")

ollama_agent = client.agents.create(
    model="ollama/thewindmom/hermes-3-llama-3.1-8b:latest",
    # an embedding model is required for self-hosted servers
    embedding="ollama/mxbai-embed-large",
    # optional configuration
    context_window_limit=16000,
)
```
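Once the agent is created, you can talk to it like any other Letta agent. A minimal sketch, assuming the `client.agents.messages.create(...)` call from the `letta_client` SDK (adjust to your SDK version) and reusing the `ollama_agent` handle from above:

```python
# send a test message to the Ollama-backed agent created above
# (sketch: assumes the messages API of the letta_client SDK)
response = client.agents.messages.create(
    agent_id=ollama_agent.id,
    messages=[{"role": "user", "content": "Hello! Which model are you running on?"}],
)

# print whatever the agent sent back (reasoning, tool calls, and assistant messages)
for message in response.messages:
    print(message)
```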