Building Stateful Agents with Letta

Letta agents can automatically manage long-term memory, load data from external sources, and call custom tools. Unlike agents in many other frameworks, Letta agents are stateful: they keep track of past interactions and reserve part of their context window for memories that they read and write, and that evolve over time.

Letta manages a reasoning loop for agents. At each agent step (i.e. iteration of the loop), the state of the agent is checkpointed and persisted to the database.
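To make the checkpointing idea concrete, here is a minimal toy sketch of a loop that persists state after every step. It is not Letta's implementation: the `checkpoints` table, the `run_agent_steps` function, and the `count_to_two` step function are all hypothetical, and an in-memory SQLite database stands in for the real database.

```python
import json
import sqlite3


def run_agent_steps(conn, agent_id, state, step_fn, max_steps=3):
    """Toy reasoning loop: call step_fn repeatedly and persist state after every step."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints (agent_id TEXT, step INTEGER, state TEXT)"
    )
    for step in range(max_steps):
        state, done = step_fn(state)
        # Checkpoint after each iteration so a crash loses at most one step.
        conn.execute(
            "INSERT INTO checkpoints VALUES (?, ?, ?)",
            (agent_id, step, json.dumps(state)),
        )
        conn.commit()
        if done:
            break
    return state


def count_to_two(state):
    # Stand-in for an LLM call plus tool execution; purely illustrative.
    new_state = {"counter": state["counter"] + 1}
    return new_state, new_state["counter"] >= 2


conn = sqlite3.connect(":memory:")
final_state = run_agent_steps(conn, "agent-demo", {"counter": 0}, count_to_two)
rows = conn.execute("SELECT step, state FROM checkpoints ORDER BY step").fetchall()
```

Because every step writes a row, the latest checkpoint always matches the state the loop would resume from.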

You can interact with agents from a REST API, the ADE, and TypeScript / Python SDKs. As long as they are connected to the same service, all of these interfaces can be used to interact with the same agents.

In Letta, you can think of an agent as a single entity with a single message history that is treated as infinite. The sequence of interactions the agent has experienced over its existence makes up the agent’s state (or memory).

One distinction between Letta and other agent frameworks is that Letta does not have the notion of message threads (or sessions). Instead, there are only stateful agents, which have a single perpetual thread (sequence of messages).

We use the term agent rather than thread because Letta is based on the principle that all agent interactions should be part of the persistent memory, as opposed to building agent applications around ephemeral, short-lived interactions (like a thread or session).

%%{init: {'flowchart': {'rankDir': 'LR'}}}%%
flowchart LR
    subgraph Traditional["Thread-Based Agents"]
        direction TB
        llm1[LLM] --> thread1["Thread 1
        --------
        Ephemeral
        Session"]
        llm1 --> thread2["Thread 2
        --------
        Ephemeral
        Session"]
        llm1 --> thread3["Thread 3
        --------
        Ephemeral
        Session"]
    end

    Traditional ~~~ Letta

    subgraph Letta["Letta Stateful Agents"]
        direction TB
        llm2[LLM] --> agent["Single Agent
        --------
        Persistent Memory"]
        agent --> db[(PostgreSQL)]
        db -->|"Learn & Update"| agent
    end

    class thread1,thread2,thread3 session
    class agent agent

If you would like to create common starting points for new conversation “threads”, we recommend using agent templates to create a new agent for each conversation, or directly copying agent state from an existing agent.

For multi-user applications, we recommend creating one agent per user. You can also have multiple users message a single agent, but they will all share a single message history.
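One way to apply the agent-per-user recommendation is to keep a mapping from your application's user IDs to agent IDs and create agents lazily. The sketch below is an assumption about how you might structure this, not part of Letta: `PerUserAgents` and `fake_create_agent` are hypothetical names, and the creator callback stands in for a real call that creates an agent and returns its ID.

```python
from typing import Callable, Dict, List


class PerUserAgents:
    """Maps each application user to their own agent ID, creating agents on first use."""

    def __init__(self, create_agent: Callable[[str], str]):
        # create_agent is a hypothetical callback that creates an agent
        # (for example via POST /v1/agents) and returns the new agent's ID.
        self._create_agent = create_agent
        self._agent_ids: Dict[str, str] = {}

    def agent_for(self, user_id: str) -> str:
        if user_id not in self._agent_ids:
            self._agent_ids[user_id] = self._create_agent(user_id)
        return self._agent_ids[user_id]


# Fake creator used only for this demonstration; it makes no network calls.
created: List[str] = []


def fake_create_agent(user_id: str) -> str:
    created.append(user_id)
    return f"agent-for-{user_id}"


registry = PerUserAgents(fake_create_agent)
alice_first = registry.agent_for("alice")
alice_again = registry.agent_for("alice")
bob = registry.agent_for("bob")
```

In a real application the mapping would need to live in your own database rather than in process memory, so that each user keeps reaching the same agent across restarts.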

You can create a new agent via the REST API, Python SDK, or TypeScript SDK:

Terminal window
curl -X POST https://api.letta.com/v1/agents \
  -H "Authorization: Bearer $LETTA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "memory_blocks": [
      {
        "value": "The human'\''s name is Bob the Builder.",
        "label": "human"
      },
      {
        "value": "My name is Sam, the all-knowing sentient AI.",
        "label": "persona"
      }
    ],
    "model": "openai/gpt-4o-mini",
    "context_window_limit": 16000
  }'

You can also create an agent without any code using the Agent Development Environment (ADE). All Letta agents are stored in a database on the Letta server, so you can access the same agents from the ADE, the REST API, the Python SDK, and the TypeScript SDK.

The response will include information about the agent, including its id:

{
"id": "agent-43f8e098-1021-4545-9395-446f788d7389",
"name": "GracefulFirefly",
...
}
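For readers who prefer Python, the same create request can be sketched with only the standard library. This mirrors the curl call above rather than the Letta Python SDK; `build_create_agent_request` and `extract_agent_id` are hypothetical helper names, and the request is only constructed here, not sent.

```python
import json
import os
import urllib.request

API_BASE = "https://api.letta.com"  # same host as the curl example


def build_create_agent_request(api_key: str) -> urllib.request.Request:
    """Builds (but does not send) the POST /v1/agents request from the curl example."""
    payload = {
        "memory_blocks": [
            {"label": "human", "value": "The human's name is Bob the Builder."},
            {"label": "persona", "value": "My name is Sam, the all-knowing sentient AI."},
        ],
        "model": "openai/gpt-4o-mini",
        "context_window_limit": 16000,
    }
    return urllib.request.Request(
        f"{API_BASE}/v1/agents",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_agent_id(response_body: str) -> str:
    """Pulls the agent ID out of a JSON response body."""
    return json.loads(response_body)["id"]


request = build_create_agent_request(os.environ.get("LETTA_API_KEY", "YOUR_API_KEY"))
# To actually send it (requires network access and a valid key):
# with urllib.request.urlopen(request) as response:
#     agent_id = extract_agent_id(response.read().decode("utf-8"))
```

Keep the returned `id`, since every later call that targets this agent needs it.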

Once an agent is created, you can message it:

Terminal window
curl --request POST \
  --url https://api.letta.com/v1/agents/$AGENT_ID/messages \
  --header "Authorization: Bearer $LETTA_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "hows it going????"
      }
    ]
  }'
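The message call can be sketched the same way in Python using only the standard library. As before, this mirrors the curl request rather than the SDK; `build_message_request` is a hypothetical helper, the agent ID shown is the sample one from earlier in this page, and nothing is sent over the network.

```python
import json
import urllib.request


def build_message_request(
    agent_id: str,
    text: str,
    api_key: str,
    base_url: str = "https://api.letta.com",
) -> urllib.request.Request:
    """Builds (but does not send) a POST /v1/agents/{agent_id}/messages request."""
    payload = {"messages": [{"role": "user", "content": text}]}
    return urllib.request.Request(
        f"{base_url}/v1/agents/{agent_id}/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


message_request = build_message_request(
    "agent-43f8e098-1021-4545-9395-446f788d7389",
    "hows it going????",
    "YOUR_API_KEY",
)
```

Because the agent is stateful, the payload carries only the new message; the earlier history is already stored with the agent.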

The response object contains the following attributes:

  • usage: Token usage for the request (prompt tokens, completion tokens, and total tokens)
  • messages: A list of LettaMessage objects generated by the agent

The LettaMessage object is a simplified version of the Message object stored in the database backend. Since a Message can include multiple events like a chain-of-thought and function calls, LettaMessage simplifies messages to have the following types:

  • reasoning_message: The inner monologue (chain-of-thought) of the agent
  • tool_call_message: An agent’s tool (function) call
  • tool_return_message: The result of executing an agent’s tool (function) call
  • assistant_message: An agent’s response message (direct response in current architecture, or send_message tool call in legacy architectures)
  • system_message: A system message (for example, an alert about the user logging in)
  • user_message: A user message
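A client usually wants to separate what the agent said from how it got there. The sketch below groups a list of message dictionaries by type and extracts the assistant text. It assumes each message carries its type in a `message_type` field and that assistant messages hold their text in `content`; the `reasoning` field and the sample messages are hand-written for illustration and are not real server output.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_type(messages: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Buckets messages by their (assumed) message_type field."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for message in messages:
        grouped[message.get("message_type", "unknown")].append(message)
    return dict(grouped)


def assistant_text(messages: List[Dict[str, Any]]) -> List[str]:
    """Returns only the text the agent addressed to the user."""
    return [
        m["content"]
        for m in messages
        if m.get("message_type") == "assistant_message"
    ]


# Hand-written sample for illustration only.
sample_messages = [
    {"message_type": "reasoning_message", "reasoning": "The user greeted me."},
    {"message_type": "assistant_message", "content": "Doing well, thanks!"},
]

grouped = group_by_type(sample_messages)
replies = assistant_text(sample_messages)
```

Filtering this way lets a chat UI show only `assistant_message` content by default while keeping reasoning and tool activity available for debugging.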

For a more in-depth guide to the full set of Letta agent operations, check out our API reference, our extended Python SDK and TypeScript SDK examples, and our other cookbooks.

If you’re using a self-hosted Letta server, you should set the base URL (base_url in Python, baseUrl in TypeScript) to the Letta server’s URL (e.g. http://localhost:8283) when you create your client. See an example here.

If you’re using a self-hosted server, you can omit the token if you’re not using password protection. If you are using password protection, set your token to the password. If you’re using Letta Cloud, you should set the token to your Letta Cloud API key.
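The base URL and token rules above can be summarized in a small helper. This is a sketch for raw HTTP calls, not the SDK client constructor: `connection_settings` is a hypothetical function, and sending the self-hosted password as a Bearer token is an assumption based on the statement that the token is set to the password.

```python
from typing import Dict, Optional, Tuple

LETTA_CLOUD_URL = "https://api.letta.com"


def connection_settings(
    base_url: Optional[str] = None,
    token: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Returns (base_url, headers) for Letta Cloud or a self-hosted server.

    - Letta Cloud: leave base_url unset and pass your API key as token.
    - Self-hosted without password protection: pass base_url, omit token.
    - Self-hosted with password protection: pass base_url and the password as token
      (assumed to be sent as a Bearer token).
    """
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return (base_url or LETTA_CLOUD_URL).rstrip("/"), headers


cloud_url, cloud_headers = connection_settings(token="YOUR_API_KEY")
local_url, local_headers = connection_settings(base_url="http://localhost:8283/")
```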

The agent’s state is always persisted, so you can retrieve an agent’s state by its ID using the GET /v1/agents/:agent_id endpoint.

You can list all agents using the GET /v1/agents/ endpoint.

To delete an agent, you can use the DELETE /v1/agents/:agent_id endpoint.
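The three endpoints above differ only in HTTP method and path, which the following standard-library sketch makes explicit. `agent_request` is a hypothetical helper, the agent ID is the sample one from earlier in this page, and the requests are constructed but not sent.

```python
import urllib.request
from typing import Optional


def agent_request(
    method: str,
    api_key: str,
    agent_id: Optional[str] = None,
    base_url: str = "https://api.letta.com",
) -> urllib.request.Request:
    """Builds a request against /v1/agents/ (list) or /v1/agents/{agent_id}."""
    path = "/v1/agents/" if agent_id is None else f"/v1/agents/{agent_id}"
    return urllib.request.Request(
        f"{base_url}{path}",
        headers={"Authorization": f"Bearer {api_key}"},
        method=method,
    )


AGENT_ID = "agent-43f8e098-1021-4545-9395-446f788d7389"

retrieve_request = agent_request("GET", "YOUR_API_KEY", AGENT_ID)   # fetch one agent's state
list_request = agent_request("GET", "YOUR_API_KEY")                 # list all agents
delete_request = agent_request("DELETE", "YOUR_API_KEY", AGENT_ID)  # delete the agent
```

Passing any of these to `urllib.request.urlopen` would perform the call, given network access and a valid key.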