Long-Running Executions

When agents need to execute multiple tool calls or perform complex operations (like deep research, data analysis, or multi-step workflows), processing time can vary significantly.

Letta supports various ways to handle long-running agents, so you can choose the approach that best fits your use case:

| Use Case | Duration | Recommendation | Key Benefits |
| --- | --- | --- | --- |
| Few-step invocations | < 1 minute | Standard streaming | Simplest approach |
| Variable-length runs | 1-10 minutes | Background mode (keepalive + timeout as a second choice) | Easy way to reduce timeouts |
| Deep research | 10+ minutes | Background mode, or async polling | Survives disconnects, resumable streams |
| Batch jobs | Any | Async polling | Fire-and-forget, check results later |
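The decision table above can be expressed as a small helper. This is a hypothetical convenience function (not part of any Letta SDK) that maps an expected run duration to the recommended approach:

```python
# Hypothetical helper mirroring the decision table above: pick an
# execution strategy from the expected run duration (in minutes).
def recommend_strategy(expected_minutes: float, batch: bool = False) -> str:
    if batch:
        return "async polling"          # fire-and-forget, check later
    if expected_minutes < 1:
        return "standard streaming"     # simplest approach
    if expected_minutes <= 10:
        return "background mode"        # easy way to reduce timeouts
    return "background mode or async polling"  # survives disconnects

print(recommend_strategy(0.5))  # standard streaming
print(recommend_strategy(30))   # background mode or async polling
```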

Option 1: Background Mode with Resumable Streaming


Background mode decouples agent execution from your client connection. The agent processes your request on the server while streaming results to a persistent store, allowing you to reconnect and resume from any point — even if your application crashes or network fails.

curl --request POST \
--url https://api.letta.com/v1/agents/$AGENT_ID/messages/stream \
--header "Authorization: Bearer $LETTA_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"messages": [
{
"role": "user",
"content": "Run comprehensive analysis on this dataset"
}
],
"stream_tokens": true,
"background": true
}'
# Response stream includes run_id and seq_id for each chunk:
data: {"run_id":"run-123","seq_id":0,"message_type":"reasoning_message","reasoning":"Analyzing"}
data: {"run_id":"run-123","seq_id":1,"message_type":"reasoning_message","reasoning":" the dataset"}
data: {"run_id":"run-123","seq_id":2,"message_type":"tool_call","tool_call":{...}}
# ... stream continues
# Step 2: If disconnected, resume from the last received seq_id
curl --request GET \
--url "https://api.letta.com/v1/runs/$RUN_ID/stream?starting_after=57" \
--header "Authorization: Bearer $LETTA_API_KEY" \
--header 'Accept: text/event-stream'
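On the client side, resuming requires remembering the cursor. A minimal sketch of that bookkeeping, assuming SSE `data:` lines shaped like the example chunks above: parse each line, keep the latest `run_id` and `seq_id`, and pass the last seen `seq_id` as `starting_after` when reconnecting.

```python
import json

# Sketch of client-side cursor tracking for a resumable stream:
# parse each SSE "data:" line and remember the last seq_id seen.
def track_cursor(sse_lines):
    """Return (run_id, last_seq_id) observed in an iterator of SSE lines."""
    run_id, last_seq = None, None
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip SSE comments and blank keepalive lines
        chunk = json.loads(line[len("data:"):])
        run_id = chunk.get("run_id", run_id)
        last_seq = chunk.get("seq_id", last_seq)
    return run_id, last_seq

lines = [
    'data: {"run_id":"run-123","seq_id":0,"message_type":"reasoning_message","reasoning":"Analyzing"}',
    'data: {"run_id":"run-123","seq_id":1,"message_type":"reasoning_message","reasoning":" the dataset"}',
]
print(track_cursor(lines))  # ('run-123', 1) -> resume with starting_after=1
```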

When Human‑in‑the‑Loop (HITL) approval is enabled for a tool, your background stream may pause and emit an approval_request_message. In background mode, send the approval via a separate background stream and capture that stream’s run_id/seq_id.

# 1) Start background stream; capture approval request
curl --request POST \
--url https://api.letta.com/v1/agents/$AGENT_ID/messages/stream \
--header "Authorization: Bearer $LETTA_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"messages": [{"role": "user", "content": "Do a sensitive operation"}],
"stream_tokens": true,
"background": true
}'
# Example stream output (approval request arrives):
data: {"run_id":"run-abc","seq_id":0,"message_type":"reasoning_message","reasoning":"..."}
data: {"run_id":"run-abc","seq_id":1,"message_type":"approval_request_message","id":"message-abc","tool_call":{"name":"sensitive_operation","arguments":"{...}","tool_call_id":"tool-xyz"}}
# 2) Approve in background; capture approval stream cursor (this creates a new run)
curl --request POST \
--url https://api.letta.com/v1/agents/$AGENT_ID/messages/stream \
--header "Authorization: Bearer $LETTA_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"messages": [{"type": "approval", "approve": true, "approval_request_id": "message-abc"}],
"stream_tokens": true,
"background": true
}'
# Example approval stream output (tool result arrives here):
data: {"run_id":"run-new","seq_id":0,"message_type":"tool_return_message","status":"success","tool_return":"..."}
# 3) Resume the approval stream's run to continue
curl --request GET \
--url "https://api.letta.com/v1/runs/$RUN_ID/stream?starting_after=0" \
--header "Authorization: Bearer $LETTA_API_KEY" \
--header 'Accept: text/event-stream'
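The approval handshake above can be sketched as client-side logic: scan incoming chunks for an `approval_request_message`, then build the follow-up request body that approves (or denies) the tool call. The chunk and message shapes mirror the examples above; the helper itself is illustrative, not an SDK function.

```python
# Sketch of the HITL approval handshake: detect an approval request
# in the stream and construct the approval message body to send back
# (as a new background stream, per the flow above).
def build_approval(chunks, approve=True):
    for chunk in chunks:
        if chunk.get("message_type") == "approval_request_message":
            return {
                "messages": [{
                    "type": "approval",
                    "approve": approve,
                    "approval_request_id": chunk["id"],
                }],
                "stream_tokens": True,
                "background": True,
            }
    return None  # nothing to approve

request = {"message_type": "approval_request_message", "id": "message-abc",
           "tool_call": {"name": "sensitive_operation"}}
body = build_approval([request])
print(body["messages"][0]["approval_request_id"])  # message-abc
```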

When your application starts or recovers from a crash, you can check for any active background streams and resume them. This is particularly useful for:

  • Application restarts: Resume processing after deployments or crashes
  • Load balancing: Pick up streams started by other instances
  • Monitoring: Check progress of long-running operations from different clients
# Step 1: Find active background streams for your agents
curl --request GET \
--url "https://api.letta.com/v1/runs/active?agent_ids=agent-123&agent_ids=agent-456&background=true" \
--header "Authorization: Bearer $LETTA_API_KEY"
# Returns: [{"run_id": "run-abc", "agent_id": "agent-123", "status": "processing", ...}]
# Step 2: Resume streaming from the beginning (or any seq_id)
# starting_after=0 starts from the beginning; batch_size fetches
# historical chunks in larger batches
curl --request GET \
--url "https://api.letta.com/v1/runs/$RUN_ID/stream?starting_after=0&batch_size=1000" \
--header "Authorization: Bearer $LETTA_API_KEY" \
--header 'Accept: text/event-stream'
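A crash-recovery routine built on the two calls above might look like this sketch: take the active-runs listing, keep only runs still processing, and compute the stream URLs to reconnect to from the beginning. The URL shape follows the resume example above; the helper is illustrative.

```python
# Sketch of crash recovery: from the active-runs response, build the
# resume URLs a client would reconnect to after a restart.
BASE = "https://api.letta.com/v1"

def resume_targets(active_runs):
    return [
        f"{BASE}/runs/{run['run_id']}/stream?starting_after=0"
        for run in active_runs
        if run.get("status") == "processing"  # skip finished runs
    ]

active = [{"run_id": "run-abc", "agent_id": "agent-123", "status": "processing"}]
print(resume_targets(active))
```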

Option 2: Async Polling

Ideal for batch processing, scheduled jobs, or when you don’t need real-time updates. The async endpoint queues your request and returns immediately, letting you check results later:

# Start async operation (returns immediately with run ID)
curl --request POST \
--url https://api.letta.com/v1/agents/$AGENT_ID/messages/async \
--header "Authorization: Bearer $LETTA_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"messages": [
{
"role": "user",
"content": "Run comprehensive analysis on this dataset"
}
]
}'
# Poll for results using the returned run ID
curl --request GET \
--url https://api.letta.com/v1/runs/$RUN_ID \
--header "Authorization: Bearer $LETTA_API_KEY"
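The polling step is usually wrapped in a loop. A minimal sketch, where `fetch_run` stands in for the GET `/v1/runs/$RUN_ID` call above and the terminal status names (`completed`, `failed`, `cancelled`) are assumptions about the run object, not confirmed by the docs above:

```python
import time

# Sketch of a poll-until-done loop for an async run. fetch_run is a
# callable standing in for the GET /v1/runs/$RUN_ID request.
def poll_until_done(fetch_run, run_id, interval=2.0, max_polls=100):
    for _ in range(max_polls):
        run = fetch_run(run_id)
        # Assumed terminal statuses; adjust to the actual run schema.
        if run["status"] in ("completed", "failed", "cancelled"):
            return run
        time.sleep(interval)
    raise TimeoutError(f"run {run_id} still not finished")

# Stub that completes on the third poll, so the loop runs offline
responses = iter([{"status": "created"}, {"status": "processing"},
                  {"status": "completed"}])
result = poll_until_done(lambda rid: next(responses), "run-123", interval=0)
print(result["status"])  # completed
```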

Option 3: Configure Streaming with Keepalive Pings and Longer Timeouts


Use this option for operations under 10 minutes that need real-time updates without the complexity of background processing. Configure keepalive pings and longer timeouts to maintain stable connections:

curl --request POST \
--url https://api.letta.com/v1/agents/$AGENT_ID/messages/stream \
--header "Authorization: Bearer $LETTA_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"messages": [
{
"role": "user",
"content": "Execute this long-running analysis"
}
],
"include_pings": true
}'
| Parameter | Purpose | When to Use |
| --- | --- | --- |
| Timeout (seconds) | Extends the request timeout beyond the 60s default | Set to 1.5x your expected maximum duration |
| Include pings | Sends keepalive messages every ~30s | Enable for operations with long gaps between outputs |
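With pings enabled, a client typically filters the keepalive chunks out before processing real messages. A minimal sketch, assuming ping chunks arrive with a `"ping"` message type (an assumption about the wire format, not confirmed above):

```python
# Sketch of client-side keepalive handling: drop ping chunks before
# handing the stream to application logic. The "ping" message_type is
# an assumed wire format, not confirmed by the docs above.
def drop_pings(chunks):
    return [c for c in chunks if c.get("message_type") != "ping"]

stream = [
    {"message_type": "reasoning_message", "reasoning": "Working..."},
    {"message_type": "ping"},  # keepalive, carries no content
    {"message_type": "tool_call", "tool_call": {}},
]
print(len(drop_pings(stream)))  # 2
```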