# Searching & Querying
## Search result format

## Writing effective queries

Letta uses OpenAI's `text-embedding-3-small` model, which handles natural language questions well. Agents can use various query styles:
Natural language questions work best:
```python
# What the agent does (agent tool call)
archival_memory_search(query="How does the test work?")
# Returns: "The Voight-Kampff test measures involuntary emotional responses..."
```

Keywords also work:
```python
# What the agent does (agent tool call)
archival_memory_search(query="replicant lifespan")
# Returns memories containing both keywords and semantically related concepts
```

Concept-based queries leverage semantic understanding:
```python
# What the agent does (agent tool call)
archival_memory_search(query="artificial memories")
# Returns: "...experimental replicant with implanted memories..."
# (semantic match despite different terminology)
```

## Filtering by time
Agents can search by date ranges:
```python
# What the agent does (agent tool call)

# Recent memories
archival_memory_search(
    query="test results",
    start_datetime="2025-09-29T00:00:00"
)

# Specific time window
archival_memory_search(
    query="replicant cases",
    start_datetime="2025-09-29T00:00:00",
    end_datetime="2025-09-30T23:59:59"
)
```

## Tags and organization
Tags help agents organize and filter archival memories. Agents always know what tags exist in their archive since tag lists are compiled into the context window.
Common tag patterns:
- `user_info`, `professional`, `personal_history`
- `documentation`, `technical`, `reference`
- `conversation`, `milestone`, `event`
- `company_policy`, `procedure`, `guideline`
Tag search modes:
- Match any tag
- Match all tags
- Filter by date ranges
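The first two modes differ in how a memory's tags are compared against the query's tags. A minimal sketch of those semantics as plain set operations (illustrative only; Letta's actual tool and SDK parameter names may differ):

```python
# "Match any" vs "match all" tag filtering, expressed as set operations.
# The mode names here are assumptions for illustration.

def matches(memory_tags, query_tags, mode="any"):
    """True if a memory's tags satisfy the query under the given mode."""
    memory_tags, query_tags = set(memory_tags), set(query_tags)
    if mode == "any":
        return bool(memory_tags & query_tags)  # at least one tag in common
    return query_tags <= memory_tags           # every query tag present

memories = [
    {"content": "Nexus-6 lifespan data", "tags": ["technical", "replicant"]},
    {"content": "Case interview notes", "tags": ["conversation", "replicant"]},
]

# "any" is permissive, "all" narrows results to fully tagged memories.
any_hits = [m for m in memories if matches(m["tags"], ["technical", "conversation"], mode="any")]
all_hits = [m for m in memories if matches(m["tags"], ["technical", "replicant"], mode="all")]
```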
Example of organized tagging:
```python
# What the agent does (agent tool call)

# Atomic memory with precise tags
archival_memory_insert(
    content="Nexus-6 replicants have a four-year lifespan",
    tags=["technical", "replicant", "nexus-6"]
)

# Later, easy retrieval
archival_memory_search(
    query="how long do replicants live",
    tags=["technical"]
)
```

## Performance and scale
### Embedding models and search quality
Archival search quality depends on the agent's embedding model:
**Letta Cloud**: All agents use `text-embedding-3-small`, which is optimized for most use cases. This model cannot be changed.
**Self-hosted**: The embedding model is pinned to the agent at creation. The default `text-embedding-3-small` is sufficient for nearly all use cases.
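Search quality depends on the embedding model because retrieval ranks memories by vector similarity rather than keyword overlap. A toy cosine-similarity sketch of that idea (illustrative only; real `text-embedding-3-small` vectors have 1536 dimensions and scoring happens server-side):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (made up for illustration).
query = [0.9, 0.1, 0.0]         # "how long do replicants live"
memory_close = [0.8, 0.2, 0.1]  # "Nexus-6 ... four-year lifespan"
memory_far = [0.0, 0.1, 0.9]    # unrelated memory

# The semantically related memory scores higher even with no shared keywords.
```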
### Changing embedding models (self-hosted only)
To change an agent's embedding model, you must:
1. List and export all archival memories
2. Delete all archival memories
3. Update the agent's embedding model
4. Re-insert all memories (they'll be re-embedded)
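The flow above can be sketched end-to-end. This uses an in-memory stand-in for the archival store so the logic is self-contained; real code would call `client.agents.passages.list`/`insert` (shown in the SDK section below) plus delete and agent-update endpoints:

```python
# Hedged sketch of the migration flow. FakeArchive is a local stand-in for
# the agent's archival store, not Letta's API.

class FakeArchive:
    def __init__(self):
        self._passages = []

    def list(self):
        return list(self._passages)

    def delete_all(self):
        self._passages.clear()

    def insert(self, content, tags):
        # On a real server, insertion embeds the passage with the agent's
        # current embedding model.
        self._passages.append({"content": content, "tags": tags})

archive = FakeArchive()
archive.insert("Nexus-6 replicants have a four-year lifespan", ["technical"])

# 1. List and export all archival memories
exported = archive.list()
# 2. Delete all archival memories
archive.delete_all()
# 3. Update the agent's embedding model (agents API; omitted here)
# 4. Re-insert all memories; each is re-embedded with the new model
for passage in exported:
    archive.insert(passage["content"], passage["tags"])
```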
## Programmatic access (SDK)

Developers can manage archival memory programmatically via the SDK:
TypeScript:

```typescript
// Insert a memory
await client.agents.passages.insert(agent.id, {
  content: "The Voight-Kampff test requires a minimum of 20 cross-referenced questions",
  tags: ["technical", "testing", "protocol"],
});

// Search memories
const results = await client.agents.passages.search(agent.id, {
  query: "testing procedures",
  tags: ["protocol"],
  page: 0,
});

// List all memories
const passages = await client.agents.passages.list(agent.id, {
  limit: 100,
});

// Get a specific memory
const passage = await client.agents.passages.get(agent.id, passageId);
```

Python:

```python
# Insert a memory
client.agents.passages.insert(
    agent_id=agent.id,
    content="The Voight-Kampff test requires a minimum of 20 cross-referenced questions",
    tags=["technical", "testing", "protocol"]
)

# Search memories
results = client.agents.passages.search(
    agent_id=agent.id,
    query="testing procedures",
    tags=["protocol"],
    page=0
)

# List all memories
passages = client.agents.passages.list(
    agent_id=agent.id,
    limit=100
)

# Get a specific memory
passage = client.agents.passages.get(
    agent_id=agent.id,
    passage_id=passage_id
)
```
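The `page` argument to `search` suggests results come back paginated. A hedged sketch of draining every page, using a local stub in place of `client.agents.passages.search` (the page size and the "empty page means done" stop condition are assumptions, not documented behavior):

```python
# Stub standing in for a paginated search endpoint; the page size and
# termination rule are assumptions for illustration.
ALL_PASSAGES = [f"passage-{i}" for i in range(25)]
PAGE_SIZE = 10  # assumed page size

def search_page(query, page):
    start = page * PAGE_SIZE
    return ALL_PASSAGES[start:start + PAGE_SIZE]

# Drain pages until one comes back empty.
collected, page = [], 0
while True:
    batch = search_page("testing procedures", page=page)
    if not batch:
        break
    collected.extend(batch)
    page += 1
```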