# Searching & Querying
## Search result format

## Writing effective queries

Letta uses OpenAI's `text-embedding-3-small` model, which handles natural language questions well. Agents can use various query styles:
Natural language questions work best:
```python
# What the agent does (agent tool call)
archival_memory_search(query="How does the test work?")
# Returns: "The Voight-Kampff test measures involuntary emotional responses..."
```

Keywords also work:
```python
# What the agent does (agent tool call)
archival_memory_search(query="replicant lifespan")
# Returns memories containing both keywords and semantically related concepts
```

Concept-based queries leverage semantic understanding:
```python
# What the agent does (agent tool call)
archival_memory_search(query="artificial memories")
# Returns: "...experimental replicant with implanted memories..."
# (semantic match despite different terminology)
```

## Filtering by time
Agents can search by date ranges:
```python
# What the agent does (agent tool call)

# Recent memories
archival_memory_search(
    query="test results",
    start_datetime="2025-09-29T00:00:00"
)

# Specific time window
archival_memory_search(
    query="replicant cases",
    start_datetime="2025-09-29T00:00:00",
    end_datetime="2025-09-30T23:59:59"
)
```

## Tags and organization
Tags help agents organize and filter archival memories. Agents always know what tags exist in their archive since tag lists are compiled into the context window.
Common tag patterns:
- `user_info`, `professional`, `personal_history`
- `documentation`, `technical`, `reference`
- `conversation`, `milestone`, `event`
- `company_policy`, `procedure`, `guideline`
Tag search modes:
- Match any tag
- Match all tags
- Filter by date ranges
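The first two modes differ in how a memory's tags are compared against the query's tags. A minimal sketch of those semantics as plain set operations (illustrative only; Letta's actual tool and SDK parameter names may differ):

```python
# "Match any" vs "match all" tag filtering, expressed as set operations.
# The mode names here are assumptions for illustration.

def matches(memory_tags, query_tags, mode="any"):
    """True if a memory's tags satisfy the query under the given mode."""
    memory_tags, query_tags = set(memory_tags), set(query_tags)
    if mode == "any":
        return bool(memory_tags & query_tags)  # at least one tag in common
    return query_tags <= memory_tags           # every query tag present

memories = [
    {"content": "Nexus-6 lifespan data", "tags": ["technical", "replicant"]},
    {"content": "Case interview notes", "tags": ["conversation", "replicant"]},
]

# "any" is permissive, "all" narrows results to fully tagged memories.
any_hits = [m for m in memories if matches(m["tags"], ["technical", "conversation"], mode="any")]
all_hits = [m for m in memories if matches(m["tags"], ["technical", "replicant"], mode="all")]
```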
Example of organized tagging:
```python
# What the agent does (agent tool call)

# Atomic memory with precise tags
archival_memory_insert(
    content="Nexus-6 replicants have a four-year lifespan",
    tags=["technical", "replicant", "nexus-6"]
)

# Later, easy retrieval
archival_memory_search(
    query="how long do replicants live",
    tags=["technical"]
)
```

## Performance and scale
### Embedding models and search quality
Archival search quality depends on the agent's embedding model:
**Letta Cloud**: All agents use `text-embedding-3-small`, which is optimized for most use cases. This model cannot be changed.
**Self-hosted**: The embedding model is pinned to the agent at creation. The default `text-embedding-3-small` is sufficient for nearly all use cases.
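Search quality depends on the embedding model because retrieval ranks memories by vector similarity rather than keyword overlap. A toy cosine-similarity sketch of that idea (illustrative only; real `text-embedding-3-small` vectors have 1536 dimensions and scoring happens server-side):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (made up for illustration).
query = [0.9, 0.1, 0.0]         # "how long do replicants live"
memory_close = [0.8, 0.2, 0.1]  # "Nexus-6 ... four-year lifespan"
memory_far = [0.0, 0.1, 0.9]    # unrelated memory

# The semantically related memory scores higher even with no shared keywords.
```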
### Changing embedding models (self-hosted only)
To change an agent's embedding model, you must:
1. List and export all archival memories
2. Delete all archival memories
3. Update the agent's embedding model
4. Re-insert all memories (they'll be re-embedded)
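The flow above can be sketched end-to-end. This uses an in-memory stand-in for the archival store so the logic is self-contained; real code would call `client.agents.passages.list`/`insert` (shown in the SDK section below) plus delete and agent-update endpoints:

```python
# Hedged sketch of the migration flow. FakeArchive is a local stand-in for
# the agent's archival store, not Letta's API.

class FakeArchive:
    def __init__(self):
        self._passages = []

    def list(self):
        return list(self._passages)

    def delete_all(self):
        self._passages.clear()

    def insert(self, content, tags):
        # On a real server, insertion embeds the passage with the agent's
        # current embedding model.
        self._passages.append({"content": content, "tags": tags})

archive = FakeArchive()
archive.insert("Nexus-6 replicants have a four-year lifespan", ["technical"])

# 1. List and export all archival memories
exported = archive.list()
# 2. Delete all archival memories
archive.delete_all()
# 3. Update the agent's embedding model (agents API; omitted here)
# 4. Re-insert all memories; each is re-embedded with the new model
for passage in exported:
    archive.insert(passage["content"], passage["tags"])
```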
## Programmatic access (SDK)

Developers can manage archival memory programmatically via the SDK:
TypeScript:

```typescript
// Insert a memory
await client.agents.passages.insert(agent.id, {
  content: "The Voight-Kampff test requires a minimum of 20 cross-referenced questions",
  tags: ["technical", "testing", "protocol"],
});

// Search memories
const results = await client.agents.passages.search(agent.id, {
  query: "testing procedures",
  tags: ["protocol"],
  page: 0,
});

// List all memories
const passages = await client.agents.passages.list(agent.id, {
  limit: 100,
});

// Get a specific memory
const passage = await client.agents.passages.get(agent.id, passageId);
```

Python:

```python
# Insert a memory
client.agents.passages.insert(
    agent_id=agent.id,
    content="The Voight-Kampff test requires a minimum of 20 cross-referenced questions",
    tags=["technical", "testing", "protocol"]
)

# Search memories
results = client.agents.passages.search(
    agent_id=agent.id,
    query="testing procedures",
    tags=["protocol"],
    page=0
)

# List all memories
passages = client.agents.passages.list(
    agent_id=agent.id,
    limit=100
)

# Get a specific memory
passage = client.agents.passages.get(
    agent_id=agent.id,
    passage_id=passage_id
)
```
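The `page` argument to `search` suggests results come back paginated. A hedged sketch of draining every page, using a local stub in place of `client.agents.passages.search` (the page size and the "empty page means done" stop condition are assumptions, not documented behavior):

```python
# Stub standing in for a paginated search endpoint; the page size and
# termination rule are assumptions for illustration.
ALL_PASSAGES = [f"passage-{i}" for i in range(25)]
PAGE_SIZE = 10  # assumed page size

def search_page(query, page):
    start = page * PAGE_SIZE
    return ALL_PASSAGES[start:start + PAGE_SIZE]

# Drain pages until one comes back empty.
collected, page = [], 0
while True:
    batch = search_page("testing procedures", page=page)
    if not batch:
        break
    collected.extend(batch)
    page += 1
```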