Building an AI Agent with Memory

This guide walks through building a complete AI agent with persistent memory: one that remembers past conversations, recalls relevant context automatically, and backs every memory with an on-chain Merkle proof. This is the same architecture powering the live demo at demo.veclabs.xyz.

What you’ll build

An AI agent that:
  • Stores every significant interaction as a vector memory
  • Automatically retrieves relevant past memories for each new message
  • Builds richer, more personalized responses over time
  • Has an immutable on-chain record of its memory state

Architecture

User message
      │
      ▼
Embed message
      │
      ▼
Query VecLabs
(retrieve relevant memories)
      │
      ▼
Build context:
[system prompt]
[relevant memories]
[recent conversation]
      │
      ▼
LLM generates response
      │
      ├─► Embed + store new memory
      │   (upsert to VecLabs)
      ▼
Response to user

Complete implementation

import { SolVec } from '@veclabs/solvec';

const sv = new SolVec({ network: 'devnet' });

const DIMENSIONS = 1536;
const MAX_MEMORIES_PER_QUERY = 5;
const MEMORY_RELEVANCE_THRESHOLD = 0.72;

// One collection per agent instance
const memory = sv.collection('agent-memory', { dimensions: DIMENSIONS });

// Short-term context window (not persisted to VecLabs)
const recentMessages: Array<{ role: string; content: string }> = [];

async function chat(userMessage: string, sessionId: string): Promise<string> {
  // 1. Embed the incoming message
  const messageEmbedding = await embed(userMessage);

  // 2. Recall relevant long-term memories
  const relevantMemories = await memory.query({
    vector: messageEmbedding,
    topK: MAX_MEMORIES_PER_QUERY,
    minScore: MEMORY_RELEVANCE_THRESHOLD,
  });

  // 3. Build the system prompt with recalled memories
  const systemPrompt = buildSystemPrompt(relevantMemories);

  // 4. Add the user message to short-term context
  recentMessages.push({ role: 'user', content: userMessage });

  // Keep only the last 10 messages in short-term context
  const contextMessages = recentMessages.slice(-10);

  // 5. Generate the response
  const response = await callLLM({
    system: systemPrompt,
    messages: contextMessages,
  });

  // 6. Add the response to short-term context
  recentMessages.push({ role: 'assistant', content: response });

  // 7. Store the interaction as a long-term memory
  const truncated = response.length > 200 ? `${response.slice(0, 200)}...` : response;
  const memoryText = `User said: "${userMessage}". I responded: "${truncated}"`;
  const memoryEmbedding = await embed(memoryText);

  await memory.upsert([{
    id: `mem_${sessionId}_${Date.now()}`,
    values: memoryEmbedding,
    metadata: {
      text: memoryText,
      userMessage,
      agentResponse: response.slice(0, 500),
      sessionId,
      timestamp: new Date().toISOString(),
    },
  }]);

  return response;
}

function buildSystemPrompt(memories: any[]): string {
  const base = `You are a helpful AI assistant with persistent memory.
You remember past conversations and use them to give better, more personalized responses.`;

  if (memories.length === 0) return base;

  const memoryContext = memories
    .map(m => `- ${m.metadata.text}`)
    .join('\n');

  return `${base}

Relevant memories from past conversations:
${memoryContext}

Use these memories naturally in your response when relevant.
Don't explicitly say "I remember" unless it adds value.`;
}

async function embed(text: string): Promise<number[]> {
  // Replace with your embedding provider
  return Array(DIMENSIONS).fill(0).map(() => Math.random());
}

async function callLLM(params: { system: string; messages: any[] }): Promise<string> {
  // Replace with your LLM provider (OpenAI, Anthropic, Gemini, etc.)
  return 'Response from LLM';
}
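
How minScore filtering behaves

The MEMORY_RELEVANCE_THRESHOLD of 0.72 passed as minScore drops weakly related memories before they reach the prompt. The sketch below illustrates the idea with plain cosine similarity; it is an assumption for illustration, not the VecLabs scoring internals (check your collection's distance metric before tuning the threshold):

```typescript
// Illustrative only — not the VecLabs implementation.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep only candidates scoring at or above the threshold, best first.
function filterByRelevance(
  query: number[],
  candidates: Array<{ id: string; values: number[] }>,
  minScore: number,
): Array<{ id: string; score: number }> {
  return candidates
    .map(c => ({ id: c.id, score: cosineSimilarity(query, c.values) }))
    .filter(r => r.score >= minScore)
    .sort((a, b) => b.score - a.score);
}
```

A higher threshold means fewer, more precise recalls; a lower one surfaces more memories at the cost of occasional irrelevant context in the prompt.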


Try the live demo

demo.veclabs.xyz runs exactly this architecture, with Gemini as the LLM and VecLabs for memory. Open it and tell it something, then start a new conversation and reference what you said before. It remembers. Every memory is stored with an on-chain Merkle proof you can inspect on Solana Explorer.

Next steps

  • Add importance scoring to memories
  • Implement memory summarization for long-running agents
  • Add memory decay for time-sensitive information
  • See the Use Cases: Agent Memory page for advanced patterns
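
For the memory-decay bullet above, one common approach is to down-weight older memories at re-ranking time with exponential time decay. A minimal sketch, where the one-week half-life and the multiplicative blend are illustrative assumptions rather than part of the VecLabs API:

```typescript
// Assumed half-life: a memory's weight halves every week.
const HALF_LIFE_MS = 7 * 24 * 60 * 60 * 1000;

// Blend similarity with recency so fresher memories win ties.
// `timestamp` matches the ISO string stored in memory metadata.
function decayedScore(
  similarity: number,
  timestamp: string,
  now: number = Date.now(),
): number {
  const ageMs = now - new Date(timestamp).getTime();
  const decay = Math.pow(0.5, ageMs / HALF_LIFE_MS);
  return similarity * decay;
}
```

Applied after the query step, this lets you over-fetch (e.g. topK: 20), re-score each result with decayedScore, and keep the top five, so stale but superficially similar memories stop crowding out recent context.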