Memory Service

AG-Kit’s Memory Service provides comprehensive memory management for AI agents, supporting both short-term conversation history and long-term knowledge persistence. The service offers multiple implementations for different deployment scenarios, from development to enterprise production.

Overview

The Memory Service consists of two main components:
  • Short-term Memory: Manages conversation history, context windows, and session state
  • Long-term Memory: Stores persistent knowledge, user preferences, and extracted insights
Both components support intelligent context management, automatic token optimization, and seamless integration with AI agent frameworks.

Memory Implementations

AG-Kit provides multiple short-term memory implementations for different use cases and deployment scenarios. Each implementation supports the unified branch and summary architecture with context engineering capabilities.

Available Implementations

Quick Comparison

| Implementation | Storage | Scalability | Use Case |
|----------------|---------|-------------|----------|
| InMemoryMemory | Memory (volatile) | Single Instance | Development, Testing |
| TDAIMemory | Cloud | High | Production, Enterprise |
| TypeORMMemory | Database | Medium-High | Custom Schema, Enterprise |
| MySQLMemory | MySQL | High | MySQL-optimized Applications |
| MongoDBMemory | NoSQL | Very High | Flexible Schema, Analytics |
| CloudBaseMemory | Serverless | Auto | Chinese Market, Serverless |

Basic Usage Example

import { InMemoryMemory, TDAIMemory, TypeORMMemory } from '@ag-kit/agents/storage';

// Choose implementation based on your needs
const memory = new InMemoryMemory(); // For development
// const memory = new TDAIMemory({ ... }); // For production
// const memory = new TypeORMMemory({ ... }); // For custom databases

// All implementations share the same interface
await memory.add({
  message: {
    id: 'msg-1',
    role: 'user',
    content: 'Hello, I need help with my project',
    timestamp: new Date()
  },
  state: { userId: 'user-123', context: 'project-help' }
});

// Common operations across all implementations
const events = await memory.list({ limit: 10, maxTokens: 4000 });
const results = await memory.retrieve('project help');
const count = await memory.getCount();

For detailed implementation guides, configuration options, and advanced features, see the individual implementation documentation linked above.

Core Data Structures

Message

  • id (string, required): Unique message identifier
  • role ('user' | 'assistant' | 'system' | 'tool', required): Message role type
  • content (string, required): Message content text
  • timestamp (Date, optional): Message timestamp
  • toolCalls (any[], optional): Tool calls associated with the message
  • toolCallId (string, optional): Tool call identifier

IMemoryEvent

  • message (Message, required): Message data
  • state (Record<string, any>, required): Event state and metadata

Context Engineering

Built-in automatic context management for optimal performance in long conversations.

Configuration Types

ContextThresholds

  • preRotThreshold (number, default: 150_000): Token count at which performance starts degrading
  • compactionTrigger (number, default: 0.8): Trigger compaction at this fraction of preRotThreshold
  • summarizationTrigger (number, default: 0.95): Trigger summarization at this fraction of preRotThreshold

StructuredSummary

  • modifiedFiles (string[]): Files that were changed
  • userGoal (string): User’s objective
  • lastStep (string): Last completed step
  • keyDecisions (string[]): Important decisions made
  • criticalContext (Record<string, any>): Key context to preserve
  • timestamp (Date): When the summary was created
  • preSummaryDump (IMemoryEvent[]): Optional context dump for recovery

Common Features

All memory implementations share the following core capabilities:

Context Engineering

Automatic context management with intelligent compression and summarization:
// Configure context thresholds for automatic management
const memory = new SomeMemory({
  enableContextManagement: true,
  thresholds: {
    preRotThreshold: 8000,        // Start management at 8000 tokens
    compactionTrigger: 0.8,       // Compact at 80% of threshold
    summarizationTrigger: 0.95,   // Summarize at 95% of threshold
    recentToKeep: 10              // Always keep last 10 messages
  }
});

// Context is automatically managed during operations
await memory.add(event); // Auto-triggers compression if needed
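As a sketch of how these thresholds interact (illustrative only; decideContextAction and its types are hypothetical helpers, not part of the AG-Kit API), the trigger values are fractions of preRotThreshold, and summarization takes precedence over compaction:

```typescript
// Hypothetical helper showing the threshold semantics described above.
type ContextAction = 'none' | 'compact' | 'summarize';

interface ContextThresholds {
  preRotThreshold: number;      // token count where performance starts degrading
  compactionTrigger: number;    // fraction of preRotThreshold that triggers compaction
  summarizationTrigger: number; // fraction of preRotThreshold that triggers summarization
}

function decideContextAction(tokenCount: number, t: ContextThresholds): ContextAction {
  // Summarization is the stronger action, so check it first.
  if (tokenCount >= t.preRotThreshold * t.summarizationTrigger) return 'summarize';
  if (tokenCount >= t.preRotThreshold * t.compactionTrigger) return 'compact';
  return 'none';
}

const thresholds: ContextThresholds = {
  preRotThreshold: 8000,
  compactionTrigger: 0.8,
  summarizationTrigger: 0.95,
};

decideContextAction(5000, thresholds); // below 6400 tokens → 'none'
decideContextAction(7000, thresholds); // past 80% (6400) → 'compact'
decideContextAction(7800, thresholds); // past 95% (7600) → 'summarize'
```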

Token Management

Built-in token counting and trimming with multiple strategies:
import { TokenTrimmer, TiktokenTokenizer } from '@ag-kit/agents/storage';

// Create tokenizer and trimmer
const tokenizer = new TiktokenTokenizer('cl100k_base'); // encoding name, per the TiktokenTokenizer API
const trimmer = new TokenTrimmer(tokenizer);

// Count tokens
const tokenCount = tokenizer.countTokens('Hello, world!');

// Trim messages to fit context window
const events = await memory.list();
const trimmed = trimmer.trimMessages(events, 4000, 'newest_first');

Memory Retrieval

Similarity search with relevance scoring across all implementations:
// Semantic search for relevant memories
const results = await memory.retrieve('project help', {
  limit: 5,
  threshold: 0.7,
  sessionId: 'user-session'
});

// Results include relevance scores
results.forEach(event => {
  console.log(`Score: ${event.score}, Content: ${event.message.content}`);
});

Session Management

Multi-session support with isolated conversation contexts:
// Work with different sessions
await memory.add(event1, { sessionId: 'session-A' });
await memory.add(event2, { sessionId: 'session-B' });

// Session utilities
const sessionIds = memory.getSessionIds?.() || [];
const count = await memory.getCount({ sessionId: 'session-A' });
await memory.clear({ sessionId: 'session-A' });
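The isolation guarantee above can be pictured as bucketing events per sessionId. The following is a minimal illustrative sketch (SessionStore and MemEvent are hypothetical names, not AG-Kit classes):

```typescript
// Illustrative session isolation: each sessionId gets its own event bucket,
// so list/count/clear only ever touch one conversation's history.
interface MemEvent { id: string; content: string }

class SessionStore {
  private sessions = new Map<string, MemEvent[]>();

  add(event: MemEvent, sessionId: string): void {
    const bucket = this.sessions.get(sessionId) ?? [];
    bucket.push(event);
    this.sessions.set(sessionId, bucket);
  }

  list(sessionId: string): MemEvent[] {
    return [...(this.sessions.get(sessionId) ?? [])];
  }

  count(sessionId: string): number {
    return this.sessions.get(sessionId)?.length ?? 0;
  }

  clear(sessionId: string): void {
    this.sessions.delete(sessionId);
  }

  sessionIds(): string[] {
    return [...this.sessions.keys()];
  }
}
```

Clearing session A leaves session B untouched, which is exactly the behavior the session utilities above rely on.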

Session Branching

Session branching enables Git-style conversation experimentation and time-travel capabilities. Create experimental conversation paths, test different responses, and rollback to previous states.

Key Features

  • Conversation Experimentation: Try different paths without losing the original
  • Time Travel: Checkout to any previous event and continue from there
  • A/B Testing: Compare different agent responses in parallel branches
  • Undo/Rollback: Easily revert unwanted changes

Basic Operations

// Create and switch to a new branch
await memory.branch('experiment-1');
await memory.checkout('experiment-1');

// List all branches
const branches = await memory.listBranches();

// Time travel to a specific event
await memory.checkout('msg-5', { type: 'event' });

// Clean up unused branches
await memory.cleanupBranches();
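Conceptually, Git-style branching stores each event with a parent pointer and keeps branches as named head pointers; checkout of an older event just moves the head back in time. A self-contained sketch of that model (BranchingMemory is hypothetical, not AG-Kit's internals):

```typescript
// Events form a parent-linked chain; branches are named pointers into it.
interface BranchEvent { id: string; parent: string | null; content: string }

class BranchingMemory {
  private events = new Map<string, BranchEvent>();
  private heads = new Map<string, string | null>([['main', null]]);
  private current = 'main';

  add(id: string, content: string): void {
    this.events.set(id, { id, parent: this.heads.get(this.current) ?? null, content });
    this.heads.set(this.current, id);
  }

  // Create a branch at the current head and switch to it.
  branch(name: string): void {
    this.heads.set(name, this.heads.get(this.current) ?? null);
    this.current = name;
  }

  // Switch branches, or time-travel the current branch to an earlier event.
  checkout(target: string): void {
    if (this.heads.has(target)) this.current = target;
    else this.heads.set(this.current, target);
  }

  listBranches(): string[] {
    return [...this.heads.keys()];
  }

  // Walk parent pointers from the head to reconstruct the branch history.
  history(): string[] {
    const out: string[] = [];
    let cursor = this.heads.get(this.current) ?? null;
    while (cursor) {
      const e = this.events.get(cursor)!;
      out.unshift(e.content);
      cursor = e.parent;
    }
    return out;
  }
}
```

Because the original chain is never mutated, switching back to main after experimenting restores the pre-branch conversation exactly.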

Implementation Support

| Implementation | Branching Support |
|----------------|-------------------|
| InMemoryMemory | ✅ Full Support |
| TDAIMemory | ❌ Not Available |
| TypeORMMemory | ✅ Full Support |
| MySQLMemory | ✅ Full Support |
| MongoDBMemory | ✅ Full Support |
| CloudBaseMemory | ✅ Full Support |

For detailed branching operations and examples, see the individual implementation documentation.

Token Management

AG-Kit provides comprehensive token management utilities for handling LLM context windows and optimizing memory usage.

Key Features

  • Accurate Tokenization: TiktokenTokenizer for precise OpenAI model token counting
  • Smart Trimming: Automatic message trimming with multiple strategies
  • Context Management: Built-in integration with all memory implementations
  • Performance Optimization: Efficient token operations with resource management

Basic Usage

import { TokenTrimmer, TiktokenTokenizer } from '@ag-kit/agents/storage';

// Create tokenizer and trimmer
const tokenizer = new TiktokenTokenizer('cl100k_base'); // encoding name, per the TiktokenTokenizer API
const trimmer = new TokenTrimmer(tokenizer);

// Count tokens
const tokenCount = tokenizer.countTokens('Hello, world!');

// Trim messages to fit context window
const events = await memory.list();
const trimmed = trimmer.trimMessages(events, 4000, 'newest_first');
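The newest_first strategy can be sketched in a few lines: walk the list from the end, accumulating token costs, and stop once the budget is exceeded. This is an illustrative stand-in (trimNewestFirst and the crude character-based counter are hypothetical; AG-Kit's TokenTrimmer additionally accounts for ChatML overhead):

```typescript
interface Msg { content: string }

// Crude stand-in counter: roughly 1 token per 4 characters.
function countTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the most recent messages whose combined cost fits under maxTokens.
function trimNewestFirst(messages: Msg[], maxTokens: number): Msg[] {
  const kept: Msg[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = countTokens(messages[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```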

Long-Term Memory

Long-term memory intelligently extracts and stores important information from conversations to build persistent user profiles, preferences, and knowledge bases.

Key Features

  • Intelligent Extraction: Automatically identifies important information
  • Semantic Search: Advanced similarity-based retrieval
  • Memory Consolidation: Deduplication and relationship mapping
  • Strategy-based Organization: Categorize memories by type and purpose

Available Implementations

Basic Usage

import { Mem0LongTermMemory } from '@ag-kit/agents/storage';

const longTermMemory = new Mem0LongTermMemory({
  apiKey: 'your-mem0-api-key',
  userId: 'user-123'
});

// Record memory
await longTermMemory.record({
  id: 'mem-1',
  strategy: 'user_preferences',
  content: 'User prefers dark mode and minimal UI',
  metadata: { confidence: 0.9 },
  createdAt: new Date()
});

// Semantic search
const results = await longTermMemory.semanticSearch('UI preferences');

For detailed long-term memory documentation, see the Long-term Memory Guide.
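To show the shape of query → scored results that semantic search returns, here is a toy stand-in that scores by keyword overlap rather than real embeddings (keywordScore and search are hypothetical illustrations, not Mem0 or AG-Kit code; the entity shape is a trimmed-down MemoryEntity):

```typescript
interface MemoryEntity {
  id: string;
  strategy: string;
  content: string;
  metadata: Record<string, any>;
  createdAt: Date;
}

// Fraction of query words that appear in the content (toy relevance score).
function keywordScore(query: string, content: string): number {
  const queryWords = new Set(query.toLowerCase().split(/\s+/));
  const contentWords = new Set(content.toLowerCase().split(/\s+/));
  let hits = 0;
  for (const w of queryWords) if (contentWords.has(w)) hits++;
  return queryWords.size === 0 ? 0 : hits / queryWords.size;
}

// Score, filter by threshold, and return best matches first.
function search(memories: MemoryEntity[], query: string, threshold = 0.5): MemoryEntity[] {
  return memories
    .map(m => ({ m, score: keywordScore(query, m.content) }))
    .filter(({ score }) => score >= threshold)
    .sort((a, b) => b.score - a.score)
    .map(({ m }) => m);
}
```

A real backend replaces keywordScore with embedding similarity, but the threshold/limit/ranking pipeline looks the same.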

Core Data Structures

interface MemoryEntity {
  id: string;
  strategy: string;                    // Memory strategy/category for organization
  role?: 'user' | 'assistant';         // Optional role information
  content: string;                     // Memory content text
  metadata: Record<string, any>;       // Additional context and metadata
  createdAt: Date;                     // Creation timestamp
  updatedAt?: Date;                    // Last update timestamp
}

interface MemoryQuery {
  query?: string;                      // Semantic query text for search
  strategy?: string | string[];        // Strategy filter(s)
  limit?: number;                      // Maximum results to return
  offset?: number;                     // Pagination offset
  threshold?: number;                  // Similarity threshold (0-1)
  filters?: Record<string, any>;       // Additional filters
  orderBy?: Record<string, 'asc' | 'desc'>; // Sorting options
  [key: string]: any;                  // Extensible for implementation-specific options
}
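As an illustration of how a backend might apply the MemoryQuery fields above (a hypothetical in-memory sketch; real implementations push strategy filters and pagination into their query engine):

```typescript
interface Entity { id: string; strategy: string; content: string }

interface Query {
  strategy?: string | string[]; // single strategy or a list of them
  limit?: number;               // maximum results
  offset?: number;              // pagination offset
}

function applyQuery(entities: Entity[], q: Query): Entity[] {
  // Normalize the strategy filter to an array (or null for "no filter").
  const strategies = q.strategy === undefined ? null
    : Array.isArray(q.strategy) ? q.strategy : [q.strategy];
  const filtered = strategies
    ? entities.filter(e => strategies.includes(e.strategy))
    : entities;
  // Apply offset/limit pagination on the filtered set.
  const start = q.offset ?? 0;
  const end = q.limit === undefined ? undefined : start + q.limit;
  return filtered.slice(start, end);
}
```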

Base Interface

abstract class BaseLongTermMemory {
  // Core CRUD operations
  abstract record(memory: MemoryEntity): Promise<void>;
  abstract recordBatch(memories: MemoryEntity[]): Promise<void>;
  abstract retrieve(query: MemoryQuery): Promise<MemoryEntity[]>;
  abstract delete(memoryId: string | MemoryQuery): Promise<void>;
  abstract update(memoryId: string, updates: Partial<MemoryEntity>): Promise<void>;
  abstract clear(strategy?: string): Promise<void>;

  // Advanced features
  abstract extractAndRecord(messages: Message[], context: Record<string, any>): Promise<MemoryEntity[]>;
  abstract semanticSearch(query: string, options?: MemoryQuery): Promise<MemoryEntity[]>;
  abstract getRelatedMemories(memoryId: string, depth?: number): Promise<MemoryEntity[]>;
  abstract consolidate(): Promise<void>;
}

API Reference

BaseMemory (Abstract Class)

The foundation class for all memory implementations with built-in context engineering. All memory implementations inherit from this class and provide consistent interface methods.

Constructor

  • config (BaseMemoryConfig): Configuration for memory and context engineering

Core Interface Methods

All memory implementations inherit these methods from BaseMemory for consistent memory management across different storage backends.

Basic Operations

list(options?): Promise<IMemoryEvent[]>
Retrieve events with filtering and pagination support.
  • options (ListOptions, optional): Filtering and pagination parameters
  • Returns Promise<IMemoryEvent[]>: Array of memory events matching the criteria

add(event, options?): Promise<void>
Add a single event to memory storage.
  • event (IMemoryEvent, required): Memory event to store
  • options (AddOptions, optional): Parameters for event storage

addList(events, options?): Promise<void>
Add multiple events efficiently in a batch operation.
  • events (IMemoryEvent[], required): Array of memory events to store
  • options (AddOptions, optional): Parameters for batch storage

delete(idOrIndex, options?): Promise<void>
Delete a specific event from memory.
  • idOrIndex (string | number, required): Event ID or index to delete
  • options (DeleteOptions, optional): Parameters for deletion

retrieve(query, options?): Promise<IMemoryEvent[]>
Search events by content using similarity matching.
  • query (string, required): Search query text for content matching
  • options (RetrieveOptions, optional): Search parameters
  • Returns Promise<IMemoryEvent[]>: Array of matching events with relevance scores

clear(options?): Promise<void>
Clear events from storage with optional filtering.
  • options (ClearOptions, optional): Parameters for selective clearing

getCount(options?): Promise<number>
Get the total number of events in storage.
  • options (CountOptions, optional): Parameters for counting
  • Returns Promise<number>: Total number of events matching the criteria

isEmpty(options?): Promise<boolean>
Check if memory storage is empty.
  • options (CountOptions, optional): Parameters for emptiness check
  • Returns Promise<boolean>: True if no events exist matching the criteria

Branching Methods (where supported)

Available in implementations that support conversation branching (InMemoryMemory, TypeORMMemory, MySQLMemory, MongoDBMemory, CloudBaseMemory).

branch(name, fromEventId?): Promise<void>
Create a new conversation branch for experimentation.
  • name (string, required): Unique name for the new branch
  • fromEventId (string, optional): Event ID to branch from (defaults to current HEAD)

checkout(target, options?): Promise<void>
Switch to a different branch or checkout a specific event.
  • target (string, required): Branch name or event ID to checkout
  • options (CheckoutOptions, optional): Checkout parameters

listBranches(): Promise<string[]>
List all available branches in the memory storage.
  • Returns Promise<string[]>: Array of branch names

deleteBranch(name): Promise<void>
Delete a specific branch and all its events.
  • name (string, required): Name of the branch to delete

cleanupBranches(keep?): Promise<void>
Remove unused branches to optimize storage.
  • keep (string[], optional): Branch names to preserve (defaults to the current branch)

Context Engineering Methods

isCompacted(event: IMemoryEvent): boolean
Check if an event is compacted.
  • event (IMemoryEvent, required): Event to check for compaction
  • Returns boolean: True if the event has compaction metadata

decompressEvent(event: IMemoryEvent): IMemoryEvent
Decompress a compacted event to restore original content.
  • event (IMemoryEvent, required): Compacted event to decompress
  • Returns IMemoryEvent: Decompressed event with original content and state

decompressEvents(events: IMemoryEvent[]): IMemoryEvent[]
Decompress multiple compacted events.
  • events (IMemoryEvent[], required): Array of events to decompress
  • Returns IMemoryEvent[]: Array of decompressed events

getMetrics(): object
Get performance metrics for context management operations.
  • Returns object: Metrics object containing performance statistics

Long-Term Memory API

Mem0LongTermMemory

Constructor

constructor(config: Mem0LongTermMemoryConfig)
  • config (Mem0LongTermMemoryConfig, required): Mem0 configuration object

Methods

record(memory: MemoryEntity): Promise<void>
Records a new memory entity to the Mem0 service with intelligent processing.
  • memory (MemoryEntity, required): Complete memory entity with content and metadata

semanticSearch(query: string, options?: MemoryQuery): Promise<MemoryEntity[]>
Performs advanced semantic search using Mem0’s AI-powered similarity matching.
  • query (string, required): Search query text for semantic matching
  • options (MemoryQuery, optional): Search configuration options
  • Returns Promise<MemoryEntity[]>: Array of matching memories with relevance scores

extractAndRecord(messages: Message[], context: Record<string, any>): Promise<MemoryEntity[]>
Automatically extracts and records important memories from conversation messages using AI.
  • messages (Message[], required): Array of conversation messages to analyze
  • context (Record<string, any>, required): Additional context for memory extraction
  • Returns Promise<MemoryEntity[]>: Array of extracted and recorded memories

Token Management API

TiktokenTokenizer

Constructor

constructor(encodingName?: TiktokenEncoding)
  • encodingName (TiktokenEncoding, default: 'o200k_base'): Tiktoken encoding name. Supported encodings: 'o200k_base', 'cl100k_base', 'p50k_base', 'r50k_base', 'gpt2'

Methods

encode(text: string): Uint32Array
Encodes text into a token array using tiktoken.
  • text (string, required): Text to encode into tokens
  • Returns Uint32Array: Array of token IDs

decode(tokens: Uint32Array): string
Decodes tokens back to text.
  • tokens (Uint32Array, required): Array of token IDs to decode
  • Returns string: Decoded text string

countTokens(text: string): number
Counts the number of tokens in text.
  • text (string, required): Text to count tokens for
  • Returns number: Number of tokens in the text

free(): void
Frees the encoding resources.

TokenTrimmer

Constructor

constructor(tokenizer?: ITokenizer)

Methods

countMessageTokens(message: { role: string; content: string }): number
Calculates the token count for a message, including formatting overhead.
  • message ({ role: string; content: string }, required): Message object to count tokens for
  • Returns number: Total token count including role and formatting overhead

trimMessages<T>(events: T[], maxTokens: number, strategy?: 'newest_first' | 'oldest_first'): T[]
Trims a message list to fit within token limits using the specified strategy.
  • events (T[], required): Array of events/messages to trim
  • maxTokens (number, required): Maximum token limit for the trimmed list
  • strategy ('newest_first' | 'oldest_first', default: 'newest_first'): Which messages to keep when exceeding limits
  • Returns T[]: Trimmed array of events within the token limit

Trimming behavior:
  • newest_first: Keeps the most recent messages, trims older ones
  • oldest_first: Keeps the earliest messages, trims newer ones
  • Single messages exceeding the limit are truncated with a '…' suffix
  • Accounts for ChatML formatting overhead (4 tokens per message)

free(): void
Frees tokenizer resources.

SimpleTokenizer

Constructor

constructor(tokensPerChar?: number)
  • tokensPerChar (number, default: 0.25): Estimated tokens-per-character ratio

Methods

encode(text: string): Uint32Array
Simple character-based encoding (each character becomes a token).
  • text (string, required): Text to encode
  • Returns Uint32Array: Array of character codes as tokens

decode(tokens: Uint32Array): string
Decodes character-based tokens back to text.
  • tokens (Uint32Array, required): Token array to decode
  • Returns string: Decoded text string

countTokens(text: string): number
Estimates the token count based on character length and ratio.
  • text (string, required): Text to count tokens for
  • Returns number: Estimated number of tokens
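The character-based approach above is simple enough to re-sketch in full. This is an illustrative stand-in (CharTokenizer is a hypothetical name, not AG-Kit's source): encode maps each character to its code point, and countTokens multiplies the text length by the tokensPerChar ratio.

```typescript
class CharTokenizer {
  constructor(private tokensPerChar = 0.25) {}

  // Each character becomes one "token" (its Unicode code point).
  encode(text: string): Uint32Array {
    return Uint32Array.from(text, ch => ch.codePointAt(0)!);
  }

  decode(tokens: Uint32Array): string {
    return String.fromCodePoint(...tokens);
  }

  // Rough estimate: characters × tokensPerChar, rounded up.
  countTokens(text: string): number {
    return Math.ceil(text.length * this.tokensPerChar);
  }
}
```

The estimate is deliberately crude; it is useful as a dependency-free fallback when tiktoken's WASM bindings are unavailable.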

Complete Usage Examples

Multi-Session Chat Application

import { InMemoryMemory, Mem0LongTermMemory } from '@ag-kit/agents/storage';

class ChatApplication {
  private shortTermMemory: InMemoryMemory;
  private longTermMemory: Mem0LongTermMemory;

  constructor() {
    this.shortTermMemory = new InMemoryMemory();
    this.longTermMemory = new Mem0LongTermMemory({
      apiKey: process.env.MEM0_API_KEY!,
      userId: 'default-user'
    });
  }

  async handleMessage(sessionId: string, userId: string, message: string) {
    // Store in short-term memory
    await this.shortTermMemory.add({
      message: {
        id: `msg-${Date.now()}`,
        role: 'user',
        content: message,
        timestamp: new Date()
      },
      state: { userId, sessionId, context: 'chat' }
    }, { sessionId });

    // Get conversation context (last 10 messages, max 4000 tokens)
    const context = await this.shortTermMemory.list({
      sessionId,
      limit: 10,
      order: 'desc',
      maxTokens: 4000
    });

    // Search long-term memory for relevant information
    const relevantMemories = await this.longTermMemory.semanticSearch(message, {
      threshold: 0.7,
      limit: 5
    });

    return context;
  }
}

Next Steps