Skip to main content
Short-term memory in AG-Kit manages conversation history and context within active sessions. It provides efficient storage and retrieval of recent messages, enabling agents to maintain coherent conversations with context awareness.

Overview

Short-term memory (also called session memory or conversation memory) stores:
  • Recent conversation messages - User and assistant exchanges
  • Tool call history - Records of tool invocations and results
  • Session state - Custom metadata and context information
  • Temporal context - Time-based message ordering and filtering

Memory Implementations

AG-Kit provides multiple short-term memory implementations for different use cases and deployment scenarios:

InMemoryMemory

Volatile in-memory storage ideal for development, testing, and single-instance deployments. Features:
  • Fast read/write operations
  • Content-based similarity search
  • Token-aware message trimming
  • Multi-session support
  • Zero external dependencies
Use Cases:
  • Development and testing
  • Single-server applications
  • Temporary sessions
  • Prototyping

TDAIMemory

Cloud-based persistent storage with production-grade scalability through TDAI services. Features:
  • Persistent cloud storage
  • Advanced semantic search
  • Distributed session management
  • Optional local caching
  • Production-ready reliability
Use Cases:
  • Production deployments
  • Multi-server applications
  • Long-running sessions
  • Enterprise applications

TypeORMMemory

Flexible ORM-based storage supporting multiple databases with customizable schema. Features:
  • Multi-database support (MySQL, PostgreSQL, SQLite, etc.)
  • Custom entity definitions
  • Document conversion system
  • Branch and summary architecture
  • TypeScript type safety
  • Migration support
Use Cases:
  • Custom database schemas
  • Enterprise database integration
  • Complex data relationships
  • Type-safe development

MySQLMemory

Optimized MySQL implementation extending TypeORMMemory with MySQL-specific features. Features:
  • MySQL-specific optimizations
  • Connection pooling
  • Transaction support
  • Index optimization
  • Performance monitoring
Use Cases:
  • MySQL-based applications
  • High-performance requirements
  • Enterprise MySQL deployments

MongoDBMemory

NoSQL document storage with flexible schema and horizontal scaling capabilities. Features:
  • Document-based storage
  • Flexible schema design
  • Auto-connection setup - No manual client creation required
  • Connection string configuration
  • Horizontal scaling
  • Aggregation pipelines
  • GridFS support for large data
  • Connection pooling and options
Use Cases:
  • NoSQL applications
  • Flexible data structures
  • Horizontal scaling needs
  • Document-oriented workflows
  • Rapid prototyping with minimal setup

CloudBaseMemory

Tencent CloudBase integration for serverless cloud storage. Features:
  • Serverless architecture
  • Automatic scaling
  • Tencent Cloud integration
  • Real-time synchronization
  • Built-in security
Use Cases:
  • Tencent Cloud deployments
  • Serverless applications
  • Chinese market applications
  • Automatic scaling needs

Quick Start

Basic Usage with InMemoryMemory

import { InMemoryMemory } from '@ag-kit/agents';

// Create memory instance
const memory = new InMemoryMemory();

// Add a conversation event
await memory.add({
  message: {
    id: 'msg-1',
    role: 'user',
    content: 'What is the weather today?',
    timestamp: new Date()
  },
  state: { userId: 'user-123', location: 'San Francisco' }
});

// Add assistant response
await memory.add({
  message: {
    id: 'msg-2',
    role: 'assistant',
    content: 'The weather in San Francisco is sunny, 72°F.',
    timestamp: new Date()
  },
  state: { userId: 'user-123' }
});

// Retrieve conversation history
const events = await memory.list({ limit: 10 });
console.log(events);
// [
//   { message: { id: 'msg-1', role: 'user', content: '...' }, state: {...} },
//   { message: { id: 'msg-2', role: 'assistant', content: '...' }, state: {...} }
// ]

Using TDAIMemory for Production

import { TDAIMemory } from '@ag-kit/agents';

// Create TDAI memory instance
const memory = new TDAIMemory({
  sessionId: 'user-session-123',
  clientOptions: {
    apiKey: process.env.TDAI_API_KEY!,
    endpoint: 'https://api.tdai.example.com'
  },
  useCache: true // Enable local caching for better performance
});

// Add conversation events
await memory.add({
  message: {
    id: 'msg-1',
    role: 'user',
    content: 'Tell me about AG-Kit',
    timestamp: new Date()
  },
  state: { source: 'web-chat' }
});

// List recent messages
const recentMessages = await memory.list({
  limit: 20,
  order: 'desc' // Most recent first
});

Using TypeORMMemory with Custom Schema

from agkit.agents import TypeORMMemory
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# Create database connection
engine = create_engine('mysql://agkit_user:password@localhost:3306/agkit_memory')
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

# Create TypeORM memory instance
memory = TypeORMMemory(
    engine=engine,
    session_factory=SessionLocal,
    session_id='user-session-123',
    event_table_name='custom_memory_events',
    state_table_name='custom_memory_state',
    summary_table_name='custom_memory_summaries',
    enable_context_management=True
)

# Use with branch and summary support
await memory.add({
    'message': {
        'id': 'msg-1',
        'role': 'user',
        'content': 'Hello with custom schema',
        'timestamp': datetime.now()
    },
    'state': {'custom_field': 'value'}
})

Using MongoDBMemory

from agkit.agents import MongoDBMemory

# Recommended: Direct configuration (no manual client setup)
memory = MongoDBMemory(
    connection_string='mongodb://localhost:27017',
    database_name='agkit_memory',
    session_id='user-session-123',
    collection_name='memory_events',
    state_collection_name='memory_state',
    summary_collection_name='memory_summaries',
    enable_context_management=True
)

# Advanced configuration with connection options
memory2 = MongoDBMemory(
    connection_string='mongodb://username:password@localhost:27017',
    database_name='agkit_memory',
    session_id='user-session-456',
    client_options={
        'maxPoolSize': 10,
        'serverSelectionTimeoutMS': 5000,
        'socketTimeoutMS': 45000,
    },
    enable_context_management=True
)

# Alternative: Use existing MongoDB connection (if needed)
from pymongo import MongoClient
client = MongoClient('mongodb://localhost:27017')
db = client.agkit_memory

memory3 = MongoDBMemory(
    db=db,
    session_id='user-session-789',
    collection_name='memory_events'
)

# Add events with flexible document structure
await memory.add({
    'message': {
        'id': 'msg-1',
        'role': 'user',
        'content': 'MongoDB flexible storage',
        'timestamp': datetime.now(),
        'metadata': {
            'source': 'mobile-app',
            'version': '1.2.0'
        }
    },
    'state': {
        'user_id': 'user-123',
        'preferences': {
            'theme': 'dark',
            'language': 'en'
        }
    }
})

Using CloudBaseMemory

# CloudBase integration is primarily available in TypeScript

Unified Architecture

Branch and Summary System

All memory implementations now support a unified branch and summary architecture that enables:
  • Conversation Branching: Create alternative conversation paths for experimentation
  • Automatic Summarization: Compress long conversations while preserving context
  • Context Engineering: Intelligent token management and context window optimization (Learn more)
  • State Management: Persistent session state across branches and summaries
# Create a new branch from a specific event
branch_path = await memory.create_branch('experiment-1', 'msg-10')

# Switch to the branch (checkout)
await memory.checkout('experiment-1')

# Add messages to the current branch
await memory.add({
    'message': {
        'id': 'msg-11-alt',
        'role': 'assistant',
        'content': 'Alternative response for testing',
        'timestamp': datetime.now()
    }
})

# List all branches with metadata
branches = await memory.list_branches()
print(branches)

# Switch back to main branch
await memory.checkout('main')

Custom Entity and Document Conversion

TypeORM-based implementations support custom entity definitions and document conversion for flexible schema design:
from sqlalchemy import Column, String, JSON, Integer, DateTime
from sqlalchemy.ext.declarative import declarative_base
from agkit.agents import TypeORMMemory

Base = declarative_base()

# Define custom entity
class CustomMemoryEvent(Base):
    __tablename__ = 'custom_memory_events'
    
    id = Column(String, primary_key=True)
    session_id = Column(String, nullable=False)
    message = Column(JSON, nullable=False)
    state = Column(JSON, nullable=False)
    branch_path = Column(String, nullable=False)
    created_at = Column(DateTime, nullable=False)
    
    # Custom fields
    priority = Column(Integer, default=0)
    tags = Column(JSON, default=list)
    category = Column(String, default='general')

# Custom document converter
class CustomDocumentConverter:
    def to_document(self, event, session_id, branch_path):
        return {
            'session_id': session_id,
            'message': event['message'],
            'state': event['state'],
            'branch_path': branch_path,
            'created_at': datetime.now(),
            # Custom fields
            'priority': event['state'].get('priority', 0),
            'tags': event['state'].get('tags', []),
            'category': event['state'].get('category', 'general')
        }
    
    def from_document(self, doc):
        return {
            'message': doc['message'],
            'state': {
                **doc['state'],
                'priority': doc['priority'],
                'tags': doc['tags'],
                'category': doc['category']
            }
        }

# Use custom entity and converter
memory = TypeORMMemory(
    engine=engine,
    session_id='user-123',
    custom_entity=CustomMemoryEvent,
    document_converter=CustomDocumentConverter()
)

Core Operations

Adding Events

Add single or multiple conversation events to memory.
# Add single event
await memory.add({
    'message': {
        'id': 'msg-1',
        'role': 'user',
        'content': 'Hello, how are you?',
        'timestamp': datetime.now()
    },
    'state': {'mood': 'friendly'}
})

# Add multiple events efficiently
await memory.add_list([
    {
        'message': {'id': 'msg-1', 'role': 'user', 'content': 'First message'},
        'state': {}
    },
    {
        'message': {'id': 'msg-2', 'role': 'assistant', 'content': 'Response'},
        'state': {}
    }
])

Listing Events

Retrieve events with filtering, pagination, and token limiting.
# Get all events
all_events = await memory.list()

# Get recent 10 events
recent_events = await memory.list(limit=10)

# Get events with pagination
page2 = await memory.list(limit=10, offset=10)

# Get events in descending order (newest first)
newest_first = await memory.list(order='desc')

# Limit by token count (useful for LLM context windows)
token_limited = await memory.list(max_tokens=2000)
# Returns events that fit within 2000 tokens

Searching Events

Search for events based on content similarity.
# Search for events containing specific content
results = await memory.retrieve('weather forecast')
# Returns events sorted by relevance score

# InMemoryMemory uses content-based similarity scoring
# TDAIMemory uses semantic search capabilities

Deleting Events

Remove specific events from memory.
# Delete by message ID
await memory.delete('msg-1')

# Delete by index (InMemoryMemory only)
await memory.delete(0)  # Delete first event

# Note: TDAIMemory only supports deletion by ID

Clearing Memory

Remove all events from storage.
# Clear all events
await memory.clear()

# Check if memory is empty
is_empty = await memory.is_empty()  # True

# Get event count
count = await memory.get_count()  # 0

Multi-Session Support

Both memory implementations support multiple isolated sessions.
memory = InMemoryMemory()

# Add events to different sessions
await memory.add(
    {'message': {'id': '1', 'role': 'user', 'content': 'Hello'}, 'state': {}},
    session_id='session-A'
)

await memory.add(
    {'message': {'id': '2', 'role': 'user', 'content': 'Hi there'}, 'state': {}},
    session_id='session-B'
)

# List events from specific session
session_a_events = await memory.list(session_id='session-A')
session_b_events = await memory.list(session_id='session-B')

# Get all session IDs (InMemoryMemory only)
session_ids = memory.get_session_ids()  # ['session-A', 'session-B']

# Check if session exists (InMemoryMemory only)
has_session = memory.has_session('session-A')  # True

# Clear specific session
await memory.clear(session_id='session-A')

Token Management

AG-Kit provides automatic token counting and trimming to manage LLM context windows.
from agkit.agents import InMemoryMemory, TiktokenTokenizer

# Use default tokenizer (Tiktoken)
memory = InMemoryMemory()

# Or provide custom tokenizer
custom_tokenizer = TiktokenTokenizer('gpt-4')
memory_with_custom = InMemoryMemory(custom_tokenizer)

# Retrieve events within token limit
events = await memory.list(max_tokens=4000)
# Automatically trims oldest messages to fit within 4000 tokens

Integration with Agents

Short-term memory integrates seamlessly with AI agents for automatic context management. Learn more: Complete Agent Integration Guide

Session Branching

Session branching enables conversation experimentation and time-travel capabilities. Create experimental conversation paths, test different responses, and rollback to previous states.
Session branching is currently supported by InMemoryMemory. Other implementations will throw an error if branching methods are called.

Creating Branches

Create experimental conversation paths without losing the original:
from agkit.agents import InMemoryMemory

memory = InMemoryMemory()

# Add conversation history
await memory.add({
    'message': {'id': 'msg-1', 'role': 'user', 'content': 'Help me write an email'},
    'state': {}
})
await memory.add({
    'message': {'id': 'msg-2', 'role': 'assistant', 'content': 'I can help with that!'},
    'state': {}
})

# Create a branch
branch_id = await memory.branch('polite-version')
print(f'Created branch: {branch_id}')

# Switch to the branch
await memory.checkout('polite-version')

# Add experimental content
await memory.add({
    'message': {
        'id': 'msg-3',
        'role': 'assistant',
        'content': 'I would be delighted to assist you with composing your email!'
    },
    'state': {}
})

# List branches
branches = await memory.list_branches()
print(branches)

Time Travel with Event Checkout

Checkout to a specific event and delete all events after it:
# Add multiple messages
await memory.add({'message': {'id': 'msg-1', 'role': 'user', 'content': 'Hello'}, 'state': {}})
await memory.add({'message': {'id': 'msg-2', 'role': 'assistant', 'content': 'Hi!'}, 'state': {}})
await memory.add({'message': {'id': 'msg-3', 'role': 'user', 'content': 'How are you?'}, 'state': {}})
await memory.add({'message': {'id': 'msg-4', 'role': 'assistant', 'content': 'Good!'}, 'state': {}})

# Checkout to msg-2 (deletes msg-3 and msg-4)
await memory.checkout('msg-2', type='event')

events = await memory.list()
print(len(events))  # 2 (only msg-1 and msg-2 remain)

Advanced Branch Operations

# A/B Testing Different Responses
await memory.create_branch('friendly-agent', 'msg-5')
await memory.create_branch('professional-agent', 'msg-5')

# Test friendly version
await memory.checkout('friendly-agent')
await memory.add({
    'message': {'id': 'resp-1', 'role': 'assistant', 'content': 'Hey! Sure thing!'},
    'state': {}
})

# Test professional version
await memory.checkout('professional-agent')
await memory.add({
    'message': {'id': 'resp-2', 'role': 'assistant', 'content': 'Certainly, I can assist.'},
    'state': {}
})

# Compare results and choose the better branch
friendly_branch = await memory.list_branches()
await memory.checkout('friendly-agent')  # Keep this one

# Undo/Rollback to specific event
await memory.checkout('msg-3')  # Rollback to message ID

# Conversation Checkpoints
await memory.add(important_message1)
checkpoint1 = await memory.create_branch('checkpoint-1')

await memory.add(important_message2)
checkpoint2 = await memory.create_branch('checkpoint-2')

# Later, return to a checkpoint
await memory.checkout('checkpoint-1')

Advanced Patterns

Context Window Management

Automatically manage LLM context windows with intelligent thresholds and token limits. For comprehensive context engineering strategies, see the Context Engineering Guide.
# Strategy 1: Automatic threshold-based management (Coming Soon)
# Note: Advanced threshold management is currently available in TypeScript only
# Python implementation coming soon

# Strategy 2: Fixed token limit
memory = InMemoryMemory()
events = await memory.list(max_tokens=4000)

# Strategy 3: Dynamic token allocation
model_max_tokens = 8000
reserved_for_response = 2000
available_for_context = model_max_tokens - reserved_for_response
context_events = await memory.list(max_tokens=available_for_context)

# Strategy 4: Sliding window with recent messages
recent_events = await memory.list(
    limit=20,
    order='desc',
    max_tokens=3000
)
Context Management Strategies:
  1. Automatic Thresholds (Recommended) - Intelligent compression with configurable thresholds
  2. Fixed Token Limits - Simple hard limits using maxTokens
  3. Dynamic Token Allocation - Calculated limits based on model capacity
  4. Sliding Window - Recent messages with token constraints
How Automatic Thresholds Work:
  • Compaction Phase: At 80% of preRotThreshold → Remove redundant/duplicate content
  • Summarization Phase: At 95% of preRotThreshold → Compress older messages into summaries
  • Preservation: Recent messages (specified by recentToKeep) are always kept in full
  • Complementary: Can be used together with maxTokens for additional protection
For detailed context engineering patterns and advanced threshold strategies, see the Context Engineering Documentation.

Tool Call History

Store and retrieve tool invocation history.
# Add tool call event
await memory.add({
    'message': {
        'id': 'msg-3',
        'role': 'assistant',
        'content': 'Let me check the weather for you.',
        'tool_calls': [{
            'id': 'call-1',
            'type': 'function',
            'function': {
                'name': 'get_weather',
                'arguments': '{"location": "San Francisco"}'
            }
        }]
    },
    'state': {}
})

# Add tool result event
await memory.add({
    'message': {
        'id': 'msg-4',
        'role': 'tool',
        'content': '{"temperature": 72, "condition": "sunny"}',
        'tool_call_id': 'call-1'
    },
    'state': {}
})

Custom State Management

Store custom metadata with each event.
await memory.add({
    'message': {
        'id': 'msg-1',
        'role': 'user',
        'content': 'Book a flight to Tokyo',
    },
    'state': {
        'user_id': 'user-123',
        'intent': 'flight_booking',
        'entities': {
            'destination': 'Tokyo',
            'travel_class': 'economy'
        },
        'confidence': 0.95,
        'timestamp': datetime.now().isoformat()
    }
})

# Retrieve and access custom state
events = await memory.list()
print(events[0]['state']['intent'])  # 'flight_booking'
print(events[0]['state']['entities'])  # {'destination': 'Tokyo', ...}

Best Practices

1. Choose the Right Implementation

  • Use InMemoryMemory for development, testing, and single-instance applications
  • Use Others for production, distributed systems, and persistent storage needs

2. Manage Token Limits

Always consider LLM context window limits:
# Good: Limit tokens to fit model context
events = await memory.list(max_tokens=4000)

# Bad: Loading all events without limit
events = await memory.list()  # May exceed context window

3. Use Session IDs Consistently

Maintain session isolation for multi-user applications:
# Good: Use consistent session IDs
session_id = f"user-{user_id}-{conversation_id}"
await memory.add(event, AddOptions(session_id=session_id))

# Bad: Mixing sessions
await memory.add(event)  # Uses default session

4. Clean Up Old Sessions

Regularly clear inactive sessions to manage memory:
# Clear specific session after conversation ends
await memory.clear(ClearOptions(session_id='session-123'))

# Or clear all sessions periodically
await memory.clear()

5. Handle Errors Gracefully

try:
    await memory.add(event)
except Exception as error:
    print(f'Failed to add event: {error}')
    # Implement fallback or retry logic

Performance Considerations

InMemoryMemory

  • Fast: All operations are in-memory
  • Scalable: Handles thousands of events efficiently
  • Limitation: Data lost on process restart
  • Memory Usage: Grows with event count

TDAIMemory

  • Persistent: Data survives restarts
  • Distributed: Works across multiple servers
  • Caching: Optional local cache for better performance
  • Network: Requires network calls to TDAI service

Implementation Comparison

FeatureInMemoryTDAITypeORMMySQLMongoDBCloudBase
Storage TypeVolatileCloudDatabaseDatabaseNoSQLServerless
Persistence
PerformanceFastestFastFastFastestFastGood
ScalabilitySingleHighMediumHighVery HighAuto
Custom Schema
Branch Support
Summary Support
Multi-Database
Setup ComplexityNoneAPI KeyMediumMediumVery LowMedium
Best ForDev/TestProductionEnterpriseMySQL AppsNoSQL AppsTencent Cloud

Next Steps