The Context Degradation Problem
As AI conversations grow longer, they face a fundamental challenge: context degradation. This manifests as:
- Performance Degradation: Token limits, attention dilution, processing latency, cost escalation
- Information Overload: Signal vs noise, recency bias, context switching, memory fragmentation
The AG-Kit Solution
AG-Kit implements a three-tier context management strategy inspired by Manus’s research:
Context Size
│
│  ┌───────────────────┐
│  │ Hard Limit        │  (1M tokens)
│  └───────────────────┘
│  ┌───────────────────┐
│  │ Pre-rot Threshold │  (150K tokens)
│  └───────────────────┘
│  ┌───────────────────┐
│  │ Summ. Trigger     │  (142.5K = 95%)
│  └───────────────────┘
│  ┌───────────────────┐
│  │ Compact. Trigger  │  (120K = 80%)
│  └───────────────────┘
│  ┌───────────────────┐
│  │ Normal Operation  │
│  └───────────────────┘
└───────────────────────────────────────> Time
Three-Tier Strategy
- 🟢 Normal Operation (0-80%): Store all messages in full detail
- 🟡 Reversible Compaction (80-95%): Compress old messages while preserving reconstruction ability
- 🔴 Structured Summarization (95%+): Create structured summaries for dramatic token reduction
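The trigger points are derived from the pre-rot threshold: with the default 150K tokens, compaction starts at 80% (120K) and summarization at 95% (142.5K). Here is a minimal sketch of that tier selection, assuming a plain thresholds dict like the one shown in Basic Setup below; select_tier is illustrative, not AG-Kit's internal API:
# Illustrative sketch only: maps the current token count to a tier.
def select_tier(context_tokens: int, thresholds: dict) -> str:
    pre_rot = thresholds['pre_rot_threshold']                         # e.g. 150_000
    compaction_at = pre_rot * thresholds['compaction_trigger']        # 0.80 -> 120_000
    summarization_at = pre_rot * thresholds['summarization_trigger']  # 0.95 -> 142_500

    if context_tokens >= summarization_at:
        return 'structured_summarization'  # 🔴 Tier 3
    if context_tokens >= compaction_at:
        return 'reversible_compaction'     # 🟡 Tier 2
    return 'normal_operation'              # 🟢 Tier 1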
How It Works
Reversible Compaction (🟡 Tier 2)
Compaction stores the full original content in event.state.__compaction__ while replacing the message content with a compact reference:
// Original event
{
  "message": {
    "id": "msg-1",
    "content": "I need help implementing a user authentication system with JWT tokens, password hashing, and role-based access control...",
    "role": "user"
  },
  "state": { "userGoal": "implement auth", "complexity": "medium" }
}

// After compaction
{
  "message": {
    "id": "msg-1",
    "content": "[COMPACTED: User auth implementation request]",
    "role": "user"
  },
  "state": {
    "userGoal": "implement auth",
    "complexity": "medium",
    "__compaction__": {
      "original_content": "I need help implementing a user authentication system with JWT tokens, password hashing, and role-based access control...",
      "tokens_saved": 45
    }
  }
}
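Because the full text is preserved under __compaction__, the compaction step can be undone whenever the original wording is needed again. A minimal sketch of such a restore, assuming events are plain dicts as above; restore_event is a hypothetical helper, not part of the AG-Kit API:
# Hypothetical helper: reverse compaction for a single event dict.
def restore_event(event: dict) -> dict:
    compaction = event.get('state', {}).get('__compaction__')
    if compaction:
        # Put the original text back and drop the compaction metadata.
        event['message']['content'] = compaction['original_content']
        del event['state']['__compaction__']
    return event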
Structured Summarization (🔴 Tier 3)
Instead of free-form text summaries, the system creates structured summaries with specific fields:
StructuredSummary Interface
interface StructuredSummary {
  count: number;
  timeRange: { start?: Date; end?: Date };
  timestamp: Date;
}

{
  count: 1200,
  timeRange: {
    start: new Date('2024-01-15T10:00:00Z'),
    end: new Date('2024-01-15T14:30:00Z')
  },
  timestamp: new Date('2024-01-15T14:30:00Z')
}
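The Python examples in the rest of this guide use the same shape with snake_case keys (time_range instead of timeRange). A rough TypedDict sketch of that shape, which may not match the exact definition exported by agkit.agents.storage:
from datetime import datetime
from typing import Optional, TypedDict

class TimeRange(TypedDict, total=False):
    start: Optional[datetime]
    end: Optional[datetime]

# Sketch of the StructuredSummary shape used by the Python examples below.
class StructuredSummary(TypedDict):
    count: int             # number of summarized events
    time_range: TimeRange  # timestamps of the first and last summarized events
    timestamp: datetime    # when the summary was created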
Basic Setup
from agkit.agents.storage import InMemoryMemory, TDAIMemory
# Default context engineering (recommended)
memory = InMemoryMemory()
# Disable automatic context management
manual_memory = InMemoryMemory(
    enable_context_management=False
)
# Custom configuration with context engineering
memory = InMemoryMemory(
    enable_context_management=True,  # Automatic context management (default)
    thresholds={
        'pre_rot_threshold': 150_000,
        'compaction_trigger': 0.8,
        'summarization_trigger': 0.95,
        'keep_recent_count': 10,
    }
)
Manual Context Management
When enable_context_management is disabled, you can trigger context management manually:
memory = InMemoryMemory(
    enable_context_management=False  # Disable automatic management
)
# Add events without automatic context management
await memory.add(event1, session_id='session-123')
await memory.add(event2, session_id='session-123')
await memory.add(event3, session_id='session-123')
# Manually trigger context management when needed
await memory.manage_context('session-123')
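One reasonable pattern in manual mode is to batch the writes for a single agent turn and run one context-management pass at the end. This sketch uses only the calls shown above:
# Sketch: add a batch of events, then manage context once per turn.
async def record_turn(memory, events, session_id: str) -> None:
    for event in events:
        await memory.add(event, session_id=session_id)
    # One pass after the batch instead of one pass per add.
    await memory.manage_context(session_id)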
Custom Memory Implementation
To customize the context engineering logic, either pass your own summarizer when constructing BaseMemory, or extend BaseMemory and override its compaction and summarization methods:
from agkit.agents.storage import BaseMemory, IMemoryEvent, StructuredSummary
from datetime import datetime
from typing import List
# Method 1: Pass summarizer in config
async def custom_summarizer(events: List[IMemoryEvent]) -> StructuredSummary:
    # Custom summarization logic
    return {
        'count': len(events),
        'time_range': {
            'start': events[0]['message']['timestamp'],
            'end': events[-1]['message']['timestamp']
        },
        'timestamp': datetime.now()
    }

custom_memory = BaseMemory(
    summarizer=custom_summarizer,
    thresholds={
        'pre_rot_threshold': 100_000,
        'compaction_trigger': 0.8,
        'summarization_trigger': 0.95
    }
)
# Method 2: Extend BaseMemory for advanced customization
class CustomMemory(BaseMemory):
    def __init__(self, config=None):
        merged = {**(config or {}), 'summarizer': self._custom_summarizer}
        super().__init__(**merged)

    # Override compaction logic for individual events
    async def _compact_event(self, event: IMemoryEvent) -> None:
        # Custom compaction logic - example: compress code blocks
        content = event['message']['content']
        if '```' in content:
            event['state']['__compaction__'] = {
                'original_content': content,
                'compacted_at': datetime.now().isoformat(),
                'tokens_saved': len(content) // 4,  # rough estimate: ~4 characters per token
                'strategy': 'code_compression'
            }
            event['message']['content'] = '[COMPACTED: Code discussion]'
        else:
            await super()._compact_event(event)

    # Custom summarization function
    async def _custom_summarizer(self, events: List[IMemoryEvent]) -> StructuredSummary:
        return {
            'count': len(events),
            'time_range': {
                'start': events[0]['message']['timestamp'],
                'end': events[-1]['message']['timestamp']
            },
            'timestamp': datetime.now()
        }

    # Override storage methods (optional)
    async def _store_summary(self, session_id: str, summary: StructuredSummary) -> None:
        # Custom storage logic
        print(f'Storing summary for {session_id}')

    async def _clear_summarized_events(self, session_id: str, keep_recent_count: int) -> None:
        # Custom cleanup logic
        print(f'Cleaning up events for {session_id}')
        await super()._clear_summarized_events(session_id, keep_recent_count)

    # Trigger context management after adding events
    async def add(self, event: IMemoryEvent, session_id: str = 'default') -> None:
        await super().add(event, session_id=session_id)
        await self.manage_context(session_id)
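Using the subclass then looks the same as the built-in memories; the event below is purely illustrative:
# Illustrative usage of the CustomMemory subclass defined above.
memory = CustomMemory()

await memory.add(
    {
        'message': {
            'id': 'msg-42',
            'role': 'user',
            # Contains a fenced code block, so _compact_event would apply the
            # 'code_compression' strategy when compaction triggers.
            'content': 'Can you review this snippet?\n```python\nprint("hello")\n```'
        },
        'state': {'topic': 'code_review'}
    },
    session_id='session-123'
)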
Key Implementation Points
- Context Management Control:
  - Automatic (default): Set enable_context_management=True for automatic context management after each add
  - Manual: Set enable_context_management=False and call manage_context(session_id) manually
- Two Customization Approaches:
  - Simple: Pass a summarizer function in the config when creating BaseMemory
  - Advanced: Extend the BaseMemory class for full customization
- Custom Summarizer: Provide a summarizer function that takes a list of events and returns a StructuredSummary
- Override _compact_event(): Implement custom logic for reversible compaction of individual events
- Override Storage Methods (optional): Define _store_summary() and _clear_summarized_events() for custom storage
- Preserve Metadata: Store compaction metadata in event.state.__compaction__
Automatic vs Manual: By default, both InMemoryMemory and TDAIMemory automatically trigger context management after adding events. Set enable_context_management=False if you prefer to control when context management occurs, then call manage_context(session_id) manually when needed.
Real-World Example: Long Debugging Session
Here’s how context engineering works in a real debugging session that spans 200+ messages:
# Initial problem (Normal Operation - Tier 1)
await memory.add({
    'message': {
        'role': 'user',
        'content': 'Our production database is timing out after 30 seconds, affecting 1200 users'
    },
    'state': {
        'issue': 'db_timeout',
        'severity': 'critical',
        'environment': 'production'
    }
})

# After 50 messages, context reaches 120K tokens
# 🟡 System automatically triggers Tier 2 (Compaction)
compacted_event = {
    'message': {'content': '[COMPACTED: Database timeout issue report]'},
    'state': {
        'issue': 'db_timeout',
        'severity': 'critical',
        '__compaction__': {
            'original_content': 'Our production database is timing out after 30 seconds...',
            'tokens_saved': 23,
            'compacted_at': '2024-01-15T10:30:00Z'
        }
    }
}

# After 200 messages, context reaches 142.5K tokens
# 🔴 System triggers Tier 3 (Structured Summarization)
summary = {
    'modified_files': ['database.py', 'user_service.py', 'auth_middleware.py'],
    'user_goal': 'Fix database connection timeout in production',
    'last_step': 'Updated connection pool configuration',
    'key_decisions': [
        'Increased pool size from 10 to 50',
        'Added connection retry logic',
        'Implemented circuit breaker pattern'
    ],
    'critical_context': {
        'error_pattern': 'Connection timeout after 30s',
        'environment': 'production',
        'affected_users': 1200,
        'urgency': 'critical'
    }
}
Result: The conversation can continue indefinitely while maintaining:
- ✅ Performance: Response times remain fast
- ✅ Context: Critical debugging information preserved
- ✅ Cost: Token usage reduced by 70-80%
- ✅ Quality: Recent context maintains conversation flow
Key Benefits
Context engineering solves the fundamental challenge of maintaining conversation quality as contexts grow:
🚀 Unlimited Conversation Length
- No practical limits on conversation duration
- Conversations can span hundreds or thousands of messages
- System automatically manages context without manual intervention
📈 Maintained Performance
- Response times remain fast even in long conversations
- Quality doesn’t degrade as context fills up
- Attention remains focused on relevant information
- Reversible compaction: Full recovery of compressed content
- Structured summaries: Key insights preserved in organized format
- Recent context: Always maintains conversation flow
💰 Cost Efficiency
- Dramatic reduction in token usage (70-80% savings typical)
- Lower API costs for long conversations
- Efficient resource utilization
🔧 Zero Configuration
- Works out of the box with sensible defaults
- Automatic triggers based on configurable thresholds
- Progressive strategy optimizes for information preservation
The key insight from Manus’s research: Not all context degradation is equal. By applying the right strategy at the right time (reversible compaction before irreversible summarization), we maintain conversation coherence while managing resource constraints effectively.