
The Context Degradation Problem

As AI conversations grow longer, they face a fundamental challenge: context degradation. This manifests as:
  • Performance degradation: token limits, attention dilution, processing latency, and cost escalation
  • Information overload: signal buried in noise, recency bias, context switching, and memory fragmentation

The AG-Kit Solution

AG-Kit implements a three-tier context management strategy inspired by Manus’s research:
Context Size

    │                                    ┌─────────────┐
    │                                    │ Hard Limit  │ (1M tokens)
    │                                    └─────────────┘
    │                              ┌─────────────┐
    │                              │ Pre-rot     │ (150K tokens)
    │                              │ Threshold   │
    │                              └─────────────┘
    │                        ┌──────────┐
    │                        │ Summ.    │ (142.5K = 95%)
    │                        │ Trigger  │
    │                        └──────────┘
    │                  ┌──────────┐
    │                  │ Compact. │ (120K = 80%)
    │                  │ Trigger  │
    │                  └──────────┘
    │            ┌──────────┐
    │            │ Normal   │
    │            │ Operation│
    └────────────┴──────────┴──────────────────────────────────────>
                                                                Time

Three-Tier Strategy

  1. 🟢 Normal Operation (0-80%): Store all messages in full detail
  2. 🟡 Reversible Compaction (80-95%): Compress old messages while preserving reconstruction ability
  3. 🔴 Structured Summarization (95%+): Create structured summaries for dramatic token reduction
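
A minimal sketch of how these thresholds map a running token count onto a tier (the constants mirror the thresholds in the diagram above; the select_tier helper is illustrative, not part of AG-Kit's API):
# Illustrative only: thresholds match the documented defaults
PRE_ROT_THRESHOLD = 150_000
COMPACTION_TRIGGER = 0.80       # 120K tokens
SUMMARIZATION_TRIGGER = 0.95    # 142.5K tokens

def select_tier(context_tokens: int) -> str:
    """Map the current context size to one of the three tiers."""
    usage = context_tokens / PRE_ROT_THRESHOLD
    if usage >= SUMMARIZATION_TRIGGER:
        return 'structured_summarization'   # 🔴 Tier 3
    if usage >= COMPACTION_TRIGGER:
        return 'reversible_compaction'      # 🟡 Tier 2
    return 'normal_operation'               # 🟢 Tier 1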

How It Works

Reversible Compaction (🟡 Tier 2)

Compaction stores the full original content in event.state.__compaction__ while replacing the message content with a compact reference:
// Original Event
{
  "message": {
    "id": "msg-1",
    "content": "I need help implementing a user authentication system with JWT tokens, password hashing, and role-based access control...",
    "role": "user"
  },
  "state": { "userGoal": "implement auth", "complexity": "medium" }
}

// After Compaction
{
  "message": {
    "id": "msg-1",
    "content": "[COMPACTED: User auth implementation request]",
    "role": "user"
  },
  "state": {
    "userGoal": "implement auth",
    "complexity": "medium",
    "__compaction__": {
      "originalContent": "I need help implementing a user authentication system with JWT tokens, password hashing, and role-based access control...",
      "tokensSaved": 45,
    }
  }
}
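
Because the original text is preserved verbatim under __compaction__, this step is fully reversible. A minimal sketch of restoring a compacted event (the restore helper and the key names it reads follow the example above; it is illustrative, not an AG-Kit API):
def restore_compacted_event(event: dict) -> dict:
    """Swap the original content back in and drop the compaction marker."""
    compaction = event.get('state', {}).get('__compaction__')
    if compaction is None:
        return event  # nothing was compacted
    event['message']['content'] = compaction['originalContent']
    del event['state']['__compaction__']
    return event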

Structured Summarization (🔴 Tier 3)

Instead of free-form text summaries, the system creates structured summaries with specific fields:
StructuredSummary Interface
interface StructuredSummary {
  count: number;
  timeRange: { start?: Date; end?: Date };
  timestamp: Date;
}

{
  count: 1200,
  timeRange: {
    start: new Date('2024-01-15T10:00:00Z'),
    end: new Date('2024-01-15T14:30:00Z')
  },
  timestamp: new Date('2024-01-15T14:30:00Z')
}

Basic Setup

from agkit.agents.storage import InMemoryMemory, TDAIMemory

# Default context engineering (recommended)
memory = InMemoryMemory()

# Disable automatic context management
manual_memory = InMemoryMemory(
    enable_context_management=False  # Disable automatic context management
)

# Custom configuration with context engineering
memory = InMemoryMemory(
    enable_context_management=True,  # Enable automatic context management (default)
    thresholds={
        'pre_rot_threshold': 150_000,
        'compaction_trigger': 0.8,
        'summarization_trigger': 0.95,
        'keep_recent_count': 10,
    }
)

Manual Context Management

When enable_context_management is disabled, you can trigger context management manually:
memory = InMemoryMemory(
    enable_context_management=False  # Disable automatic management
)

# Add events without automatic context management
await memory.add(event1, session_id='session-123')
await memory.add(event2, session_id='session-123')
await memory.add(event3, session_id='session-123')

# Manually trigger context management when needed
await memory.manage_context('session-123')
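
One manual-mode pattern is to batch adds and trigger management only at checkpoints, for example after every N events (a sketch that relies only on the add() and manage_context() calls shown above; the batching policy itself is an illustrative choice):
async def add_batch(memory, events, session_id: str, manage_every: int = 25):
    """Add events without per-event management, compacting at checkpoints."""
    for i, event in enumerate(events, start=1):
        await memory.add(event, session_id=session_id)
        if i % manage_every == 0:
            await memory.manage_context(session_id)
    # Final pass so the last partial batch is covered too
    await memory.manage_context(session_id)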

Custom Memory Implementation

To implement custom context engineering logic, extend BaseMemory and override the compaction and summarization methods:
from agkit.agents.storage import BaseMemory, IMemoryEvent, StructuredSummary
from datetime import datetime
from typing import List

# Method 1: Pass summarizer in config
async def custom_summarizer(events: List[IMemoryEvent]) -> StructuredSummary:
    # Custom summarization logic
    return {
        'count': len(events),
        'time_range': {
            'start': events[0]['message']['timestamp'],
            'end': events[-1]['message']['timestamp']
        },
        'timestamp': datetime.now()
    }

custom_memory = BaseMemory(
    summarizer=custom_summarizer,
    thresholds={
        'pre_rot_threshold': 100_000,
        'compaction_trigger': 0.8,
        'summarization_trigger': 0.95
    }
)

# Method 2: Extend BaseMemory for advanced customization
class CustomMemory(BaseMemory):
    def __init__(self, config=None):
        super().__init__(**{
            **(config or {}),
            'summarizer': self._custom_summarizer
        })

    # Override compaction logic for individual events
    async def _compact_event(self, event: IMemoryEvent) -> None:
        # Custom compaction logic - example: compress code blocks
        original = event['message']['content']
        if '```' in original:
            event['state']['__compaction__'] = {
                'original_content': original,
                'compacted_at': datetime.now().isoformat(),
                # Rough estimate of tokens saved (~4 characters per token)
                'tokens_saved': max(0, (len(original) - len('[COMPACTED: Code discussion]')) // 4),
                'strategy': 'code_compression'
            }
            event['message']['content'] = '[COMPACTED: Code discussion]'
        else:
            await super()._compact_event(event)

    # Custom summarization function
    async def _custom_summarizer(self, events: List[IMemoryEvent]) -> StructuredSummary:
        return {
            'count': len(events),
            'time_range': {
                'start': events[0]['message']['timestamp'],
                'end': events[-1]['message']['timestamp']
            },
            'timestamp': datetime.now()
        }

    # Override storage methods (optional)
    async def _store_summary(self, session_id: str, summary: StructuredSummary) -> None:
        # Custom storage logic
        print(f'Storing summary for {session_id}')

    async def _clear_summarized_events(self, session_id: str, keep_recent_count: int) -> None:
        # Custom cleanup logic
        print(f'Cleaning up events for {session_id}')
        await super()._clear_summarized_events(session_id, keep_recent_count)

    # Trigger context management after adding events
    async def add(self, event: IMemoryEvent, session_id: str = 'default') -> None:
        await super().add(event, session_id=session_id)
        await self._manage_context(session_id)

Key Implementation Points

  1. Context Management Control:
    • Automatic (default): Set enable_context_management=True for automatic context management after each add
    • Manual: Set enable_context_management=False and call manage_context(session_id) manually
  2. Two Customization Approaches:
    • Simple: Pass a summarizer function when creating BaseMemory
    • Advanced: Extend BaseMemory for full customization
  3. Custom Summarizer: Provide a summarizer function that takes a list of events and returns a StructuredSummary
  4. Override _compact_event(): Implement custom logic for reversible compaction of individual events
  5. Override Storage Methods (optional): Define _store_summary() and _clear_summarized_events() for custom storage
  6. Preserve Metadata: Store compaction metadata in event.state.__compaction__
Automatic vs Manual: By default, both InMemoryMemory and TDAIMemory automatically trigger context management after adding events. Set enable_context_management=False if you prefer to control when context management occurs, then call manage_context(session_id) manually when needed.

Real-World Example: Long Debugging Session

Here’s how context engineering works in a real debugging session that spans 200+ messages:
# Initial problem (Normal Operation - Tier 1)
await memory.add({
    'message': {
        'role': 'user',
        'content': 'Our production database is timing out after 30 seconds, affecting 1200 users'
    },
    'state': {
        'issue': 'db_timeout',
        'severity': 'critical',
        'environment': 'production'
    }
})

# After 50 messages, context reaches 120K tokens
# 🟡 System automatically triggers Tier 2 (Compaction)
compacted_event = {
    'message': {'content': '[COMPACTED: Database timeout issue report]'},
    'state': {
        'issue': 'db_timeout',
        'severity': 'critical',
        '__compaction__': {
            'original_content': 'Our production database is timing out after 30 seconds...',
            'tokens_saved': 23,
            'compacted_at': '2024-01-15T10:30:00Z'
        }
    }
}

# After 200 messages, context reaches 142.5K tokens
# 🔴 System triggers Tier 3 (Structured Summarization)
summary = {
    'modified_files': ['database.py', 'user_service.py', 'auth_middleware.py'],
    'user_goal': 'Fix database connection timeout in production',
    'last_step': 'Updated connection pool configuration',
    'key_decisions': [
        'Increased pool size from 10 to 50',
        'Added connection retry logic',
        'Implemented circuit breaker pattern'
    ],
    'critical_context': {
        'error_pattern': 'Connection timeout after 30s',
        'environment': 'production',
        'affected_users': 1200,
        'urgency': 'critical'
    }
}
Result: The conversation can continue indefinitely while maintaining:
  • Performance: Response times remain fast
  • Context: Critical debugging information preserved
  • Cost: Token usage reduced by 70-80%
  • Quality: Recent context maintains conversation flow

Key Benefits

Context engineering solves the fundamental challenge of maintaining conversation quality as contexts grow:

🚀 Unlimited Conversation Length

  • No practical limits on conversation duration
  • Conversations can span hundreds or thousands of messages
  • System automatically manages context without manual intervention

📈 Maintained Performance

  • Response times remain fast even in long conversations
  • Quality doesn’t degrade as context fills up
  • Attention remains focused on relevant information

🧠 Information Preservation

  • Reversible compaction: Full recovery of compressed content
  • Structured summaries: Key insights preserved in organized format
  • Recent context: Always maintains conversation flow

💰 Cost Efficiency

  • Dramatic reduction in token usage (70-80% savings typical)
  • Lower API costs for long conversations
  • Efficient resource utilization

🔧 Zero Configuration

  • Works out of the box with sensible defaults
  • Automatic triggers based on configurable thresholds
  • Progressive strategy optimizes for information preservation
The key insight from Manus’s research: Not all context degradation is equal. By applying the right strategy at the right time (reversible compaction before irreversible summarization), we maintain conversation coherence while managing resource constraints effectively.