The Context Degradation Problem

As AI conversations grow longer, they face a fundamental challenge: context degradation. This manifests as:
  • Performance Degradation: Token limits, attention dilution, processing latency, cost escalation
  • Information Overload: Signal vs noise, recency bias, context switching, memory fragmentation

The AG-Kit Solution

AG-Kit implements a three-tier context management strategy inspired by Manus’s research:
Context Size

    │                                    ┌─────────────┐
    │                                    │ Hard Limit  │ (1M tokens)
    │                                    └─────────────┘
    │                              ┌─────────────┐
    │                              │ Pre-rot     │ (150K tokens)
    │                              │ Threshold   │
    │                              └─────────────┘
    │                        ┌──────────┐
    │                        │ Summ.    │ (142.5K = 95%)
    │                        │ Trigger  │
    │                        └──────────┘
    │                  ┌──────────┐
    │                  │ Compact. │ (120K = 80%)
    │                  │ Trigger  │
    │                  └──────────┘
    │            ┌──────────┐
    │            │ Normal   │
    │            │ Operation│
    └────────────┴──────────┴──────────────────────────────────────>
                                                                Time

Three-Tier Strategy

  1. 🟢 Normal Operation (0-80%): Store all messages in full detail
  2. 🟡 Reversible Compaction (80-95%): Compress old messages while preserving reconstruction ability
  3. 🔴 Structured Summarization (95%+): Create structured summaries for dramatic token reduction
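The tier boundaries above can be sketched as a small selection function. This is illustrative only, not AG-Kit's internal API; the parameter names mirror the threshold configuration shown later in this page:

```typescript
type Tier = 'normal' | 'compaction' | 'summarization';

// Sketch: pick the context-management tier for a given context size.
// Percentage triggers are applied against the pre-rot threshold.
function selectTier(
  contextTokens: number,
  preRotThreshold = 150_000,
  compactionTrigger = 0.8,
  summarizationTrigger = 0.95,
): Tier {
  const ratio = contextTokens / preRotThreshold;
  if (ratio >= summarizationTrigger) return 'summarization'; // >= 142.5K tokens
  if (ratio >= compactionTrigger) return 'compaction';       // >= 120K tokens
  return 'normal';
}

// Usage:
selectTier(100_000);  // "normal"
selectTier(125_000);  // "compaction"
selectTier(145_000);  // "summarization"
```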

How It Works

Reversible Compaction (🟡 Tier 2)

Compaction stores the full original content in event.state.__compaction__ while replacing the message content with a compact reference:
// Original Event
{
  "message": {
    "id": "msg-1",
    "content": "I need help implementing a user authentication system with JWT tokens, password hashing, and role-based access control...",
    "role": "user"
  },
  "state": { "userGoal": "implement auth", "complexity": "medium" }
}

// After Compaction
{
  "message": {
    "id": "msg-1",
    "content": "[COMPACTED: User auth implementation request]",
    "role": "user"
  },
  "state": {
    "userGoal": "implement auth",
    "complexity": "medium",
    "__compaction__": {
      "originalContent": "I need help implementing a user authentication system with JWT tokens, password hashing, and role-based access control...",
      "tokensSaved": 45
    }
  }
}
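Because the full original content survives in `event.state.__compaction__`, compaction is reversible. A minimal sketch of the restore step, where `expandEvent` is a hypothetical helper (not part of the AG-Kit API) and the event shape is simplified:

```typescript
interface CompactionMeta {
  originalContent: string;
  tokensSaved: number;
}

interface MemoryEvent {
  message: { id: string; content: string; role: string };
  state: Record<string, unknown> & { __compaction__?: CompactionMeta };
}

// Hypothetical helper: restore a compacted event to its original form.
function expandEvent(event: MemoryEvent): MemoryEvent {
  const meta = event.state.__compaction__;
  if (!meta) return event; // not compacted, nothing to restore
  const { __compaction__, ...state } = event.state;
  return {
    message: { ...event.message, content: meta.originalContent },
    state, // compaction metadata removed after restoration
  };
}
```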

Structured Summarization (🔴 Tier 3)

Instead of free-form text summaries, the system creates structured summaries with specific fields:
StructuredSummary Interface
interface StructuredSummary {
  count: number;
  timeRange: { start?: Date; end?: Date };
  timestamp: Date;
}

// Example summary
{
  count: 1200,
  timeRange: {
    start: new Date('2024-01-15T10:00:00Z'),
    end: new Date('2024-01-15T14:30:00Z')
  },
  timestamp: new Date('2024-01-15T14:30:00Z')
}

Basic Setup

import { InMemoryMemory } from '@ag-kit/agents/storage';

// Default context engineering (recommended)
const memory = new InMemoryMemory();

// Disable automatic context management
const manualMemory = new InMemoryMemory({
  enableContextManagement: false  // Disable automatic context management
});

// Custom configuration with context engineering
const memoryWithConfig = new InMemoryMemory({
  enableContextManagement: true,  // Enable automatic context management (default)
  thresholds: {
    preRotThreshold: 150_000,     // Performance degradation point
    compactionTrigger: 0.8,       // Compact at 80% (120K tokens)
    summarizationTrigger: 0.95,   // Summarize at 95% (142.5K tokens)
    recentToKeep: 10,             // Always keep last 10 messages
  }
});

Manual Context Management

When enableContextManagement is disabled, you can manually trigger context management:
const memory = new InMemoryMemory({
  enableContextManagement: false  // Disable automatic management
});

// Add events without automatic context management
await memory.add(event1, { sessionId: 'session-123' });
await memory.add(event2, { sessionId: 'session-123' });
await memory.add(event3, { sessionId: 'session-123' });

// Manually trigger context management when needed
await memory.manageContext('session-123');

Custom Memory Implementation

To implement custom context engineering logic, extend BaseMemory and override the compaction and summarization methods:
import { BaseMemory, IMemoryEvent, StructuredSummary } from '@ag-kit/agents/storage';

// Method 1: Pass summarizer in config
const customMemory = new BaseMemory({
  summarizer: async (events: IMemoryEvent[]): Promise<StructuredSummary> => {
    // Custom summarization logic - must return the StructuredSummary fields
    return {
      count: events.length,
      timeRange: {
        start: events[0]?.message.timestamp,
        end: events[events.length - 1]?.message.timestamp
      },
      timestamp: new Date()
    };
  },
  thresholds: {
    preRotThreshold: 100_000,
    compactionTrigger: 0.8,
    summarizationTrigger: 0.95
  }
});

// Method 2: Extend BaseMemory for advanced customization
class CustomMemory extends BaseMemory {
  constructor(config?: any) {
    super({
      ...config
    });
  }

  // Override compaction logic for individual events
  protected async compactEvent(event: IMemoryEvent): Promise<IMemoryEvent> {
    // Custom compaction logic - example: compress code blocks
    if (event.message.content.includes('```')) {
      event.state.__compaction__ = {
        originalContent: event.message.content,
        compactedAt: new Date().toISOString(),
        // Rough estimate: ~4 characters per token
        tokensSaved: Math.ceil(event.message.content.length / 4),
        strategy: 'code_compression'
      };
      event.message.content = '[COMPACTED: Code discussion]';
      return event;
    } else {
      return super.compactEvent(event);
    }
  }

  // Custom summarization function
  private async summarizer(events: IMemoryEvent[]): Promise<StructuredSummary> {
    return {
      count: events.length,
      timeRange: {
        start: events[0].message.timestamp,
        end: events[events.length - 1].message.timestamp
      },
      timestamp: new Date()
    };
  }

  // Override storage methods (optional)
  protected async storeSummary(sessionId: string, summary: StructuredSummary): Promise<void> {
    // Custom storage logic
    console.log(`Storing summary for ${sessionId}`);
  }

  protected async clearSummarizedEvents(sessionId: string, recentToKeep: number): Promise<void> {
    // Custom cleanup logic
    console.log(`Cleaning up events for ${sessionId}`);
    await super.clearSummarizedEvents(sessionId, recentToKeep);
  }

  // Trigger context management after adding events
  async add(event: IMemoryEvent, options?: { sessionId?: string }): Promise<void> {
    await super.add(event, options);
    const sessionId = options?.sessionId ?? event.state.sessionId ?? 'default';
    await this.manageContext(sessionId);
  }
}

Key Implementation Points

  1. Context Management Control:
    • Automatic (default): Set enableContextManagement: true for automatic context management after each add
    • Manual: Set enableContextManagement: false and call manageContext(sessionId) manually
  2. Two Customization Approaches:
    • Simple: Pass summarizer function in config when creating BaseMemory
    • Advanced: Extend BaseMemory class for full customization
  3. Custom Summarizer: Provide config.summarizer function that takes events and returns StructuredSummary
  4. Override compactEvent(): Implement custom logic for reversible compaction of individual events
  5. Override Storage Methods (optional): Define storeSummary() and clearSummarizedEvents() for custom storage
  6. Preserve Metadata: Store compaction metadata in event.state.__compaction__
Automatic vs Manual: By default, both InMemoryMemory and TDAIMemory automatically trigger context management after adding events. Set enableContextManagement: false if you prefer to control when context management occurs, then call manageContext(sessionId) manually when needed.

Real-World Example: Long Debugging Session

Here’s how context engineering works in a real debugging session that spans 200+ messages:
// Initial problem (Normal Operation - Tier 1)
await memory.add({
  message: {
    role: 'user',
    content: 'Our production database is timing out after 30 seconds, affecting 1200 users'
  },
  state: {
    issue: 'db_timeout',
    severity: 'critical',
    environment: 'production'
  }
});

// After 50 messages, context reaches 120K tokens
// 🟡 System automatically triggers Tier 2 (Compaction)
// Early messages get compacted but remain recoverable:

const compactedEvent = {
  message: { content: '[COMPACTED: Database timeout issue report]' },
  state: {
    issue: 'db_timeout',
    severity: 'critical',
    __compaction__: {
      originalContent: 'Our production database is timing out after 30 seconds...',
      tokensSaved: 23,
      compactedAt: '2024-01-15T10:30:00Z'
    }
  }
};

// After 200 messages, context reaches 142.5K tokens
// 🔴 System triggers Tier 3 (Structured Summarization)
// A custom summarizer can capture domain-specific fields like these:
const summary = {
  modifiedFiles: ['database.ts', 'user-service.ts', 'auth-middleware.ts'],
  userGoal: 'Fix database connection timeout in production',
  lastStep: 'Updated connection pool configuration',
  keyDecisions: [
    'Increased pool size from 10 to 50',
    'Added connection retry logic',
    'Implemented circuit breaker pattern'
  ],
  criticalContext: {
    errorPattern: 'Connection timeout after 30s',
    environment: 'production',
    affectedUsers: 1200,
    urgency: 'critical'
  }
};

// The most recent messages (per recentToKeep) remain in full detail for context continuity
Result: The conversation can continue indefinitely while maintaining:
  • Performance: Response times remain fast
  • Context: Critical debugging information preserved
  • Cost: Token usage reduced by 70-80%
  • Quality: Recent context maintains conversation flow
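A rough back-of-the-envelope sketch of where a 70-80% figure can come from under the compaction tier. Every number here (message count, average message size, placeholder size) is an assumption for illustration, not measured AG-Kit output:

```typescript
// Illustrative arithmetic only - all sizes are assumptions.
const totalMessages = 200;
const avgTokensPerMessage = 700;   // assumed average message size
const recentToKeep = 10;           // recent messages kept in full
const placeholderTokens = 150;     // assumed size of a compacted placeholder + state

const before = totalMessages * avgTokensPerMessage;   // 140,000 tokens
const after =
  recentToKeep * avgTokensPerMessage +
  (totalMessages - recentToKeep) * placeholderTokens; // 35,500 tokens

const savings = 1 - after / before;                   // ~0.75
console.log(`${(savings * 100).toFixed(1)}% of tokens saved`);
```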

Key Benefits

Context engineering solves the fundamental challenge of maintaining conversation quality as contexts grow:

🚀 Unlimited Conversation Length

  • No practical limits on conversation duration
  • Conversations can span hundreds or thousands of messages
  • System automatically manages context without manual intervention

📈 Maintained Performance

  • Response times remain fast even in long conversations
  • Quality doesn’t degrade as context fills up
  • Attention remains focused on relevant information

🧠 Information Preservation

  • Reversible compaction: Full recovery of compressed content
  • Structured summaries: Key insights preserved in organized format
  • Recent context: Always maintains conversation flow

💰 Cost Efficiency

  • Dramatic reduction in token usage (70-80% savings typical)
  • Lower API costs for long conversations
  • Efficient resource utilization

🔧 Zero Configuration

  • Works out of the box with sensible defaults
  • Automatic triggers based on configurable thresholds
  • Progressive strategy optimizes for information preservation
The key insight from Manus’s research: Not all context degradation is equal. By applying the right strategy at the right time (reversible compaction before irreversible summarization), we maintain conversation coherence while managing resource constraints effectively.