The Context Degradation Problem

As AI conversations grow longer, they face a fundamental challenge: context degradation. This manifests as:
  • Performance Degradation: token limits, attention dilution, processing latency, and escalating cost
  • Information Overload: a falling signal-to-noise ratio, recency bias, context switching, and memory fragmentation

The AG-Kit Solution

AG-Kit implements a three-tier context management strategy inspired by Manus’s research:
Context Size

    │                                    ┌─────────────┐
    │                                    │ Hard Limit  │ (1M tokens)
    │                                    └─────────────┘
    │                              ┌─────────────┐
    │                              │ Pre-rot     │ (150K tokens)
    │                              │ Threshold   │
    │                              └─────────────┘
    │                        ┌──────────┐
    │                        │ Summ.    │ (142.5K = 95%)
    │                        │ Trigger  │
    │                        └──────────┘
    │                  ┌──────────┐
    │                  │ Compact. │ (120K = 80%)
    │                  │ Trigger  │
    │                  └──────────┘
    │            ┌──────────┐
    │            │ Normal   │
    │            │ Operation│
    └────────────┴──────────┴──────────────────────────────────────>
                                                                Time

Three-Tier Strategy

  1. 🟢 Normal Operation (0-80%): Store all messages in full detail
  2. 🟡 Reversible Compaction (80-95%): Compress old messages while preserving reconstruction ability
  3. 🔴 Structured Summarization (95%+): Create structured summaries for dramatic token reduction
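
The numbers in the diagram translate directly into trigger logic. As a rough sketch (selectTier and its token accounting are illustrative, not part of AG-Kit's API):

// Illustrative only: a hypothetical helper, not AG-Kit's actual code.
// Thresholds follow the diagram: 150K-token pre-rot budget, 80% / 95% triggers.
const PRE_ROT_THRESHOLD = 150_000;
const COMPACTION_TRIGGER = PRE_ROT_THRESHOLD * 0.8;     // 120K tokens
const SUMMARIZATION_TRIGGER = PRE_ROT_THRESHOLD * 0.95; // 142.5K tokens

type Tier = 'normal' | 'compaction' | 'summarization';

function selectTier(contextTokens: number): Tier {
  if (contextTokens >= SUMMARIZATION_TRIGGER) return 'summarization';
  if (contextTokens >= COMPACTION_TRIGGER) return 'compaction';
  return 'normal';
}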

How It Works

Reversible Compaction (🟡 Tier 2)

Compaction stores the full original content in event.state.__compaction__ while replacing the message content with a compact reference:
// Original Event
{
  "message": {
    "id": "msg-1",
    "content": "I need help implementing a user authentication system with JWT tokens, password hashing, and role-based access control...",
    "role": "user"
  },
  "state": { "userGoal": "implement auth", "complexity": "medium" }
}

// After Compaction
{
  "message": {
    "id": "msg-1",
    "content": "[COMPACTED: User auth implementation request]",
    "role": "user"
  },
  "state": {
    "userGoal": "implement auth",
    "complexity": "medium",
    "__compaction__": {
      "originalContent": "I need help implementing a user authentication system with JWT tokens, password hashing, and role-based access control...",
      "tokensSaved": 45,
    }
  }
}
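
Because the full original content survives in event.state.__compaction__, compaction is reversible at any time. A minimal restore sketch (the Event shape and restoreEvent helper are assumptions for illustration; AG-Kit's internals may differ):

// Hypothetical shapes, shown to make the reversibility concrete.
interface CompactionMeta {
  originalContent: string;
  tokensSaved: number;
}

interface Event {
  message: { id: string; content: string; role: string };
  state: Record<string, unknown>; // may contain __compaction__ metadata
}

function restoreEvent(event: Event): Event {
  const meta = event.state.__compaction__ as CompactionMeta | undefined;
  if (!meta) return event; // nothing was compacted
  const { __compaction__, ...rest } = event.state; // strip the metadata
  return {
    message: { ...event.message, content: meta.originalContent },
    state: rest,
  };
}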

Structured Summarization (🔴 Tier 3)

Instead of free-form text summaries, the system creates structured summaries with specific fields:
StructuredSummary Interface
interface StructuredSummary {
  count: number;
  timeRange: { start?: Date; end?: Date };
  timestamp: Date;
}

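// Example: a structured summary covering 1,200 events from a 4.5-hour window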
{
  count: 1200,
  timeRange: {
    start: new Date('2024-01-15T10:00:00Z'),
    end: new Date('2024-01-15T14:30:00Z')
  },
  timestamp: new Date('2024-01-15T14:30:00Z')
}
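
A custom summarizer (see config.summarizer under Key Implementation Points below) is simply a function from a batch of events to this shape. A minimal sketch, assuming each event carries a timestamp field:

// Minimal summarizer sketch: derives the StructuredSummary fields from a batch.
function summarize(events: { timestamp: Date }[]): StructuredSummary {
  const times = events.map((e) => e.timestamp.getTime());
  return {
    count: events.length,
    timeRange: {
      start: times.length ? new Date(Math.min(...times)) : undefined,
      end: times.length ? new Date(Math.max(...times)) : undefined,
    },
    timestamp: new Date(), // when this summary was produced
  };
}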

Basic Setup
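
Context management is enabled by default, so a plain construction is enough. A minimal setup sketch (the import path and constructor options are assumptions; only enableContextManagement is documented here):

// Hypothetical import path; adjust to your AG-Kit installation.
import { InMemoryMemory } from '@ag-kit/memory';

// enableContextManagement defaults to true; shown explicitly for clarity.
const memory = new InMemoryMemory({ enableContextManagement: true });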

Manual Context Management

When enableContextManagement is disabled, you can manually trigger context management:
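
A sketch of the manual flow, reusing the hypothetical constructor from above (the add() signature is an assumption; manageContext(sessionId) matches the API described under Key Implementation Points):

// Manual mode: events accumulate untouched until manageContext() is called.
const memory = new InMemoryMemory({ enableContextManagement: false });
const sessionId = 'debug-session-1';

await memory.add(sessionId, {
  message: { id: 'msg-1', content: 'First message...', role: 'user' },
  state: {},
});

// Later, at a point you choose (e.g., after a batch of adds):
await memory.manageContext(sessionId);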

Custom Memory Implementation

To implement custom context engineering logic, extend BaseMemory and override the compaction and summarization methods:
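
A skeletal sketch, reusing the Event and StructuredSummary shapes from earlier. The method signatures are assumptions inferred from the override points listed below, not AG-Kit's actual declarations:

// Hypothetical import path; adjust to your AG-Kit installation.
import { BaseMemory } from '@ag-kit/memory';

// Rough token estimator for the sketch (~4 characters per token).
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

class CustomMemory extends BaseMemory {
  // Tier 2: reversible compaction of a single event.
  protected compactEvent(event: Event): Event {
    const original = event.message.content;
    return {
      ...event,
      message: {
        ...event.message,
        content: `[COMPACTED: ${original.slice(0, 40)}...]`,
      },
      state: {
        ...event.state,
        __compaction__: {
          originalContent: original,
          tokensSaved: estimateTokens(original),
        },
      },
    };
  }

  // Tier 3 storage hooks (optional overrides).
  protected async storeSummary(sessionId: string, summary: StructuredSummary): Promise<void> {
    // Persist the summary wherever suits your backend.
  }

  protected async clearSummarizedEvents(sessionId: string): Promise<void> {
    // Drop the raw events the summary replaced.
  }
}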

Key Implementation Points

  1. Context Management Control:
    • Automatic (default): Set enableContextManagement: true to run context management automatically after each add
    • Manual: Set enableContextManagement: false and call manageContext(sessionId) yourself
  2. Two Customization Approaches:
    • Simple: Pass a summarizer function in the config when creating BaseMemory
    • Advanced: Extend the BaseMemory class for full customization
  3. Custom Summarizer: Provide a config.summarizer function that takes events and returns a StructuredSummary (see the sketch below)
  4. Override compactEvent(): Implement custom logic for reversible compaction of individual events
  5. Override Storage Methods (optional): Define storeSummary() and clearSummarizedEvents() for custom storage
  6. Preserve Metadata: Store compaction metadata in event.state.__compaction__

Automatic vs Manual: By default, both InMemoryMemory and TDAIMemory trigger context management automatically after adding events. Set enableContextManagement: false if you prefer to control when context management occurs, then call manageContext(sessionId) manually when needed.
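
With the simple approach, the summarize function sketched earlier plugs straight into the config (the constructor shape is the same assumption as above):

// Simple customization: supply a summarizer instead of subclassing.
const memory = new InMemoryMemory({
  enableContextManagement: true,
  summarizer: summarize, // takes events, returns a StructuredSummary
});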

Real-World Example: Long Debugging Session

In a debugging session spanning 200+ messages, the three tiers engage in sequence: the early investigation is stored in full, older exchanges are reversibly compacted as the context grows, and the oldest stretches are eventually summarized once the summarization trigger is crossed.

Result: The conversation can continue indefinitely while maintaining:
  • Performance: Response times remain fast
  • Context: Critical debugging information preserved
  • Cost: Token usage reduced by 70-80%
  • Quality: Recent context maintains conversation flow

Key Benefits

Context engineering solves the fundamental challenge of maintaining conversation quality as contexts grow:

🚀 Unlimited Conversation Length

  • No practical limits on conversation duration
  • Conversations can span hundreds or thousands of messages
  • System automatically manages context without manual intervention

📈 Maintained Performance

  • Response times remain fast even in long conversations
  • Quality doesn’t degrade as context fills up
  • Attention remains focused on relevant information

🧠 Information Preservation

  • Reversible compaction: Full recovery of compressed content
  • Structured summaries: Key insights preserved in organized format
  • Recent context: Always maintains conversation flow

💰 Cost Efficiency

  • Dramatic reduction in token usage (70-80% savings typical)
  • Lower API costs for long conversations
  • Efficient resource utilization

🔧 Zero Configuration

  • Works out of the box with sensible defaults
  • Automatic triggers based on configurable thresholds
  • Progressive strategy optimizes for information preservation

The key insight from Manus’s research: not all context degradation is equal. By applying the right strategy at the right time (reversible compaction before irreversible summarization), we maintain conversation coherence while managing resource constraints effectively.