Add selective context compression for RAG generation #3

@colek42

Summary

Implement selective context compression to reduce token usage while preserving important information for generation.

Background

The Python REFRAG implementation uses a compress-select-expand pipeline: it chunks passages into fixed-size segments, scores each chunk's importance by its similarity to the query, leaves only the top-scoring fraction p of chunks expanded (uncompressed), and compresses the rest with LLM summarization.

Reference: refrag_ollama.py:610-625 (REFRAGOllama.compress_and_select)

Features

Chunking

  • Split passages into k-token chunks (default: k=64)
  • Use GPT-2 tokenizer for chunking (lightweight, consistent)
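
The chunking step could be sketched roughly as below. This is only an illustration: `chunkTokens` is a hypothetical helper, and whitespace splitting stands in for a real GPT-2 tokenizer, so token counts will differ from what a BPE tokenizer would produce.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkTokens splits tokens into consecutive chunks of at most k tokens.
// The final chunk may be shorter than k. A non-positive k falls back to
// the default of 64 proposed in this issue.
func chunkTokens(tokens []string, k int) [][]string {
	if k <= 0 {
		k = 64
	}
	chunks := make([][]string, 0, (len(tokens)+k-1)/k)
	for start := 0; start < len(tokens); start += k {
		end := start + k
		if end > len(tokens) {
			end = len(tokens)
		}
		chunks = append(chunks, tokens[start:end])
	}
	return chunks
}

func main() {
	// Assumption: whitespace splitting is a placeholder for GPT-2 tokenization.
	tokens := strings.Fields("one two three four five six seven")
	for i, c := range chunkTokens(tokens, 3) {
		fmt.Printf("chunk %d: %s\n", i, strings.Join(c, " "))
	}
}
```

The chunks share the backing array of the input slice, so callers that mutate tokens afterwards would need to copy them first.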

Importance Scoring

  • Encode chunks with query context
  • Compute cosine similarity between chunk and query
  • Rank chunks by importance
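
A minimal sketch of the scoring and ranking step, assuming chunk and query embeddings are already available as `[]float64` vectors (how they are produced is out of scope here). `cosine` and `rankChunks` are hypothetical names, not part of the existing codebase.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of a and b. It returns 0 when the
// lengths differ or when either vector has zero norm.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// rankChunks returns chunk indices ordered from most to least similar
// to the query. Ties keep their original relative order.
func rankChunks(query []float64, chunks [][]float64) []int {
	scores := make([]float64, len(chunks))
	order := make([]int, len(chunks))
	for i, c := range chunks {
		scores[i] = cosine(query, c)
		order[i] = i
	}
	sort.SliceStable(order, func(x, y int) bool {
		return scores[order[x]] > scores[order[y]]
	})
	return order
}

func main() {
	query := []float64{1, 0}
	chunks := [][]float64{{0, 1}, {1, 0}, {1, 1}}
	fmt.Println(rankChunks(query, chunks)) // most similar chunk index first
}
```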

Selective Expansion

  • Expand the top fraction p of chunks (default: p=0.25, i.e., 25%)
  • Compress low-importance chunks with LLM (Claude Haiku or similar)
  • Build compressed context string
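
The selection and assembly step might look something like the sketch below. Several details are assumptions rather than anything taken from the Python reference: the `Compressor` hook is a hypothetical stand-in for the LLM call, the number of expanded chunks is rounded up, and chunks are emitted in their original order.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

// Compressor is a hypothetical hook for the LLM summarization call.
type Compressor func(chunk string) (string, error)

// buildContext keeps the top ratio of chunks (by score) verbatim,
// compresses the rest, and joins everything in original chunk order.
func buildContext(chunks []string, scores []float64, ratio float64, compress Compressor) (string, error) {
	if len(chunks) != len(scores) {
		return "", fmt.Errorf("got %d chunks but %d scores", len(chunks), len(scores))
	}
	order := make([]int, len(chunks))
	for i := range order {
		order[i] = i
	}
	sort.SliceStable(order, func(x, y int) bool {
		return scores[order[x]] > scores[order[y]]
	})

	// Assumption: round up, then clamp to the valid range.
	keep := int(math.Ceil(ratio * float64(len(chunks))))
	if keep < 0 {
		keep = 0
	}
	if keep > len(chunks) {
		keep = len(chunks)
	}
	expand := make(map[int]bool, keep)
	for _, idx := range order[:keep] {
		expand[idx] = true
	}

	parts := make([]string, 0, len(chunks))
	for i, c := range chunks {
		if expand[i] {
			parts = append(parts, c)
			continue
		}
		summary, err := compress(c)
		if err != nil {
			return "", fmt.Errorf("compress chunk %d: %w", i, err)
		}
		parts = append(parts, summary)
	}
	return strings.Join(parts, "\n"), nil
}

func main() {
	// Stub compressor; a real one would call the configured LLM.
	stub := func(c string) (string, error) { return "[summary of " + c + "]", nil }
	out, err := buildContext([]string{"A", "B", "C", "D"}, []float64{0.1, 0.9, 0.3, 0.2}, 0.25, stub)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Compressing each low-importance chunk separately means one LLM call per chunk; batching adjacent chunks into a single call is an alternative worth weighing.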

Performance

  • Reduces context from ~10k tokens to ~2k tokens
  • Preserves the most important information
  • Enables longer generation with limited context windows

Implementation Tasks

  • Add tokenizer for chunking (use tiktoken or similar)
  • Implement chunk importance scoring
  • Add LLM compression for low-importance chunks
  • Integrate with HybridIndex API
  • Add compression metrics/logging
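
For the metrics task, one possible shape is sketched below. `CompressionMetrics` and its fields are hypothetical; the token counts in `main` are the approximate figures quoted in this issue, not measurements.

```go
package main

import "fmt"

// CompressionMetrics is a hypothetical record of one compression call.
type CompressionMetrics struct {
	InputTokens      int // tokens in the retrieved context before compression
	OutputTokens     int // tokens in the compressed context string
	ChunksExpanded   int // chunks kept verbatim
	ChunksCompressed int // chunks replaced by an LLM summary
}

// Ratio returns InputTokens divided by OutputTokens, or 0 when
// OutputTokens is not positive.
func (m CompressionMetrics) Ratio() float64 {
	if m.OutputTokens <= 0 {
		return 0
	}
	return float64(m.InputTokens) / float64(m.OutputTokens)
}

// String formats the metrics as a single log-friendly line.
func (m CompressionMetrics) String() string {
	return fmt.Sprintf("tokens %d -> %d (%.1fx), chunks expanded=%d compressed=%d",
		m.InputTokens, m.OutputTokens, m.Ratio(), m.ChunksExpanded, m.ChunksCompressed)
}

func main() {
	// Token counts taken from the ~10k -> ~2k figures in this issue.
	m := CompressionMetrics{InputTokens: 10000, OutputTokens: 2000}
	fmt.Println(m)
}
```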

API Design

type CompressionOptions struct {
    ChunkSize      int     // Chunk size in tokens (default: 64)
    SelectionRatio float64 // Fraction of chunks to expand (default: 0.25)
    CompressModel  string  // LLM model for compression (e.g., "claude-haiku")
}

type HybridIndex struct {
    // ... existing fields ...
}

func (h *HybridIndex) CompressContext(results []*SearchResult, query string, opts *CompressionOptions) (string, error)
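
Since `opts` is a pointer, `CompressContext` will need to handle `nil` and unset fields. A sketch of one way to do that follows; `withDefaults` is a hypothetical helper, and treating "claude-haiku" as the default model is an assumption (the issue only gives it as an example).

```go
package main

import "fmt"

// CompressionOptions mirrors the struct proposed in this issue.
type CompressionOptions struct {
	ChunkSize      int     // Chunk size in tokens (default: 64)
	SelectionRatio float64 // Fraction of chunks to expand (default: 0.25)
	CompressModel  string  // LLM model for compression
}

// withDefaults returns a copy of opts with unset or out-of-range fields
// replaced by defaults. A nil opts yields all defaults.
func withDefaults(opts *CompressionOptions) CompressionOptions {
	out := CompressionOptions{
		ChunkSize:      64,
		SelectionRatio: 0.25,
		CompressModel:  "claude-haiku", // assumed default, see note above
	}
	if opts == nil {
		return out
	}
	if opts.ChunkSize > 0 {
		out.ChunkSize = opts.ChunkSize
	}
	if opts.SelectionRatio > 0 && opts.SelectionRatio <= 1 {
		out.SelectionRatio = opts.SelectionRatio
	}
	if opts.CompressModel != "" {
		out.CompressModel = opts.CompressModel
	}
	return out
}

func main() {
	fmt.Printf("%+v\n", withDefaults(nil))
	fmt.Printf("%+v\n", withDefaults(&CompressionOptions{ChunkSize: 128}))
}
```

One consequence of treating zero values as unset is that a caller cannot request a SelectionRatio of 0 (compress everything); if that case matters, a pointer field or an explicit flag would be needed.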

Benefits

  • 5x reduction in context tokens (10k → 2k)
  • Enables longer generation with limited context
  • Preserves query-relevant information
  • Proven effective in Python REFRAG
