Prompt compression: reduce prompt size by 10-20% without quality regression #14

@apenab

Problem

Current prompts exceed 300 lines and repeat guidance across sections, most notably between the strategy instructions and the critical rules. This redundancy increases token usage and prompt-processing latency.

Proposal

Implement a prompt compression pass targeting a 10-20% reduction in prompt size while preserving behavior:

  • Merge redundant guidance between strategy and critical rules.
  • Simplify page-selection guidance (currently overly verbose).
  • Keep all non-negotiable constraints explicit and testable.
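One way to locate merge candidates is to flag near-duplicate lines between two prompt sections. The sketch below is only an illustration: it assumes each section is available as plain text, and the function name `find_overlaps` and the 0.8 similarity threshold are hypothetical choices, not part of the existing prompt tooling.

```python
from difflib import SequenceMatcher


def find_overlaps(section_a: str, section_b: str, threshold: float = 0.8):
    """Return (line_a, line_b, ratio) for near-duplicate line pairs.

    `threshold` is an assumed cutoff; tune it against real prompt text.
    """
    # Ignore blank lines and surrounding whitespace.
    lines_a = [line.strip() for line in section_a.splitlines() if line.strip()]
    lines_b = [line.strip() for line in section_b.splitlines() if line.strip()]

    overlaps = []
    for a in lines_a:
        for b in lines_b:
            # Case-insensitive character-level similarity in [0, 1].
            ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if ratio >= threshold:
                overlaps.append((a, b, round(ratio, 2)))
    return overlaps


# Hypothetical section contents, for illustration only.
strategy = "Always cite the source page.\nPrefer recent pages."
critical = "Always cite the source page!\nNever invent URLs."
overlaps = find_overlaps(strategy, critical)
for line_a, line_b, ratio in overlaps:
    print(f"{ratio}: {line_a!r} ~ {line_b!r}")
```

Flagged pairs are candidates for review, not automatic deletions: a repeated line may be a non-negotiable constraint that should stay explicit in one place.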

Expected Impact

  • Token reduction: ~15-20% (minimum 10%, per the acceptance criteria).
  • Lower prompt-processing latency during LLM inference.
  • Minimal behavior change if compressed carefully.

Risk

  • The risk of a quality regression is low, provided the constraints are preserved and validated.

Acceptance Criteria

  • Prompt token count reduced by at least 10% (target 15-20%).
  • No statistically significant regression on benchmark quality metrics.
  • Updated prompt docs include a mapping from old sections to compressed sections.
  • A/B comparison results are documented (before vs after token count, latency, quality).
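The first two criteria can be checked with a small script. The following is a minimal sketch, assuming per-example quality scores are available for the same benchmark items before and after compression. The function names, the paired sign-flip permutation test, and the resample count are illustrative assumptions; the issue does not prescribe a specific statistical method.

```python
import random


def reduction_pct(before_tokens: int, after_tokens: int) -> float:
    """Percentage reduction in token count relative to the original prompt."""
    return 100.0 * (before_tokens - after_tokens) / before_tokens


def paired_permutation_pvalue(before, after, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired per-example scores.

    Under the null hypothesis (compression has no effect), the sign of each
    paired difference is arbitrary, so we randomly flip signs and count how
    often the resampled mean is at least as extreme as the observed one.
    """
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    hits = 0
    for _ in range(n_resamples):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / n) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical numbers, for illustration only.
print(reduction_pct(1000, 850))
print(paired_permutation_pvalue([0.9] * 20, [0.5] * 20))
```

A large p-value only means the test did not detect a difference. With a small benchmark, pair it with a minimum sample size or a confidence interval on the score difference before declaring no regression.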

Notes

Please include at least one automated or scriptable check to prevent prompt bloat regressions over time.
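A scriptable guard could look like the sketch below. It is a sketch under stated assumptions: the budget `MAX_TOKENS` is a hypothetical value, and the whitespace word count is only a rough proxy for tokens that should be replaced with the tokenizer of the model actually in use.

```python
MAX_TOKENS = 4000  # hypothetical budget; set from the post-compression baseline


def approx_token_count(text: str) -> int:
    """Rough proxy for token count: whitespace-separated words.

    Assumption: real tokenizers produce different counts, so swap this
    for the target model's tokenizer before relying on the numbers.
    """
    return len(text.split())


def check_prompt_budget(text: str, max_tokens: int = MAX_TOKENS):
    """Return (within_budget, count) for the given prompt text."""
    count = approx_token_count(text)
    return count <= max_tokens, count


# Example: a prompt that exceeds a deliberately tiny budget.
ok, count = check_prompt_budget("one two three four", max_tokens=3)
print(f"within budget: {ok}, approx tokens: {count}")
```

Run in CI against each prompt file and fail the build when `within_budget` is false, so any growth past the budget needs an explicit budget change in review.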
