Problem
Current prompts exceed 300 lines and repeat guidance across sections, most notably between the strategy instructions and the critical rules. This redundancy increases token usage and prompt-processing latency.
Proposal
Implement a prompt compression pass targeting a 10-20% reduction in prompt size while preserving behavior:
- Merge redundant guidance between strategy and critical rules.
- Simplify page-selection guidance (currently overly verbose).
- Keep all non-negotiable constraints explicit and testable.
Expected Impact
- Token reduction: 10-20% (target ~15-20%).
- Lower prompt-processing latency during LLM inference.
- Minimal behavior change if compressed carefully.
Risk
- Low risk to output quality, provided the constraints are preserved and validated.
Acceptance Criteria
- Prompt token count reduced by at least 10% (target 15-20%).
- No statistically significant regression on benchmark quality metrics.
- Updated prompt docs include a mapping from old sections to compressed sections.
- A/B comparison results are documented (before vs after token count, latency, quality).
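The token-count criterion above could be checked with a small script. The following is a minimal sketch, not a definitive implementation: `approx_token_count` is a hypothetical stand-in that counts words and punctuation marks, and it would need to be replaced with the tokenizer of the model actually in use to get real token counts. The function names and the 10% default are illustrative assumptions taken from the criterion above.

```python
import re


def approx_token_count(text: str) -> int:
    """Rough proxy for token count: words plus standalone punctuation marks.

    Assumption: this is NOT a real tokenizer; swap in the model's own
    tokenizer for accurate numbers.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


def reduction_pct(before: str, after: str) -> float:
    """Percentage reduction in approximate tokens from `before` to `after`."""
    before_tokens = approx_token_count(before)
    if before_tokens == 0:
        raise ValueError("baseline prompt is empty")
    after_tokens = approx_token_count(after)
    return 100.0 * (before_tokens - after_tokens) / before_tokens


def meets_token_criterion(before: str, after: str, min_pct: float = 10.0) -> bool:
    """True if the compressed prompt is at least `min_pct` percent smaller."""
    return reduction_pct(before, after) >= min_pct
```

Running `reduction_pct` on the old and new prompt text would also give the before/after figure needed for the A/B write-up.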
Notes
Please include at least one automated or scriptable check to prevent prompt bloat regressions over time.
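One possible shape for such a check is a budget guard that runs in CI and reports any prompt that grows past an agreed limit. The sketch below assumes line and word counts as cheap proxies for prompt size; the function names, the budget values, and the idea of storing prompts as text files are all hypothetical and would need to match the real repository layout.

```python
from pathlib import Path

# Hypothetical budgets; choose real values once the compressed prompts land.
MAX_LINES = 270
MAX_WORDS = 4000


def find_bloated_prompts(prompts: dict[str, str],
                         max_lines: int = MAX_LINES,
                         max_words: int = MAX_WORDS) -> list[str]:
    """Return one message per prompt that exceeds the line or word budget."""
    violations = []
    for name, text in sorted(prompts.items()):
        n_lines = len(text.splitlines())
        n_words = len(text.split())
        if n_lines > max_lines:
            violations.append(f"{name}: {n_lines} lines (budget {max_lines})")
        if n_words > max_words:
            violations.append(f"{name}: {n_words} words (budget {max_words})")
    return violations


def check_prompt_files(paths: list[str]) -> int:
    """Read prompt files and return a process exit code (0 = within budget)."""
    prompts = {p: Path(p).read_text(encoding="utf-8") for p in paths}
    violations = find_bloated_prompts(prompts)
    for message in violations:
        print(message)
    return 1 if violations else 0
```

A CI job could call `check_prompt_files` with the prompt file paths and fail the build on a non-zero return value, so that any later growth has to be accompanied by a deliberate budget change.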