Problem
Evidence extraction currently pulls near-full page chunks (~2000-4000 chars), which inflates token usage.
Proposal
Compress evidence extraction to include only the relevant section plus minimal surrounding context (~800-1200 chars), while preserving traceable grounding.
Expected Impact
- Estimated token reduction: ~15-20%
- Example target: 2.2M -> 1.8M tokens (about 18% fewer)
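
The example target can be checked with a quick calculation. This is a minimal sketch; the helper name `percent_reduction` is hypothetical, and the token counts are the ones quoted in the example target.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Token counts taken from the example target: 2.2M -> 1.8M.
reduction = percent_reduction(2_200_000, 1_800_000)
print(f"{reduction:.1f}%")  # 18.2%
```

The result falls inside the estimated ~15-20% band.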
Risk
- Moderate: missing key context may hurt grounding if extraction is too aggressive.
Implementation Considerations
- Keep extraction deterministic and auditable (store source span references).
- Include configurable context window around matched evidence.
- Add fallback to larger context when confidence is low.
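
The three considerations above could be combined roughly as follows. This is a sketch under stated assumptions, not the existing extractor: `EvidenceSpan`, `extract_evidence`, and every parameter name and default (`context_chars`, `fallback_context_chars`, `min_confidence`) are hypothetical, and it assumes an upstream matcher already supplies character offsets and a confidence score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSpan:
    """Extracted evidence plus the source span it came from (for auditing)."""
    doc_id: str
    start: int  # inclusive character offset into the source page
    end: int    # exclusive character offset into the source page
    text: str


def extract_evidence(
    doc_id: str,
    page_text: str,
    match_start: int,
    match_end: int,
    confidence: float,
    context_chars: int = 400,            # assumed default, configurable
    fallback_context_chars: int = 1500,  # assumed default, configurable
    min_confidence: float = 0.5,         # assumed threshold
) -> EvidenceSpan:
    """Cut a window around the matched evidence.

    Deterministic: the same inputs always yield the same span. The wider
    fallback window is used when the match confidence is below the threshold.
    """
    if not 0 <= match_start <= match_end <= len(page_text):
        raise ValueError("match offsets must lie within the page text")
    window = context_chars if confidence >= min_confidence else fallback_context_chars
    start = max(0, match_start - window)
    end = min(len(page_text), match_end + window)
    return EvidenceSpan(doc_id=doc_id, start=start, end=end, text=page_text[start:end])
```

Storing `start` and `end` alongside the text keeps every payload traceable to its source page: slicing the original page with those offsets reproduces the evidence exactly.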
Acceptance Criteria
- Evidence payload size falls within the target range (~800-1200 chars) for most samples.
- Grounding metrics do not regress beyond agreed tolerance.
- Benchmark report includes before/after token use and quality.
- Failure analysis documents any context-loss errors.
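
A benchmark report could evaluate the first three criteria mechanically, along these lines. This is a sketch only: `BenchmarkRun`, `check_acceptance`, and the thresholds `min_share_in_range` and `max_grounding_drop` are hypothetical placeholders, since the issue leaves "most samples" and the "agreed tolerance" to be decided.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRun:
    total_tokens: int
    grounding_score: float    # higher is better; the metric itself is assumed
    payload_sizes: list[int]  # evidence payload size per sample, in chars


def check_acceptance(
    before: BenchmarkRun,
    after: BenchmarkRun,
    target_range: tuple[int, int] = (800, 1200),
    min_share_in_range: float = 0.8,   # placeholder for "most samples"
    max_grounding_drop: float = 0.01,  # placeholder for "agreed tolerance"
) -> dict:
    """Summarize before/after token use and quality against the criteria."""
    lo, hi = target_range
    sizes = after.payload_sizes
    share = sum(lo <= n <= hi for n in sizes) / len(sizes) if sizes else 0.0
    drop = before.grounding_score - after.grounding_score
    token_reduction = (before.total_tokens - after.total_tokens) / before.total_tokens * 100.0
    return {
        "token_reduction_pct": token_reduction,
        "share_in_target_range": share,
        "grounding_drop": drop,
        "passes": share >= min_share_in_range and drop <= max_grounding_drop,
    }
```

Context-loss failures still need manual review; this check only flags whether the aggregate numbers stay within bounds.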