
research(compression): probe-based evaluation of context compaction quality #2164

@bug-ops

Description

Source

Factory.ai (2025): Evaluating Context Compression for AI Agents
https://factory.ai/news/evaluating-compression

Summary

Replace opaque compression quality metrics (ROUGE/embedding similarity) with functional probes run after each compaction:

  • Recall probes: did specific facts survive?
  • Artifact probes: does the agent know which files/tools it used?
  • Continuation probes: can it pick up mid-task?
  • Decision probes: are past reasoning traces intact?

The agent's ability to answer these probes correctly is the quality signal.
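The four probe categories above could be modeled as structured data rather than free-form LLM questions. A minimal sketch in Python (all names here are illustrative, not Zeph's actual API): each probe carries its category, a question, and the ground-truth answer extracted from the pre-compaction context.

```python
# Hypothetical sketch of structured probe categories; the enum values and
# Probe fields are assumptions, not Zeph's actual types.
from dataclasses import dataclass
from enum import Enum


class ProbeCategory(Enum):
    RECALL = "recall"              # did specific facts survive?
    ARTIFACT = "artifact"          # does the agent know which files/tools it used?
    CONTINUATION = "continuation"  # can it pick up mid-task?
    DECISION = "decision"          # are past reasoning traces intact?


@dataclass
class Probe:
    category: ProbeCategory
    question: str
    expected: str  # ground truth taken from the pre-compaction context


# Example probes generated from a hypothetical pre-compaction transcript.
probes = [
    Probe(ProbeCategory.RECALL, "What port does the dev server listen on?", "8080"),
    Probe(ProbeCategory.ARTIFACT, "Which file defines the retry policy?", "net/retry.rs"),
]
```

Keeping the expected answer alongside each question is what turns the probe into a functional test: after compaction, the agent's answer can be graded against it instead of relying on opaque similarity metrics.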

Applicability to Zeph

Relevance: HIGH. Zeph's summarization quality is currently opaque. Zeph already has a compaction probe ([memory.compression.probe]), but the current probe uses generic LLM-generated questions. Structured probe categories (recall/artifact/continuation/decision) would surface silent information loss more reliably.

Implementation sketch

  • Extend CompactionProbe to generate probes per category (currently generates generic questions)
  • After compaction, run each category with different prompt templates
  • Score by category; log per-category breakdown in debug dump
  • Expose per-category scores in the TUI metrics panel (issue #448: feat: display filter metrics in TUI dashboard)

Complexity: LOW-MEDIUM

Probe prompts are simple; main work is categorizing probe generation and updating the scoring logic.
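The per-category scoring step could be as small as an aggregation over pass/fail probe results. A sketch, assuming probe results arrive as (category, passed) pairs (this shape is an assumption, not Zeph's actual scoring interface):

```python
# Hypothetical per-category aggregation for the debug-dump breakdown.
from collections import defaultdict


def score_by_category(results):
    """Aggregate pass/fail probe results into a per-category pass rate.

    `results` is a list of (category, passed) pairs produced by running
    each probe against the post-compaction context.
    """
    totals = defaultdict(lambda: [0, 0])  # category -> [passed, total]
    for category, passed in results:
        totals[category][0] += int(passed)
        totals[category][1] += 1
    return {cat: passed / total for cat, (passed, total) in totals.items()}


scores = score_by_category([
    ("recall", True), ("recall", False),
    ("artifact", True),
])
# scores == {"recall": 0.5, "artifact": 1.0}
```

A per-category breakdown like this is what surfaces silent information loss: an overall score can look fine while one category (say, decision probes) collapses.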

Metadata

Assignees

No one assigned

    Labels

    P3 (Research — medium-high complexity), research (Research-driven improvement)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
