observability: per-stage token and cost breakdown in budget_summary #13

## Context

A 2026-04-19 smoke run produced this `budget_summary.per_agent` in the session JSON:

```json
"per_agent": {
  "apprentice_pipeline": {
    "tokens_used": 3987,
    "cost_usd": 0.059805,
    "calls": 1,
    "duration_seconds": 47.27
  }
}
```

All six pipeline stages (discovery, implementation, instrumentation, visualization, assessment, review) are aggregated into a single row labeled `apprentice_pipeline`.

## Problem

When a generated artifact fails quality review, there is no way to tell which stage is responsible without re-running the pipeline with manual instrumentation. A per-stage breakdown is also needed to:

- Tune individual prompt costs
- Detect when one stage dominates the budget
- Compare model choices stage-by-stage (e.g., Haiku for assessment, gpt-5.4 for implementation)
- Validate that gates actually ran
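As an illustration of the "one stage dominates the budget" check, here is a minimal sketch that assumes per-stage rows in the shape described under "Proposed fix" are available. The function names `cost_shares` and `dominant_stage` and the 50% threshold are hypothetical and not part of apprentice today; the numbers in the usage example are made up for illustration.

```python
def cost_shares(per_agent):
    """Return each stage's fraction of total cost_usd (0.0 everywhere if total is zero)."""
    total = sum(row["cost_usd"] for row in per_agent.values())
    if total == 0:
        return {stage: 0.0 for stage in per_agent}
    return {stage: row["cost_usd"] / total for stage, row in per_agent.items()}


def dominant_stage(per_agent, threshold=0.5):
    """Return the stage whose cost share exceeds `threshold`, or None.

    The threshold is an assumed default, not a project setting.
    """
    shares = cost_shares(per_agent)
    stage = max(shares, key=shares.get, default=None)
    if stage is not None and shares[stage] > threshold:
        return stage
    return None


# Illustrative (made-up) per-stage rows; only cost_usd is needed here.
example = {
    "discovery": {"cost_usd": 0.01},
    "implementation": {"cost_usd": 0.03},
    "review": {"cost_usd": 0.01},
}
flagged = dominant_stage(example)
```

With the aggregated `apprentice_pipeline` row this check is meaningless, since the single row always holds 100% of the cost.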

## Proposed fix

`budget_summary.per_agent` should report one row per stage defined in `src/apprentice/stages/`:

```json
"per_agent": {
  "discovery":       {"tokens_used": N, "cost_usd": N, "calls": N, "duration_seconds": N},
  "implementation":  {...},
  "instrumentation": {...},
  "visualization":   {...},
  "assessment":      {...},
  "review":          {...}
}
```
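One way this could be produced is an accumulator keyed by stage name rather than by pipeline. The sketch below is an assumption about shape only: `StageBudget`, `record`, and `per_agent` are hypothetical names, and the real budget tracker in apprentice may be structured differently.

```python
from collections import defaultdict

# Stage names taken from the issue text, in pipeline order.
STAGES = (
    "discovery",
    "implementation",
    "instrumentation",
    "visualization",
    "assessment",
    "review",
)


class StageBudget:
    """Hypothetical accumulator: one row per stage instead of one pipeline-wide row."""

    def __init__(self):
        self._rows = defaultdict(
            lambda: {"tokens_used": 0, "cost_usd": 0.0, "calls": 0, "duration_seconds": 0.0}
        )

    def record(self, stage, tokens, cost_usd, duration_seconds):
        """Add one model call to the row for `stage`."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        row = self._rows[stage]
        row["tokens_used"] += tokens
        row["cost_usd"] += cost_usd
        row["calls"] += 1
        row["duration_seconds"] += duration_seconds

    def per_agent(self):
        """Return a row for every stage in pipeline order, zeros for stages that never ran."""
        return {stage: dict(self._rows[stage]) for stage in STAGES}
```

Emitting zero-valued rows for stages that never ran is a deliberate choice in this sketch: a row with `"calls": 0` makes a skipped stage visible instead of silently absent.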

Additionally, record per-stage gate verdicts (`gate_name → pass/fail/skipped`) in the session JSON so a reader can confirm which gates executed and in what order. The current session JSON contains no evidence of gate ordering.
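A minimal sketch of how gate verdicts might be recorded so that ordering survives serialization. Everything here is hypothetical: the `GateLog` class, the `sequence` field, and the gate names in the usage example are illustrative, not existing apprentice identifiers. A list is used rather than a mapping because JSON object key order is not something every reader can rely on.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Verdict(str, Enum):
    PASS = "pass"
    FAIL = "fail"
    SKIPPED = "skipped"


@dataclass
class GateRecord:
    sequence: int   # position in execution order, starting at 0
    stage: str
    gate_name: str
    verdict: Verdict


class GateLog:
    """Hypothetical append-only log of gate verdicts in execution order."""

    def __init__(self):
        self._records = []

    def record(self, stage, gate_name, verdict):
        # Verdict(...) raises ValueError for anything other than pass/fail/skipped.
        self._records.append(
            GateRecord(len(self._records), stage, gate_name, Verdict(verdict))
        )

    def to_json_ready(self):
        """Return a list of plain dicts suitable for json.dumps."""
        return [
            {
                "sequence": r.sequence,
                "stage": r.stage,
                "gate_name": r.gate_name,
                "verdict": r.verdict.value,
            }
            for r in self._records
        ]


# Usage with made-up gate names:
log = GateLog()
log.record("implementation", "syntax_check", "pass")
log.record("review", "quality_gate", "fail")
serialized = json.dumps(log.to_json_ready(), indent=2)
```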

## Related

Observability is the "no magic" principle applied to apprentice itself. Currently apprentice is opaque about its own pipeline, which is ironic given the project it serves.
