Enhance clauck observability: utility costs, history, and intent signal routing #117

@CoreyRDean

Summary

Expand clauck's observability and telemetry infrastructure to track utility-level costs and invocation history. Also enhance clauck report so that it separates signal types (cost metrics, novel user stories, failure patterns, configuration drift) and durably records each one, rather than routing everything to GitHub issues.

Motivation

Clauck itself (interpreter and executor sessions) incurs API costs just as jobs do, but there is currently no way to track or aggregate those costs. Additionally, users need a central view of:

  • All historical commands invoked (with full original text)
  • Cost per invocation method (work, doctor, add --when, etc.)
  • Aggregate cost, failure rates, and average per-run metrics

More critically, as users interact with clauck, they produce valuable signals beyond GitHub-issue-worthy bugs: novel user stories, unexpected interaction patterns, configuration quirks, and the delta between a user's mental model of how clauck should work and how it actually works. These signals are essential for prioritizing features and improvements, but they currently get lost because clauck report is oriented toward GitHub issues.

Acceptance Criteria

Phase 1: Utility Costs & History

  • clauck cost expanded to include utility-level aggregates (interpreter/executor session costs) alongside job costs
  • Cost breakdown by invocation method: work, doctor, add, list, etc.
  • Per-invocation metrics: total runs, failures, last-run, average cost per run
  • clauck history shows chronological list of all utility invocations with full command text
  • Both views support filtering and aggregation (by method, date range, etc.)
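The per-method metrics above could be computed along these lines. This is a minimal sketch, not clauck's implementation: the `Invocation` record, its field names, and the `summarize` helper are all hypothetical, and costs are assumed to be stored in USD per invocation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class Invocation:
    # Hypothetical record shape; field names are assumptions, not clauck's schema.
    method: str            # e.g. "work", "doctor", "add"
    command: str           # full original command text
    started_at: datetime
    cost_usd: float
    exit_code: int


def summarize(invocations: Iterable[Invocation],
              since: Optional[datetime] = None,
              until: Optional[datetime] = None) -> dict:
    """Aggregate runs, failures, total/average cost, and last run per method."""
    summary: dict = {}
    for inv in invocations:
        # Optional date-range filter (inclusive on both ends).
        if since is not None and inv.started_at < since:
            continue
        if until is not None and inv.started_at > until:
            continue
        row = summary.setdefault(inv.method, {
            "runs": 0, "failures": 0, "total_cost": 0.0, "last_run": None,
        })
        row["runs"] += 1
        if inv.exit_code != 0:
            row["failures"] += 1
        row["total_cost"] += inv.cost_usd
        if row["last_run"] is None or inv.started_at > row["last_run"]:
            row["last_run"] = inv.started_at
    for row in summary.values():
        row["avg_cost"] = row["total_cost"] / row["runs"]
    return summary
```

Sorting the same records by `started_at` would give the chronological view that clauck history describes.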

Phase 2: Durably-Recorded User Stories & Signals

  • Create stories/ folder in repo to durably record discovered user stories and usage patterns
  • Format for recorded stories: structured metadata (date discovered, usage pattern, version, invocation method, signal type) + narrative summary
  • clauck report enhanced to intelligently separate signal types (GitHub-issue-worthy bugs, novel user stories, failure patterns, configuration insights) and route to appropriate durability layers
  • Signals already shared by the user are preserved immediately ("imperfect info is better than no info")
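One possible shape for a recorded story is a front-matter block followed by the narrative. The sketch below is illustrative only: the key names and the `render_story` helper are assumptions. Values are written unquoted, so a real implementation would need proper YAML escaping.

```python
from datetime import date


def render_story(discovered: date, usage_pattern: str, version: str,
                 invocation_method: str, signal_type: str, narrative: str) -> str:
    """Render one story as front-matter metadata followed by a narrative summary."""
    # Key names are hypothetical; they mirror the metadata fields listed above.
    metadata = {
        "date_discovered": discovered.isoformat(),
        "usage_pattern": usage_pattern,
        "version": version,
        "invocation_method": invocation_method,
        "signal_type": signal_type,
    }
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in metadata.items()]
    lines += ["---", "", narrative.strip(), ""]
    return "\n".join(lines)
```

The returned text could be written to a file under stories/ as soon as the signal is available.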

Phase 3: Intent Delta Analysis

  • clauck report (or a new subcommand) can run a post-session analysis that explores the user's local clauck state, history, and intent
  • Analysis identifies and documents the delta between the user's mental model (estimated from signals) and actual system behavior
  • Output: structured intent-delta report with:
    • Observed usage patterns and configuration
    • Inferred user expectations
    • Actual system behavior
    • Signals of novel use cases, frustration, or misalignment
    • Framed neutrally (not opinionated about what should change)
  • Users can opt in to sharing specific signals with the project (interactive intent mining for edge cases)
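The structured intent-delta report could be modeled roughly as follows. This is a sketch under assumptions: the class, its field names, and the index-based opt-in list are hypothetical, chosen only to mirror the bullets above.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class IntentDeltaReport:
    # Field names mirror the report sections above; the schema is an assumption.
    observed_usage: List[str] = field(default_factory=list)
    inferred_expectations: List[str] = field(default_factory=list)
    actual_behavior: List[str] = field(default_factory=list)
    signals: List[str] = field(default_factory=list)
    # Indexes into `signals` that the user has explicitly opted in to share.
    shareable: List[int] = field(default_factory=list)

    def shared_signals(self) -> List[str]:
        """Return only the signals the user opted in to share."""
        return [self.signals[i] for i in self.shareable]

    def to_json(self) -> str:
        """Serialize the full report for local storage."""
        return json.dumps(asdict(self), indent=2)
```

Keeping expectations and actual behavior in separate fields leaves the report descriptive rather than prescriptive, which matches the neutral framing requested above.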

Phase 4: Signal Routing & Durability

  • clauck report intelligently routes signals to appropriate sinks:
    • GitHub issues: bugs, features, regressions
    • stories/: novel user stories, usage patterns
    • Cost ledger: interpreter/executor cost metrics
    • Failure log: recurring error signatures
  • Signals are written to durable storage as soon as the information is available, without waiting for user confirmation
  • User consent gates sharing signals externally, not recording them locally
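The routing and consent rules above can be sketched as a small dispatcher. All names here are hypothetical: the signal type strings, the sink labels, and the two callbacks stand in for whatever storage and sharing mechanisms clauck ends up using.

```python
from typing import Callable, Dict

# Illustrative mapping from signal type to the sinks listed above.
SINKS: Dict[str, str] = {
    "bug": "github_issues",
    "feature": "github_issues",
    "regression": "github_issues",
    "user_story": "stories",
    "usage_pattern": "stories",
    "cost": "cost_ledger",
    "failure": "failure_log",
}


def handle_signal(signal: dict, consent: bool,
                  record_locally: Callable[[str, dict], None],
                  share_externally: Callable[[str, dict], None]) -> str:
    """Record the signal locally right away; share it only with consent."""
    sink = SINKS[signal["type"]]
    record_locally(sink, signal)        # never gated on consent
    if consent:
        share_externally(sink, signal)  # gated on explicit user consent
    return sink
```

Because local recording happens before the consent check, a declined or unanswered prompt never loses the signal.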

Design Notes

  • Cost tracking should capture: API model, tokens (prompt + completion), cost per invocation, grouped by utility type and invocation method.
  • History should include enough context to replay user intent: full command, start time, duration, exit code, output summary.
  • User stories folder should use a consistent schema (front-matter + narrative) and be indexed for trend analysis.
  • Intent delta analysis should acknowledge uncertainty: true user intent is unknowable, the intent matrix is hidden, and the delta vector is estimated.
  • Priority: imperfect-and-immediate-recording over perfect-and-delayed. Opt-in sharing, not opt-in recording.
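A history entry that follows these notes might look like the sketch below. The `HistoryEntry` fields and the JSON Lines serialization are assumptions, not a confirmed clauck format. Fields that are only known after the run finishes are optional, so an entry can be written as soon as the command starts and completed later.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class HistoryEntry:
    # Hypothetical schema covering the cost and history fields named above.
    command: str                          # full original command text
    utility_type: str                     # e.g. "interpreter" or "executor"
    method: str                           # e.g. "work", "doctor", "add"
    started_at: str                       # ISO 8601 timestamp
    model: Optional[str] = None
    prompt_tokens: Optional[int] = None
    completion_tokens: Optional[int] = None
    cost_usd: Optional[float] = None
    duration_s: Optional[float] = None
    exit_code: Optional[int] = None
    output_summary: Optional[str] = None


def to_jsonl_line(entry: HistoryEntry) -> str:
    """Serialize one entry as a single JSON line, dropping fields not yet known."""
    known = {k: v for k, v in asdict(entry).items() if v is not None}
    return json.dumps(known, sort_keys=True)
```

An append-only file of such lines is easy to write immediately and to aggregate later, which fits the imperfect-and-immediate priority.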

Related Issues

Labels: enhancement
