[043][Phase 10][US8] Reporters (Console/JSON/MD/HTML) + LangSmith exporter #758

@jwesleye

Scope

US8 — five reporters covering dev-loop, CI, and publishing use cases. Terminal/console is plain-text only (per Q8 clarification — no ANSI, no interactivity); HTML is the sole interactive tier. JSON validates against a published schema. LangSmith is push-only.

Priority: P2

Tasks

Plain reporters + schema

  • T139 [P] [US8] Tests in eval/tests/reporter_console_test.rs — plain-text line-oriented output, no ANSI
  • T140 [US8] Reporter trait + ReporterOutput enum + ReporterError
  • T141 [P] [US8] ConsoleReporter (always-on, plain-text)
  • T142 [P] [US8] Tests for JsonReporter — schema validation
  • T143 [P] [US8] JsonReporter + author specs/043-evals-adv-features/contracts/eval-result.schema.json
  • T144 [P] [US8] Tests for MarkdownReporter — PR-comment-ready
  • T145 [P] [US8] MarkdownReporter
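A minimal sketch of how T140 and T141 might fit together, using only the standard library. The field layout of EvalSetResult, the CaseResult type, the ReporterOutput and ReporterError variants, and the report method name are all assumptions made for illustration; the real types come from #752 and this task and may look different.

```rust
use std::fmt;

// Hypothetical stand-ins: the real EvalSetResult comes from #752.
#[derive(Debug, Clone)]
pub struct CaseResult {
    pub name: String,
    pub passed: bool,
    pub score: f64,
}

#[derive(Debug, Clone)]
pub struct EvalSetResult {
    pub set_name: String,
    pub cases: Vec<CaseResult>,
}

// What a reporter hands back. Variants are illustrative only.
#[derive(Debug, PartialEq)]
pub enum ReporterOutput {
    Text(String),
    Bytes(Vec<u8>),
}

// Illustrative error type with a single variant.
#[derive(Debug)]
pub enum ReporterError {
    Render(String),
}

impl fmt::Display for ReporterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ReporterError::Render(msg) => write!(f, "render failed: {msg}"),
        }
    }
}

impl std::error::Error for ReporterError {}

pub trait Reporter {
    fn report(&self, result: &EvalSetResult) -> Result<ReporterOutput, ReporterError>;
}

// Drops control characters (including ESC) so user-supplied names
// cannot smuggle ANSI sequences into the console output.
fn plain(s: &str) -> String {
    s.chars().filter(|c| !c.is_control()).collect()
}

pub struct ConsoleReporter;

impl Reporter for ConsoleReporter {
    fn report(&self, result: &EvalSetResult) -> Result<ReporterOutput, ReporterError> {
        let passed = result.cases.iter().filter(|c| c.passed).count();
        // One summary line, then one line per case: line-oriented, no ANSI.
        let mut out = format!(
            "eval set: {} ({}/{} passed)\n",
            plain(&result.set_name),
            passed,
            result.cases.len()
        );
        for case in &result.cases {
            let status = if case.passed { "PASS" } else { "FAIL" };
            out.push_str(&format!("{} {} score={:.2}\n", status, plain(&case.name), case.score));
        }
        Ok(ReporterOutput::Text(out))
    }
}
```

Returning the rendered text instead of printing it keeps the reporter easy to test; the caller decides whether the string goes to stdout or a file.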

HTML reporter (feature html-report)

  • T146 [P] [US8] Tests for HTML reporter — self-contained file, <details>/<summary> collapsibility, bounded output size for thousand-case results
  • T147 [US8] HtmlReporter using askama templates with inlined CSS/JS
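The shape of the HTML output can be sketched without a template engine. The example below deliberately swaps askama for plain string building so that it stays dependency-free; CaseResult, render_html, and the max_detail_chars cap are hypothetical names, and truncating per-case detail is only one assumed way to keep output size bounded.

```rust
// Hypothetical case shape; the real data comes from EvalSetResult.
pub struct CaseResult {
    pub name: String,
    pub passed: bool,
    pub detail: String,
}

// Escapes the five characters that matter in HTML text and attributes.
pub fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#39;"),
            _ => out.push(ch),
        }
    }
    out
}

// Renders one self-contained document: inlined CSS, no scripts, no external URLs.
pub fn render_html(set_name: &str, cases: &[CaseResult], max_detail_chars: usize) -> String {
    let mut html = String::from("<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n<meta charset=\"utf-8\">\n");
    html.push_str(&format!("<title>{}</title>\n", escape_html(set_name)));
    html.push_str("<style>summary{cursor:pointer}.fail summary{color:#b00}</style>\n");
    html.push_str("</head>\n<body>\n");
    for case in cases {
        // Cap per-case detail (counted in chars) so large result sets stay bounded.
        let detail: String = case.detail.chars().take(max_detail_chars).collect();
        let truncated = case.detail.chars().count() > max_detail_chars;
        let (class, status) = if case.passed { ("pass", "PASS") } else { ("fail", "FAIL") };
        // Failed cases start expanded; passing cases start collapsed.
        let open = if case.passed { "" } else { " open" };
        html.push_str(&format!("<details class=\"{class}\"{open}>\n"));
        html.push_str(&format!("<summary>{} {}</summary>\n", status, escape_html(&case.name)));
        html.push_str(&format!("<pre>{}", escape_html(&detail)));
        if truncated {
            html.push_str(" [truncated]");
        }
        html.push_str("</pre>\n</details>\n");
    }
    html.push_str("</body>\n</html>\n");
    html
}
```

Collapsibility here comes entirely from the native details/summary elements, so the file stays usable with scripting disabled.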

LangSmith (feature langsmith)

  • T148 [P] [US8] Tests with wiremock — run push + per-evaluator feedback attachment; partial failure surface
  • T149 [US8] LangSmithExporter + LangSmithExportError
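One way the partial-failure surface could look, sketched against an in-memory fake rather than HTTP. FeedbackSink, InMemorySink, FailedFeedback, the export function, and the LangSmithExportError variants are all hypothetical; nothing here reflects the actual LangSmith API or the final error design.

```rust
use std::fmt;

// Hypothetical transport seam; a real exporter would issue HTTP requests here.
pub trait FeedbackSink {
    /// Pushes a run and returns its id.
    fn push_run(&mut self, run_name: &str) -> Result<String, String>;
    fn push_feedback(&mut self, run_id: &str, evaluator: &str, score: f64) -> Result<(), String>;
}

#[derive(Debug, PartialEq)]
pub struct FailedFeedback {
    pub evaluator: String,
    pub reason: String,
}

// Illustrative error shape: partial pushes are reported as data, not as a log line.
#[derive(Debug, PartialEq)]
pub enum LangSmithExportError {
    RunPushFailed(String),
    PartialFeedback { run_id: String, pushed: usize, failed: Vec<FailedFeedback> },
}

impl fmt::Display for LangSmithExportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::RunPushFailed(reason) => write!(f, "run push failed: {reason}"),
            Self::PartialFeedback { run_id, pushed, failed } => write!(
                f,
                "run {run_id}: {pushed} feedback item(s) pushed, {} failed",
                failed.len()
            ),
        }
    }
}

impl std::error::Error for LangSmithExportError {}

// Stateless push: every attempt is made, and all failures are collected.
pub fn export(
    sink: &mut dyn FeedbackSink,
    run_name: &str,
    scores: &[(&str, f64)],
) -> Result<String, LangSmithExportError> {
    let run_id = sink.push_run(run_name).map_err(LangSmithExportError::RunPushFailed)?;
    let mut pushed = 0usize;
    let mut failed = Vec::new();
    for &(evaluator, score) in scores {
        match sink.push_feedback(&run_id, evaluator, score) {
            Ok(()) => pushed += 1,
            Err(reason) => failed.push(FailedFeedback { evaluator: evaluator.to_string(), reason }),
        }
    }
    if failed.is_empty() {
        Ok(run_id)
    } else {
        Err(LangSmithExportError::PartialFeedback { run_id, pushed, failed })
    }
}

// In-memory fake that rejects feedback for the listed evaluators.
pub struct InMemorySink {
    pub failing: Vec<String>,
    pub accepted: Vec<String>,
}

impl FeedbackSink for InMemorySink {
    fn push_run(&mut self, run_name: &str) -> Result<String, String> {
        Ok(format!("run-{run_name}"))
    }

    fn push_feedback(&mut self, _run_id: &str, evaluator: &str, _score: f64) -> Result<(), String> {
        if self.failing.iter().any(|f| f.as_str() == evaluator) {
            return Err("HTTP 500".to_string());
        }
        self.accepted.push(evaluator.to_string());
        Ok(())
    }
}
```

Because the function keeps nothing between calls, a caller that receives PartialFeedback has everything it needs to decide whether to retry only the failed evaluators.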

Integration

  • T150 [US8] eval/tests/us8_end_to_end_test.rs — same EvalSetResult through each reporter; HTML validates as HTML5, JSON validates against schema, MD validates as CommonMark
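To illustrate the idea of feeding one result through several output formats, here is a dependency-free sketch that renders the same data as JSON and as Markdown. The CaseResult shape, the function names, and the JSON keys are assumptions; the real JsonReporter would follow eval-result.schema.json and would more likely use a serialization crate than the hand-rolled escaping shown here.

```rust
// Hypothetical case shape shared by both renderers.
pub struct CaseResult {
    pub name: String,
    pub passed: bool,
}

// Escapes a string for use inside a JSON string literal.
pub fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for ch in s.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

// Backslash-escapes ASCII punctuation, which CommonMark permits for any
// ASCII punctuation character, so names render literally.
pub fn escape_md(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for ch in s.chars() {
        if ch.is_ascii_punctuation() {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}

pub fn to_json(set_name: &str, cases: &[CaseResult]) -> String {
    let items: Vec<String> = cases
        .iter()
        .map(|c| format!("{{\"name\":\"{}\",\"passed\":{}}}", escape_json(&c.name), c.passed))
        .collect();
    format!("{{\"set\":\"{}\",\"cases\":[{}]}}", escape_json(set_name), items.join(","))
}

// Uses only a heading and a bullet list, both core CommonMark constructs.
pub fn to_markdown(set_name: &str, cases: &[CaseResult]) -> String {
    let mut out = format!("## {}\n\n", escape_md(set_name));
    for c in cases {
        let status = if c.passed { "PASS" } else { "FAIL" };
        out.push_str(&format!("- **{}** {}\n", status, escape_md(&c.name)));
    }
    out
}
```

The Markdown side avoids tables and task lists on purpose, since those are GitHub extensions rather than core CommonMark.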

Acceptance

  • Console output is plain-text only — no ANSI, no cursor control, no interactivity.
  • JSON output is self-contained and validates against the published schema.
  • Markdown output is valid CommonMark, PR-comment-ready.
  • HTML output is a single self-contained file with no external asset dependencies, interactivity via <details>/<summary> only (no mandatory JS).
  • LangSmith export surfaces partial-push failures structurally; no local partial state.
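Two of these criteria lend themselves to cheap string-level checks in tests. The helpers below are a heuristic sketch with hypothetical names; in particular, the external-asset check is a substring scan over a few assumed markers, not an HTML parser, and would need to be stricter in a real test suite.

```rust
// True when the text contains no control characters other than newline and tab.
// This rules out ESC (0x1b), and with it ANSI colour and cursor-control sequences.
pub fn is_plain_text(s: &str) -> bool {
    s.chars().all(|c| c == '\n' || c == '\t' || !c.is_control())
}

// Heuristic: flags common ways an HTML file pulls in external assets.
pub fn references_external_assets(html: &str) -> bool {
    let lower = html.to_ascii_lowercase();
    let needles = ["<link", "<script src", "src=\"http", "url(http", "@import"];
    needles.iter().any(|needle| lower.contains(*needle))
}
```

A console test could assert is_plain_text on the full reporter output, and an HTML test could assert that references_external_assets returns false for the generated file.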

References

  • Spec FR-041, FR-042
  • Clarifications Q8 (rich console dropped)
  • Research R-015 (LangSmith), R-017 (HTML templating)

Depends on

#752 (US1 integration — needs EvalSetResult).

Labels

enhancement (New feature or request), eval, in-progress (Automated agent is working on this), spec (Spec-driven implementation task)
