feat: 16 - manifest by joefrost01 · Pull Request #18 · neural-chilli/dtoo

joefrost01 · 2026-04-11T20:11:44Z

What problem are you trying to solve?

There was no manifest sidecar output for query runs, which made orchestration/audit workflows harder because run metadata (inputs, file outcomes, timing, output rows/fingerprint, and command settings) was not persisted in a machine-readable artifact.

What does this PR change?

This adds ManifestWriter (src/manifest.rs) and integrates manifest emission as the final query step when --manifest is provided. The pipeline now captures run/file/output metadata, writes YAML manifests on success and on runtime failures (including --expect-at-least failures), creates parent directories as needed, and warns (without failing the run) if manifest writing fails.
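The write behaviour described above (create parent directories, warn instead of failing) could look roughly like the sketch below. It uses only the standard library and takes pre-rendered YAML text; the function names `write_manifest` and `write_manifest_or_warn` are illustrative assumptions, not the signatures in src/manifest.rs.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `yaml` to `path`, creating missing parent directories first.
fn write_manifest(path: &Path, yaml: &str) -> io::Result<()> {
    if let Some(parent) = path.parent() {
        // An empty parent means a bare file name in the current directory.
        if !parent.as_os_str().is_empty() {
            fs::create_dir_all(parent)?;
        }
    }
    fs::write(path, yaml)
}

/// Hypothetical wrapper: does nothing when no path was requested, and on
/// failure prints a warning to stderr instead of propagating the error.
/// Returns whether a manifest file was actually written.
fn write_manifest_or_warn(path: Option<&Path>, yaml: &str) -> bool {
    let Some(path) = path else { return false };
    match write_manifest(path, yaml) {
        Ok(()) => true,
        Err(err) => {
            eprintln!("warning: could not write manifest {}: {err}", path.display());
            false
        }
    }
}
```

Because the wrapper returns a plain `bool` rather than a `Result`, the caller has no error to propagate, which is one way to keep the run's exit code independent of manifest I/O.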

Does this change align with DESIGN.md?

Yes. Query execution order remains intact; manifest writing is supplementary and occurs after execution completes/fails. Core stdout/stderr behavior and exit-code semantics are unchanged.

What alternatives did you consider?

I considered writing manifest data inline directly from query_pipeline without a dedicated module, but that scattered serialization concerns and made testing harder. A dedicated manifest module keeps schema/serialization concerns isolated while pipeline code focuses on data collection.

Does this PR contain multiple unrelated changes?

No. All edits are directly related to feature 16 manifest support and required metadata plumbing.

Existing PRs

I have reviewed all open AND closed PRs for duplicates or prior art
Related PRs: none found

Testing

cargo test passes
cargo clippy passes with no warnings
cargo fmt has been run
New tests added:
- pipeline_writes_manifest_on_success
- pipeline_writes_manifest_on_expect_at_least_failure

Evaluation

What was the specific scenario you tested?
- Successful query with --manifest writes manifest YAML with batch/command/files/output/timing fields.
- Query failing --expect-at-least still writes manifest containing failure metadata.
What was the output before and after the change?
- Before: no manifest sidecar was emitted.
- After: manifest YAML is written to --manifest path (creating parent directories), with warnings-only behavior on manifest write errors.
Did you test error cases (bad input, missing files, invalid SQL)?
- Yes. Existing suite still covers those paths, and new tests add explicit failure-case manifest coverage.

Human review

A human has reviewed the COMPLETE proposed diff before submission

joefrost01

Review: Feature 16 — Manifest Sidecar

Reviewed against specs/16-manifest.md, CLAUDE.md, and DESIGN.md.

Spec Compliance

| Requirement | Status |
| --- | --- |
| YAML structure matches spec | ✓ All fields present, correct nesting |
| `batch_id` / `batch_hash` from lineage manager | ✓ |
| `dtoo_version` from `env!("CARGO_PKG_VERSION")` | ✓ |
| `command.*` from CLI args | ✓ |
| `files.*` from pipeline result (total/processed/skipped/details) | ✓ |
| `output.rows` / `output.fingerprint` | ✓ (`None` when `--fingerprint` not used) |
| `timing.*` from pipeline start/end | ✓ |
| Written as final pipeline step (step 17) | ✓ |
| Write-on-failure (`--expect-at-least` etc.) | ✓ (closure pattern ensures the manifest is written after errors) |
| Parent directory creation | ✓ (`create_dir_all` in `ManifestWriter::write`) |
| Warning-only on manifest write errors | ✓ (`eprintln!` warning, does not affect exit code) |
| `batch_id`/`batch_hash` without `--lineage` | ✓ (`LineageManager::new(None, ...)` still generates both) |

Architecture

Clean separation: manifest.rs owns struct definitions and serialization; query_pipeline.rs collects metadata and calls write_manifest_if_requested after the closure returns. The error: Option<String> field (with skip_serializing_if) is a sensible extension for the failure-recording use case.
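For readers unfamiliar with the closure pattern, here is a minimal sketch of the idea: run the pipeline body as a closure, record any error, write the manifest unconditionally, then hand back the original result. `RunSummary` and `run_with_manifest` are hypothetical, reduced stand-ins and not the types in query_pipeline.rs.

```rust
/// Metadata gathered while the pipeline runs (hypothetical, reduced shape).
#[derive(Debug, Default)]
struct RunSummary {
    output_rows: u64,
    error: Option<String>,
}

/// Runs `body`, then always invokes `write_manifest`, then returns the
/// body's result unchanged so exit-code handling is unaffected.
fn run_with_manifest<B, W>(body: B, write_manifest: W) -> Result<(), String>
where
    B: FnOnce(&mut RunSummary) -> Result<(), String>,
    W: FnOnce(&RunSummary),
{
    let mut summary = RunSummary::default();
    let result = body(&mut summary);
    // Record the failure so the manifest can carry it.
    if let Err(err) = &result {
        summary.error = Some(err.clone());
    }
    // Reached on both the success and the failure path.
    write_manifest(&summary);
    result
}
```

Any `?` or early `return` inside `body` only exits the closure, so control always comes back to the manifest step.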

Minor suggestions (non-blocking)

- batch_hash early-failure fallback (query_pipeline.rs:126): The pre-closure fallback uses Uuid::new_v4() for batch_hash. This only matters if the pipeline fails before LineageManager::new (e.g., DuckDB init failure). A deterministic hash would be more semantically correct (it could be computed from the args upfront), but this is a narrow edge case.
- Doc comments on public API (manifest.rs): Per CLAUDE.md, public structs/functions should have doc comments. The lineage.rs additions have them; manifest.rs items do not.
- Duplicate format helpers: manifest::output_format_label duplicates query_pipeline::output_format_to_str. The existing helper could be reused.
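On the batch_hash fallback suggestion, a deterministic value computed from the args upfront might be sketched as follows. FNV-1a is used here only because it needs no dependencies; the hash the lineage manager actually uses is not shown in this thread, and `fallback_batch_hash` is a hypothetical name.

```rust
/// 64-bit FNV-1a over the argument list, rendered as 16 hex digits.
/// Illustrative only: not the hashing scheme used by LineageManager.
fn fallback_batch_hash(args: &[&str]) -> String {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for arg in args {
        // A trailing 0 byte per argument keeps ["ab", "c"] distinct from ["a", "bc"].
        for byte in arg.bytes().chain(std::iter::once(0u8)) {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(PRIME);
        }
    }
    format!("{hash:016x}")
}
```

Unlike Uuid::new_v4(), two runs with identical arguments would get the same fallback value, which is the property the suggestion is after.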

Tests

Two tests cover the key paths:

- pipeline_writes_manifest_on_success: verifies the manifest is written with the expected fields
- pipeline_writes_manifest_on_expect_at_least_failure: verifies the manifest is written with error info on failure

LGTM. Ready to merge once CI passes.

joefrost01 · 2026-04-11T20:16:11Z

Manifest Review — 3 Focus Areas

1. Schema Completeness ✅

The implementation covers all fields from the DESIGN.md example schema and adds reasonable extras (dtoo_version, top-level error, output.format, expanded command fields). These are sensible additions for auditability. No fields from the spec are missing.

2. Write-on-Failure Behavior — Bug 🐛

The write_manifest_if_requested call at line 409 fires after the inner closure regardless of success/failure — good design. The --expect-at-least failure path and PartialFailure paths work correctly because they occur after lines 274-275 set summary.files_processed and summary.files_skipped.

However, the --on-error fail file-error path (lines 253-254) returns Err(err) from the closure before reaching lines 274-275. This means the manifest is written with:

files:
  total: 5          # correct (set at line 140)
  processed: 0      # wrong — defaults, never updated
  skipped: 0        # wrong — defaults, never updated
  details:          # has real entries pushed at lines 238-243 before the early return
    - path: "file1.csv"
      rows_matched: 100
      status: ok

The processed/skipped summary counts are inconsistent with the details array. Fix: either update summary.files_processed/summary.files_skipped incrementally inside the loop (not after it), or set them from file_details.len() in write_manifest_if_requested.
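The first option (incremental updates) could look like the sketch below. `FileSummary`, `OnError`, and `process_files` are hypothetical simplifications of the real pipeline types. The "error" status label and counting the failing file as skipped are assumptions; the spec may define those differently.

```rust
#[derive(Debug, Default)]
struct FileSummary {
    total: usize,
    processed: usize,
    skipped: usize,
    details: Vec<(String, &'static str)>, // (path, status)
}

#[derive(Clone, Copy)]
enum OnError {
    Skip,
    Fail,
}

/// `files` pairs each path with the outcome of reading it (rows matched or an error).
fn process_files(
    files: &[(&str, Result<u64, String>)],
    on_error: OnError,
    summary: &mut FileSummary,
) -> Result<(), String> {
    summary.total = files.len();
    for (path, outcome) in files {
        match outcome {
            Ok(_rows) => {
                // Count and detail entry are updated together.
                summary.processed += 1;
                summary.details.push((path.to_string(), "ok"));
            }
            Err(err) => {
                summary.skipped += 1;
                summary.details.push((path.to_string(), "error"));
                if let OnError::Fail = on_error {
                    // Early return: the counts already match `details`.
                    return Err(err.clone());
                }
            }
        }
    }
    Ok(())
}
```

With this shape, `processed + skipped` equals `details.len()` at every exit point, so the manifest stays self-consistent no matter where the closure returns.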

This path also lacks test coverage — only --expect-at-least failure has a manifest test. A test with --on-error fail + a corrupt file + --manifest would catch this.

3. Warning-Only Manifest Write Failures ✅

Lines 729-734 catch ManifestWriter::write errors, print to stderr, and do not propagate — correctly preserving the pipeline's original exit code. Clean implementation.

Minor

ManifestWriter is a unit struct with a single static method — could be a free function (like build_command already is). Not blocking, just a consistency note.

TL;DR: One real bug — files.processed/files.skipped are wrong in the manifest when --on-error fail triggers a file-level error. The fix is straightforward (update summary counts incrementally). Everything else looks solid.

feat: 16 - write YAML manifest sidecar

588d198

joefrost01 merged commit b78c9bf into main Apr 11, 2026
6 checks passed

joefrost01 deleted the feat/16-manifest branch April 11, 2026 20:30
