feat: 15 - verbose mode by joefrost01 · Pull Request #17 · neural-chilli/dtoo

joefrost01 · 2026-04-11T19:17:37Z

What problem are you trying to solve?

--verbose existed as a flag but did not emit structured progress logs across pipeline stages. Users could not observe stage timing, per-file outcomes, or row-count progression without adding ad-hoc prints.

What does this PR change?

This adds a timestamped verbose logger in the query pipeline and logs key execution milestones to stderr (start, file resolution, schema mode, ref loads, per-file success/skip, accumulation, post-sql, masking, lineage, limit, validation, count, output/profile/fingerprint, done). It also adds thousands-separator row formatting and tests for number formatting.

Does this change align with DESIGN.md?

Yes. Output data paths remain unchanged, and verbose logs stay on stderr so stdout output remains pipe-safe. Core execution order and semantics are preserved.

What alternatives did you consider?

I considered introducing a separate logging module and dependency-heavy progress UI first, but that added complexity before behavior was validated. A focused in-pipeline logger with clear timestamp formatting delivered the spec behavior with minimal risk.

Does this PR contain multiple unrelated changes?

No. All changes are scoped to verbose mode behavior for feature 15.

Existing PRs

I have reviewed all open AND closed PRs for duplicates or prior art
Related PRs: none found

Testing

cargo test passes
cargo clippy passes with no warnings
cargo fmt has been run
New tests added:
- format_count_inserts_thousands_separators

Evaluation

What was the specific scenario you tested?
- End-to-end query pipeline tests (including dry-run paths) with verbose logging enabled in code path
- Row-count formatting behavior for 0, small, and thousands-sized values
What was the output before and after the change?
- Before: minimal/no structured verbose progress output.
- After: timestamped stderr progress logs across major pipeline stages with formatted counts.
Did you test error cases (bad input, missing files, invalid SQL)?
- Yes, existing suite still covers those paths and now runs with the verbose instrumentation present.

Human review

A human has reviewed the COMPLETE proposed diff before submission

joefrost01

Review of feat: 15 - verbose mode

Overall this is solid work — the VerboseLogger struct is clean, stderr-only output is correct, timestamp format matches spec, format_count is well-tested, and stdout data output is untouched. However, there's one issue that needs fixing before merge:

1. "Accumulation complete" is logged in the wrong position (must fix)

File: src/query_pipeline.rs:274-278

The "Accumulation complete" log appears after post-sql, masking, lineage, and limit have all been applied. Per the spec, it should appear immediately after the per-file processing loop and before post-sql.

The spec example makes this clear — the counts are different:

[00:12.4] Accumulation complete: 48,721 rows    ← pre-transformation total
[00:12.5] Applying post-sql...
[00:12.8] Post-sql complete: 1,204 rows          ← post-transformation total

Currently the code queries SELECT COUNT(*) FROM temp_results after limit/post-sql/etc. have already mutated the table, so "Accumulation complete" would report e.g. 1,204 instead of 48,721. This is semantically misleading.

Fix: Move the row_count query + "Accumulation complete" log to immediately after the file processing loop (after the processed == 0 check, before the post_sql block). Keep the existing later row_count query for validation/output where it is.

2. Hyphens instead of em-dashes (minor)

The spec consistently uses em-dashes (—) in log messages:

Loading ref table: regions (ref/regions.parquet) — 250 rows
[1/47] data/2024-01/sales.csv — 1,423 rows matched

The implementation uses regular hyphens (-). See the format! calls in the ref-table loop (~line 163) and per-file success/skip logging (~lines 209, 224).

3. Missing "Exclude" log point (minor, can defer)

The spec lists Excluded {n} files (only if excludes matched) but no such log exists. If the exclude count isn't readily available from resolve_files, this could be a follow-up.

Everything else checks out: pipe safety, dry-run early exit before logging, per-file [i/total] indexing, conditional logging for masking/lineage/limit/validation, and the "Done" summary. Nice format_count implementation with tests.

Not LGTM yet — issue #1 needs to be addressed. After that fix (and the optional em-dash cleanup), this is ready to merge pending CI.

joefrost01

Review: `feat/15-verbose-mode` vs `specs/15-verbose-mode.md`

Overall this is solid work — the VerboseLogger struct, timestamp format, format_count, and pipeline integration are all clean. A few spec deviations need fixing before merge:

1. Em dash vs hyphen in log messages

The spec uses — (em dash) as the separator in several log messages. The implementation uses - (hyphen) instead.

Spec:

[00:00.1] Loading ref table: regions (ref/regions.parquet) — 250 rows
[00:00.2] [1/47] data/2024-01/sales.csv — 1,423 rows matched
[00:00.4] [3/47] data/2024-02/sales.csv — SKIPPED (...)

Actual (in diff):

"Loading ref table: {} ({}) - {} rows"
"[{}/{}] {} - {} rows matched"
"[{}/{}] {} - SKIPPED ({})"

These should use — (em dash) to match the spec format.

2. Missing "Exclude" log point

The spec requires:

Excluded {n} files (only if excludes matched)

This log point is absent from the implementation. FileResolver::resolve() currently consumes the exclude count internally, so surfacing it may require returning metadata from the resolver. If this is deferred, it should be tracked as a follow-up.

3. `indicatif` dependency

The spec lists indicatif = "0.17" under Dependencies. It isn't added in this PR. The spec notes it's for future progress bar use, so this may be intentional — but worth confirming whether it should be added now per the spec or deferred.

Everything else looks correct: timestamp format, all other log points present, thousands-separator formatting, pipe safety (stderr only), closure refactor to capture row counts, and the format_count test. Nice work on keeping the behavioral changes minimal.

Verdict: Not ready to merge. The em dash formatting is a clear spec deviation that should be fixed. The exclude log point and indicatif dependency should be addressed or explicitly deferred.

joefrost01

Re-review: All prior findings resolved ✓

Verified the latest commit (74bf7c2) addresses all three issues from the prior review:

Em-dash format — Log messages now use — (em dash) matching the spec (query_pipeline.rs:167,218,231)
Excluded files log — "Excluded {n} files" is now emitted when excludes match (query_pipeline.rs:131)
indicatif dependency — indicatif = "0.17" added to Cargo.toml

LGTM — ready to merge once CI passes.

feat: 15 - verbose pipeline progress logging

c842237

joefrost01 commented Apr 11, 2026

View reviewed changes

fix: align verbose logs with spec and report excluded files

74bf7c2

joefrost01 commented Apr 11, 2026

View reviewed changes

joefrost01 merged commit 5ed8782 into main Apr 11, 2026
6 checks passed

joefrost01 deleted the feat/15-verbose-mode branch April 11, 2026 20:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: 15 - verbose mode#17

feat: 15 - verbose mode#17
joefrost01 merged 2 commits into
mainfrom
feat/15-verbose-mode

joefrost01 commented Apr 11, 2026

Uh oh!

joefrost01 left a comment

Uh oh!

joefrost01 left a comment

Uh oh!

joefrost01 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

joefrost01 commented Apr 11, 2026

What problem are you trying to solve?

What does this PR change?

Does this change align with DESIGN.md?

What alternatives did you consider?

Does this PR contain multiple unrelated changes?

Existing PRs

Testing

Evaluation

Human review

Uh oh!

joefrost01 left a comment

Choose a reason for hiding this comment

Review of feat: 15 - verbose mode

1. "Accumulation complete" is logged in the wrong position (must fix)

2. Hyphens instead of em-dashes (minor)

3. Missing "Exclude" log point (minor, can defer)

Uh oh!

joefrost01 left a comment

Choose a reason for hiding this comment

Review: feat/15-verbose-mode vs specs/15-verbose-mode.md

1. Em dash vs hyphen in log messages

2. Missing "Exclude" log point

3. indicatif dependency

Uh oh!

joefrost01 left a comment

Choose a reason for hiding this comment

Re-review: All prior findings resolved ✓

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Review: `feat/15-verbose-mode` vs `specs/15-verbose-mode.md`

3. `indicatif` dependency