Isolate DAG orchestration logs from job logs#144
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e0d3f7244b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if not m: | ||
| continue | ||
| job_name, ts_str = m.group(1), m.group(2) | ||
| # Skip DAG orchestration logs only when the -dag suffix is not itself a | ||
| # real job name — a job named e.g. "standup-dag" should not be excluded. | ||
| if job_name.endswith("-dag") and job_name not in known_jobs: | ||
| continue | ||
| if name and job_name != name: |
There was a problem hiding this comment.
Preserve legacy DAG-log exclusion in cost aggregation
This loop now aggregates every top-level *-<ts>-<pid>.log under ~/.clauck, so users who upgrade with pre-change orchestration logs like <root>-dag-...log will have those legacy DAG runs counted as normal jobs (e.g., pipe-dag) in clauck cost. Before this commit those files were explicitly excluded unless -dag was a real job name, so this is a correctness regression for historical spend totals on existing installations.
Useful? React with 👍 / 👎.
| pass | ||
|
|
||
| # Glob all log files, sorted newest first by mtime | ||
| logs = sorted(JOBS_DIR.glob("*-[0-9]*.log"), key=lambda p: p.stat().st_mtime, reverse=True) |
There was a problem hiding this comment.
Keep legacy DAG logs out of history listings
cmd_history now scans all top-level logs in JOBS_DIR without filtering out legacy orchestration files (<root>-dag-<ts>-<pid>.log), so upgraded users can see old DAG coordinator runs reported as regular job invocations. This reintroduces ambiguous/misleading history entries for pre-migration data even though the command previously had explicit exclusion logic for these files.
Useful? React with 👍 / 👎.
Non-technical summary
This fixes a log-accounting bug where DAG orchestration runs could be mistaken for real job runs when a user had a job whose name ended in
-dag. DAG runner logs now live in their own hidden directory, soclauck history,clauck cost,clauck logs --follow, andclauck peekstop guessing from ambiguous filenames and report the right job data.Why this matters now: issue #121 describes a user-visible correctness problem in spend/history reporting, and the current code was still relying on filename heuristics to paper over the collision.
The system is better afterward because DAG coordination logs are now structurally separated from per-job execution logs instead of being filtered by inference.
Technical summary
DAG_LOGS_DIR = ~/.clauck/.dag-logsin bothlib/clauckandlib/dag-runner.py.DagLoggerto write orchestration logs into that dedicated directory with<root>-<ts>-<pid>.lognames.-dagexclusion logic fromcmd_costandcmd_history; those commands now read only real job logs fromJOBS_DIR.clauck logs --follow,_active_dag_log(),clauck peek, and the doctor diagnostic guidance to use the dedicated DAG log directory.*-dagjob-name case plus dedicated DAG-log-dir behavior in:tests/test_clauck_history_cost.pytests/test_clauck_logs.pytests/test_dag_runner.pyRelevant intent: this preserves inspectability while making the log surface unambiguous, which is a better fit for clauck’s local-runtime contract than trying to infer log type from colliding filenames.
Tests:
python3 -m unittest -v tests.test_dag_runner tests.test_clauck_logs tests.test_clauck_history_costpython3 -m unittest discover testsBreaking changes: none intended. Existing job logs stay where they are. New DAG orchestration logs will be written under
~/.clauck/.dag-logs/instead of the top-level~/.clauck/directory.Additional notes
Trade-off: this changes the on-disk location for new DAG orchestration logs, but it removes ambiguity across all log-consuming surfaces instead of adding another special-case parser rule.
Deferred follow-up: no migration is included for old DAG logs already written in the root job directory. The fix is intentionally forward-only to keep the increment small and low-risk.
Remaining gap: any historical top-level DAG orchestration logs created before this change remain wherever they were originally written, but new runs stop producing ambiguous filenames.