Skip to content

feat: add reusable thread lifecycle pipeline#359

Merged
rianjs merged 32 commits into
mainfrom
feat/thread-lifecycle-architecture
Jun 27, 2026
Merged

feat: add reusable thread lifecycle pipeline#359
rianjs merged 32 commits into
mainfrom
feat/thread-lifecycle-architecture

Conversation

@rianjs

@rianjs rianjs commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Summary

Decomposes inline discussion handling into reusable thread lifecycle building blocks for both full cr review and the new cr respond command.

  • adds shared thread context normalization, resolved-thread summary compaction, and CR-authored thread detection
  • adds shared thread analysis and planned-action projection so replies/resolutions flow through reviewplan, ledger, and outbox
  • adds shared stage model resolution and LLM-run persistence contracts with architecture docs and guardrails
  • wires full review to analyze inline thread lifecycle before reviewer agents and feed compact context into prompts
  • adds cr respond for reusable thread response planning and retryable posting semantics
  • preserves safe JSON unknown-field names in structured-output retry prompts, which the local dry run needed to recover from reviewer schema validation

Validation

  • rtk go test ./internal/threadrespond ./internal/cmd/reviewcmd
  • rtk go test ./internal/pipeline
  • rtk go test ./internal/cmd/...
  • rtk go test ./internal/llm
  • rtk go test ./... (2438 passed in 49 packages)
  • rtk go build ./cmd/cr
  • local build dry run: /tmp/cr-thread-lifecycle review https://github.com/open-cli-collective/codereview-cli/pull/359 --dry-run --max-agents 3 --allow-self-review --json --verbose

Dry-run Result

Run d09716b5-8129-4668-89f0-f8675ce768b4 completed with outcome: dry_run against head 5da53b22aaec.

  • selected reviewers: structure:repo-health, go:implementation-tests, policies:conventions
  • reviewer count respected --max-agents 3
  • planned actions stayed planned_only: 5 inline comments, 1 rollup comment, 1 submit-review action
  • thread lifecycle context: 0 discussion threads considered, 0 summarized, 0 resolved for this PR
  • wall time: 12m16s; estimated cost: ~$1.99

Notes

This PR intentionally keeps provider writes on the existing durable reviewplan/ledger/outbox path.

Closes #326
Closes #369
Fixes #155

@rianjs rianjs changed the title [codex] add reusable thread lifecycle pipeline feat: add reusable thread lifecycle pipeline Jun 23, 2026
@rianjs rianjs force-pushed the feat/thread-lifecycle-architecture branch from 70ee83c to bbb4e15 Compare June 26, 2026 10:43
@rianjs

rianjs commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Findings

  • Major: the new dry-run rerun coverage proves --rerun gets a fresh RunID, but it never asserts a fresh artifact root. TestDryRunRerunBypassesIncompleteRunAttempt would still pass if the implementation reused the interrupted run’s artifact tree while allocating run-fresh, which is the exact regression the ticket calls out. See internal/pipeline/pipeline_test.go.
  • Minor: there is no CLI-level smoke test for cr review --dry-run --rerun; the only --rerun command assertion is on the live path. That leaves the flag plumbing for the dry-run recovery path unproven at the wrapper layer. See internal/cmd/reviewcmd/reviewcmd_test.go.
  • Minor: cr respond is only exercised through a fake runtime/responder in the command tests, so a wiring regression between the command surface and the new threadrespond pipeline could still slip through even if the lower-level package tests stay green. See internal/cmd/reviewcmd/reviewcmd_test.go.

Comment thread docs/architecture.md Outdated
artifacts, session persistence, progress breadcrumbs, and failure
classification.

The final commit marker for a task is `llm-tasks/<task>/metadata.json`. Writers

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The durable-path example is off by one important detail: task directories are created under an encoded task ID (llm-tasks/<encoded-task-id>/metadata.json), not the raw <task> string. The implementation uses path-component encoding before writing metadata, and docs/llm-task-artifacts.md already documents the encoded form. Update this line to use the encoded-task wording so maintainers do not construct or inspect the wrong path shape.

Reply inline to this comment.

if err != nil {
return preparedSelectionContext{}, err
}
if strings.TrimSpace(req.PostingIdentity.ID) != "" || strings.TrimSpace(req.PostingIdentity.Login) != "" {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SelectionOnly now silently drops normalized thread context whenever PostingIdentity is unset, even though SelectionRequest grew that field specifically to drive the new thread-lifecycle prompt input. That means existing callers can keep compiling while benchmarking or other selection-only flows quietly stop exercising the same selector inputs as cr review, so the changed behavior is no longer proven by the same code path. Make this contract explicit instead of degrading silently: either require PostingIdentity for selection-only runs that load inline threads, or add a focused test that covers the no-identity branch and update callers to pass the resolved posting identity so selection-only continues to match full-review behavior.

Reply inline to this comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in later branch revisions: selection-only now has the posting identity contract covered so thread lifecycle context does not silently disappear from selector inputs.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The later branch revisions cover the posting-identity contract for selection-only runs, so inline thread lifecycle context no longer drops out of selector inputs silently.

Summary:
Selection-only now enforces or covers the posting-identity contract so normalized inline thread context remains present in selector inputs, keeping selection-only behavior aligned with full review flows.

Comment thread docs/architecture.md Outdated
New LLM-backed components must not call provider adapters or `internal/llm`
structured helpers directly. They should call `llmlifecycle` through explicit,
fakeable dependencies in unit tests and should return domain results rather
than posting comments or mutating provider state. This guardrail is enforced by

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. That test only fails on direct calls to llm.RunStructured*; it does not detect new components calling provider adapters directly. Reword the sentence to say the test enforces the internal/llm structured-helper boundary, or add that direct-adapter prohibition is a convention not fully covered by the guardrail test.

Reply inline to this comment.

Comment thread internal/cmd/reviewcmd/reviewcmd.go Outdated
cmd.Flags().BoolVar(&flags.allowSelfApprove, "allow-self-approve", false, "Allow approval when posting identity is the PR author")
cmd.Flags().BoolVar(&flags.noResolveThreads, "no-resolve-threads", false, "Do not plan thread-resolution actions")
rootCmd.AddCommand(cmd)
rootCmd.AddCommand(newRespondCommand(opts, factory))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repo’s command layout is organized as one top-level command package per internal/cmd/* directory, but this change introduces a new top-level respond command by folding it into reviewcmd (newRespondCommand, runRespond, ResponseRunner, extra rendering helpers, and tests) instead of giving it its own internal/cmd/respondcmd boundary. That makes the command tree less discoverable from the package map in docs/development.md, and it couples future respond-specific flags/runtime/output work to the already large reviewcmd package, which is the kind of ownership drift that gets harder to unwind after more command-specific behavior lands. Split respond into its own command package and keep only shared review/respond bootstrap helpers in a common seam so the repo’s internal/cmd/* structure continues to reflect the user-facing command surface.

Reply inline to this comment.

@rianjs-bot

rianjs-bot Bot commented Jun 26, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: 994639f56055
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 1
structure:repo-health 1
policies:conventions 0
documentation:docs 2
go:implementation-tests (1 finding)

Minor - internal/pipeline/pipeline.go:1052

SelectionOnly now silently drops normalized thread context whenever PostingIdentity is unset, even though SelectionRequest grew that field specifically to drive the new thread-lifecycle prompt input. That means existing callers can keep compiling while benchmarking or other selection-only flows quietly stop exercising the same selector inputs as cr review, so the changed behavior is no longer proven by the same code path. Make this contract explicit instead of degrading silently: either require PostingIdentity for selection-only runs that load inline threads, or add a focused test that covers the no-identity branch and update callers to pass the resolved posting identity so selection-only continues to match full-review behavior.

structure:repo-health (1 finding)

Major - internal/cmd/reviewcmd/reviewcmd.go:189

The repo’s command layout is organized as one top-level command package per internal/cmd/* directory, but this change introduces a new top-level respond command by folding it into reviewcmd (newRespondCommand, runRespond, ResponseRunner, extra rendering helpers, and tests) instead of giving it its own internal/cmd/respondcmd boundary. That makes the command tree less discoverable from the package map in docs/development.md, and it couples future respond-specific flags/runtime/output work to the already large reviewcmd package, which is the kind of ownership drift that gets harder to unwind after more command-specific behavior lands. Split respond into its own command package and keep only shared review/respond bootstrap helpers in a common seam so the repo’s internal/cmd/* structure continues to reflect the user-facing command surface.

documentation:docs (2 findings)

Minor - docs/architecture.md:16

The durable-path example is off by one important detail: task directories are created under an encoded task ID (llm-tasks/<encoded-task-id>/metadata.json), not the raw <task> string. The implementation uses path-component encoding before writing metadata, and docs/llm-task-artifacts.md already documents the encoded form. Update this line to use the encoded-task wording so maintainers do not construct or inspect the wrong path shape.

Minor - docs/architecture.md:25

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. That test only fails on direct calls to llm.RunStructured*; it does not detect new components calling provider adapters directly. Reword the sentence to say the test enforces the internal/llm structured-helper boundary, or add that direct-adapter prohibition is a convention not fully covered by the guardrail test.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests incomplete_skipped internal/approvaloverride/approvaloverride.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gateio/gateio.go, internal/ledger/ledger.go, internal/ledger/ledger_test.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/pipeline/pipeline_test.go, internal/plannedactions/plannedactions.go, internal/review/review.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext.go, internal/threadcontext/threadcontext_test.go, internal/threadrespond/threadrespond.go, internal/threadrespond/threadrespond_test.go internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/configcmd/configcmd.go, internal/cmd/datacmd/datacmd.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle_test.go, internal/plannedactions/plannedactions_test.go, internal/review/review_test.go, internal/reviewplan/summary.go, internal/stagemodel/resolver_test.go Review scope was intentionally narrowed to the core thread-lifecycle, pipeline, ledger, model-resolution, and command-wiring changes.; Static review only; I did not execute the Go test suite in this read-only workbench.
structure:repo-health incomplete_skipped docs/development.md, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver.go, internal/stagemodel/resolver_test.go internal/gateio/gateio.go, internal/llmlifecycle/lifecycle_test.go, internal/pipeline/pipeline_test.go Review scoped to the assigned files, with emphasis on long-term command/package boundaries and durable lifecycle contracts rather than exhaustive behavioral validation.
policies:conventions complete_broad docs/development.md, internal/cmd/configcmd/configcmd.go, internal/cmd/datacmd/datacmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/ledger/ledger.go, internal/ledger/ledger_test.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/reviewplan/summary.go unavailable Shared source-of-truth convenience copies were not present at ../cli-common/docs or ../.github, so this review was limited to repo-local guidance and the assigned changed files.
documentation:docs complete_broad docs/architecture.md, docs/development.md, docs/llm-task-artifacts.md unavailable Review scope was limited to the assigned documentation files, with repo/code reads only to verify factual claims and referenced paths/commands.; Workspace was read-only, so I could not execute the documented workflows end-to-end; findings are based on source inspection.

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 6m 13s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, structure:repo-health, policies:conventions, documentation:docs
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 6m 13s wall · 12m 17s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 24s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 5m 13s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 20s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 1m 33s
documentation:docs gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 38s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 6s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

@rianjs-bot

rianjs-bot Bot commented Jun 26, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: 3ec58bdf8fc2
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 1
structure:repo-health 1
policies:conventions 1
documentation:docs 2
go:implementation-tests (1 finding)

Major - internal/pipeline/pipeline.go:924

findIncompleteDryRun now resumes any incomplete dry-run that matches PR/head/base/profile/posting identity, but it does not distinguish full review runs from the new threadrespond dry-runs. threadrespond explicitly writes and checks a thread-response-run.json marker before resuming its own runs, so an interrupted cr respond --dry-run can be picked up by the next cr review --dry-run here, reusing the wrong artifact tree and run ID and mixing response-only state into the review pipeline. Filter review resumes to review-owned runs as well, for example by adding a complementary review marker or at least rejecting artifact roots that contain the response marker, and add a regression test that a pending response dry-run is not resumed by pipeline.DryRun.

structure:repo-health (1 finding)

Major - internal/pipeline/pipeline.go:30

SelectionRequest now carries PostingIdentity so selector-only runs can normalize thread lifecycle context, but validateSelectionOnly still treats that field as optional while prepareSelectionContext silently skips normalization when it is empty at lines 1052-1057. Existing callers such as benchmark select still construct SelectionRequest without a posting identity, so they now lose discussion context instead of failing, which makes selector benchmarks diverge from real cr review behavior and hides the contract change from future callers. Make PostingIdentity required for SelectionOnly, plumb it through every caller, and add a regression test that empty identities are rejected rather than dropping thread context.

policies:conventions (1 finding)

Minor - internal/cmd/reviewcmd/reviewcmd.go:189

This adds a new top-level cr respond command from inside reviewcmd, which drifts from the repo’s visible command layout convention: cmd/cr/main.go registers one internal/cmd/*cmd package per top-level command (configcmd, datacmd, reviewcmd, etc.). Keeping respond nested in reviewcmd makes the package name stop matching the command surface and sets an awkward precedent for future top-level commands. The smallest policy-aligned fix is to move the respond Cobra wiring into a new internal/cmd/respondcmd package and register it alongside the other top-level command packages, while leaving shared runtime helpers in a lower-level shared package if needed.

documentation:docs (2 findings)

Minor - docs/architecture.md:16

The durable-path example is inaccurate: task directories are not keyed by the raw task ID, they are created under llm-tasks/<encoded-task-id>/... via statepaths.Encode/llmlifecycle.encodePathComponent. Readers using this path to inspect artifacts will look in the wrong directory for task IDs containing :, /, or other encoded characters. Update the example to say llm-tasks/<encoded-task-id>/metadata.json (or explicitly call out that the task ID is percent-encoded in the directory name).

Minor - docs/architecture.md:25

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. The test only fails direct internal/llm structured-helper calls outside the allowed packages; it does not detect new components calling provider adapters directly. As written, the doc claims a broader guardrail than the repo has. Either narrow the sentence to the internal/llm helper restriction, or add a separate architecture test that also bans direct provider-adapter invocation outside the approved boundary.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests incomplete_skipped internal/cmd/reviewcmd/reviewcmd.go, internal/ledger/ledger.go, internal/outbox/outbox.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/threadrespond/threadrespond.go internal/approvaloverride/approvaloverride.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/review/review.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go Filesystem is read-only in this environment, so I inspected code and tests but did not run the Go test suite.; Scope intentionally limited to the assigned Go implementation files and nearby tests needed to validate behavior.
structure:repo-health complete_broad docs/development.md, internal/cmd/reviewcmd/reviewcmd.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go, internal/threadrespond/threadrespond.go unavailable Review scoped to the assigned files and structural risks; I did not audit the rest of the PR.
policies:conventions complete_broad docs/development.md, internal/approvaloverride/approvaloverride.go, internal/cmd/configcmd/configcmd.go, internal/cmd/datacmd/datacmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/reviewplan/summary.go unavailable Reviewed only the assigned files.; Shared CLI source-of-truth docs were not present in ../cli-common/docs, so findings are limited to conventions visible in this repo and review context.
documentation:docs complete_broad docs/architecture.md, docs/development.md, docs/llm-task-artifacts.md unavailable Review limited to the three assigned documentation files and local repository context; no network-based verification of external source-of-truth links.
unassigned incomplete_unassigned unavailable internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/reviewcmd/adapter_progress.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gateio/gateio.go, internal/gitprovider/fake_test.go, internal/gitprovider/github/client.go, internal/gitprovider/github/client_test.go, internal/gitprovider/github/rest_writes.go, internal/gitprovider/github/rest_writes_test.go, internal/gitprovider/types.go, internal/gitprovider/types_test.go, internal/ledger/ledger_test.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle_test.go, internal/outbox/outbox_test.go, internal/pipeline/pipeline_test.go, internal/plannedactions/plannedactions.go, internal/plannedactions/plannedactions_test.go, internal/review/review_test.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver_test.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext_test.go, internal/threadrespond/threadrespond_test.go changed files were not assigned to a selected reviewer

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 4m 25s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, structure:repo-health, policies:conventions, documentation:docs
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 4m 25s wall · 11m 52s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 19s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 19s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 58s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 1m 48s
documentation:docs gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 07s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 18s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

@@ -30,13 +30,17 @@ import (
"github.com/open-cli-collective/codereview-cli/internal/gitprovider"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File-level note: internal/pipeline/pipeline.go

SelectionRequest now carries PostingIdentity so selector-only runs can normalize thread lifecycle context, but validateSelectionOnly still treats that field as optional while prepareSelectionContext silently skips normalization when it is empty at lines 1052-1057. Existing callers such as benchmark select still construct SelectionRequest without a posting identity, so they now lose discussion context instead of failing, which makes selector benchmarks diverge from real cr review behavior and hides the contract change from future callers. Make PostingIdentity required for SelectionOnly, plumb it through every caller, and add a regression test that empty identities are rejected rather than dropping thread context.

Reply inline to this comment.

Comment thread docs/architecture.md Outdated
artifacts, session persistence, progress breadcrumbs, and failure
classification.

The final commit marker for a task is `llm-tasks/<task>/metadata.json`. Writers

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The durable-path example is inaccurate: task directories are not keyed by the raw task ID, they are created under llm-tasks/<encoded-task-id>/... via statepaths.Encode/llmlifecycle.encodePathComponent. Readers using this path to inspect artifacts will look in the wrong directory for task IDs containing :, /, or other encoded characters. Update the example to say llm-tasks/<encoded-task-id>/metadata.json (or explicitly call out that the task ID is percent-encoded in the directory name).

Reply inline to this comment.

Comment thread internal/cmd/reviewcmd/reviewcmd.go Outdated
cmd.Flags().BoolVar(&flags.allowSelfApprove, "allow-self-approve", false, "Allow approval when posting identity is the PR author")
cmd.Flags().BoolVar(&flags.noResolveThreads, "no-resolve-threads", false, "Do not plan thread-resolution actions")
rootCmd.AddCommand(cmd)
rootCmd.AddCommand(newRespondCommand(opts, factory))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a new top-level cr respond command from inside reviewcmd, which drifts from the repo’s visible command layout convention: cmd/cr/main.go registers one internal/cmd/*cmd package per top-level command (configcmd, datacmd, reviewcmd, etc.). Keeping respond nested in reviewcmd makes the package name stop matching the command surface and sets an awkward precedent for future top-level commands. The smallest policy-aligned fix is to move the respond Cobra wiring into a new internal/cmd/respondcmd package and register it alongside the other top-level command packages, while leaving shared runtime helpers in a lower-level shared package if needed.

Reply inline to this comment.

Comment thread docs/architecture.md Outdated
New LLM-backed components must not call provider adapters or `internal/llm`
structured helpers directly. They should call `llmlifecycle` through explicit,
fakeable dependencies in unit tests and should return domain results rather
than posting comments or mutating provider state. This guardrail is enforced by

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overstates what internal/architecture/llm_lifecycle_test.go actually enforces. The test only fails direct internal/llm structured-helper calls outside the allowed packages; it does not detect new components calling provider adapters directly. As written, the doc claims a broader guardrail than the repo has. Either narrow the sentence to the internal/llm helper restriction, or add a separate architecture test that also bans direct provider-adapter invocation outside the approved boundary.

Reply inline to this comment.

postingIdentity := postingKey(req.PostingIdentity)
var best ledger.Run
found := false
for _, run := range runs {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

findIncompleteDryRun now resumes any incomplete dry-run that matches PR/head/base/profile/posting identity, but it does not distinguish full review runs from the new threadrespond dry-runs. threadrespond explicitly writes and checks a thread-response-run.json marker before resuming its own runs, so an interrupted cr respond --dry-run can be picked up by the next cr review --dry-run here, reusing the wrong artifact tree and run ID and mixing response-only state into the review pipeline. Filter review resumes to review-owned runs as well, for example by adding a complementary review marker or at least rejecting artifact roots that contain the response marker, and add a regression test that a pending response dry-run is not resumed by pipeline.DryRun.

Reply inline to this comment.

@rianjs rianjs force-pushed the feat/thread-lifecycle-architecture branch from 3ec58bd to 7488454 Compare June 26, 2026 15:46
@rianjs rianjs force-pushed the feat/thread-lifecycle-architecture branch from 7488454 to 34d7a6d Compare June 26, 2026 15:49

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

Comment thread README.md
`host + namespace` routes, omitted `repos` means all repos in that namespace,
and no match fails with an actionable error asking for `--profile` or a route.
Host matching is case-insensitive after normalization, while namespace and repo
matching are case-sensitive after trimming whitespace. An explicit `--profile`

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File-level note: README.md

This config example still documents the pre-refactor profile shape (profiles.<name>.reviewer_credentials plus inline llm:), but the current schema now persists reusable repository_access, reviewer_entities, and llm_runtimes, with profiles pointing at them via repository_access, reviewer, and llm_runtime (internal/config/config.go). That means a reader who copies this template will end up with config that does not match what cr now saves, and the inline llm: block is actively rejected by config.Load's known-fields parsing (internal/config/config.go:687-690, internal/config/config_test.go:130-152). Rewrite this section to show the current saved YAML shape instead of the legacy reviewer_credentials/inline-llm model.

Reply inline to this comment.

Comment thread README.md
`host + namespace` routes, omitted `repos` means all repos in that namespace,
and no match fails with an actionable error asking for `--profile` or a route.
Host matching is case-insensitive after normalization, while namespace and repo
matching are case-sensitive after trimming whitespace. An explicit `--profile`

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File-level note: README.md

The README never documents the new cr respond command even though this PR adds it (internal/cmd/reviewcmd/reviewcmd.go:190-214) and the new architecture/artifact docs already refer to it. As written, the primary CLI reference leaves readers without the basic invocation, dry-run/live semantics, --rerun, or the important difference that cr respond --retry-posts takes a response run ID rather than acting like cr review --retry-posts (internal/cmd/reviewcmd/reviewcmd.go:430-487). Add a cr respond workflow example and a command-reference section so maintainers are not blocked or misled when trying to use the new feature.

Reply inline to this comment.

Comment thread internal/pipeline/pipeline.go Outdated
if err != nil {
return ledger.Run{}, false, err
}
runs, err := store.ListRuns(ctx)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete dry-run reuse needs a durable run-kind boundary before it can be considered safe. findIncompleteDryRun currently resumes any incomplete dry_run row that matches PR/head/base/profile/posting identity, but the new cr respond flow now creates rows in that same ledger scope and distinguishes them only with its artifact marker (threadrespond.findIncompleteRun checks thread-response-run.json). If a response-only run is interrupted, a later cr review --dry-run can adopt that run ID and artifact directory, then append review findings/actions into a response run and reuse the wrong task artifacts. That mixes two different durable contracts in one run. Fix this by giving review runs the same kind of explicit discriminator before resume (for example a review-run marker or a durable run_kind column), switching the lookup to the scoped ListRunsForHeadScope query, and adding a regression test for respond --dry-run interrupted before completion followed by review --dry-run.

Reply inline to this comment.

@rianjs-bot

rianjs-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: 54c74b080d3a
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 2
structure:repo-health 2
policies:conventions 1
go:implementation-tests (2 findings)

Major - internal/pipeline/pipeline.go:924

findIncompleteDryRun will resume any incomplete dry-run that matches PR/head/base/profile/posting identity, but it does not distinguish full-review dry-runs from the new threadrespond dry-runs. threadrespond allocates runs in the same durable namespace and only protects itself with thread-response-run.json; review dry-runs never check for a review-specific marker or run kind before reusing an artifact root. That means an interrupted cr respond --dry-run can be picked up by cr review --dry-run, mixing response-only state and review planning in the same run. Fix by persisting a review run-kind discriminator (marker file or ledger field) and require it during review resume, with a test that a response dry-run is not resumable from pipeline.DryRun.

Minor - internal/pipeline/pipeline.go:30

SelectionOnly no longer enforces the posting identity that the new thread-lifecycle selector logic depends on. prepareSelectionContext only calls threadcontext.Normalize when PostingIdentity is non-empty, so a selection-only caller without an identity silently drops normalized thread context and ThreadActions. The benchmark selection path currently invokes SelectionOnly without setting PostingIdentity, so it is no longer exercising the same selector inputs as real review runs. Fix by requiring SelectionRequest.PostingIdentity here (mirroring full Request validation) or by plumbing an explicit fallback identity through selection-only callers, and add a focused test for the empty-identity case.

structure:repo-health (2 findings)

Major - internal/pipeline/pipeline.go:645

Dry-run resume is keyed off reviewCtx.pr here, not the resolved review premises. When a caller pins --review-base-sha/--review-head-sha, findIncompleteDryRun will still match against the live PR head/base instead of the pinned review SHAs, so a rerun can resume the wrong artifact tree or miss the correct pinned run entirely. That breaks the durable identity of a pinned review. Pass reviewCtx.reviewPR (or the resolved base/head explicitly) into resume lookup and add a regression test that pinned dry runs only resume runs created with the same pinned SHAs.

Minor - internal/architecture/model_resolution_test.go:55

The new architecture guardrail only catches direct calls when the imported package identifier is literally config. Importing internal/config under an alias (for example cfg) bypasses this test, so the repo can drift away from stagemodel.ResolveStageModel without CI noticing. Reuse the import-alias detection pattern from llm_lifecycle_test.go so any alias of the config package is blocked.

policies:conventions (1 finding)

Minor - internal/cmd/reviewcmd/reviewcmd.go:190

This adds a second top-level root command from reviewcmd, which drifts from the repo-local command layout where each top-level cr surface lives in its own internal/cmd/*cmd package (configcmd, datacmd, mecmd, sessionscmd, etc.). Keeping respond registered here also leaves its CLI contract and growing test surface coupled to the review command package even though docs/development.md still describes command glue as internal/cmd/*. The smallest policy-aligned fix is to move respond into its own internal/cmd/respondcmd package and have root wiring register that package separately, while sharing any common runtime helpers below the command layer.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests incomplete_skipped internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/pipeline/pipeline.go, internal/pipeline/pipeline_test.go, internal/threadrespond/threadrespond.go, internal/threadrespond/threadrespond_test.go internal/approvaloverride/approvaloverride.go, internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/cmderr/cmderr.go, internal/cmd/configcmd/configcmd.go, internal/cmd/configcmd/configcmd_test.go, internal/cmd/credentialcmd/credentialcmd_test.go, internal/cmd/datacmd/datacmd.go, internal/cmd/mecmd/mecmd.go, internal/cmd/mecmd/mecmd_test.go, internal/cmd/noleak/noleak_test.go, internal/cmd/reviewcmd/adapter_progress.go, internal/cmd/reviewcmd/pipeline_progress.go, internal/cmd/reviewcmd/provider_progress.go, internal/cmd/reviewcmd/runtime_provider.go, internal/gateio/gateio.go, internal/gitprovider/errors.go, internal/gitprovider/fake.go, internal/gitprovider/fake_test.go, internal/gitprovider/github/client.go, internal/gitprovider/github/client_test.go, internal/gitprovider/github/rest_writes.go, internal/gitprovider/github/rest_writes_test.go, internal/gitprovider/github/review_authority.go, internal/gitprovider/operation.go, internal/gitprovider/provider.go, internal/gitprovider/types.go, internal/gitprovider/types_test.go, internal/ledger/ledger.go, internal/ledger/ledger_test.go, internal/llm/adapter.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle.go, internal/llmlifecycle/lifecycle_test.go, internal/outbox/outbox.go, internal/outbox/outbox_test.go, internal/plannedactions/plannedactions.go, internal/plannedactions/plannedactions_test.go, internal/review/review.go, internal/review/review_test.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/reviewplan/summary.go, internal/stagemodel/resolver.go, internal/stagemodel/resolver_test.go, internal/threadanalysis/threadanalysis.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext.go, internal/threadcontext/threadcontext_test.go I did not inspect every assigned file; uninspected files are listed in skipped_files rather than inferred as safe.; Review scope was intentionally narrowed to the changed lifecycle/resume/thread-response paths (pipeline, threadrespond, and the reviewcmd wiring/tests) because those were the highest-risk Go implementation seams called out in the dossier.
structure:repo-health incomplete_skipped docs/architecture.md, docs/development.md, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/runtime_provider.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/pipeline/pipeline_test.go, internal/reviewplan/reviewplan.go, internal/stagemodel/resolver.go internal/cmd/reviewcmd/reviewcmd_test.go, internal/llmlifecycle/lifecycle_test.go, internal/reviewplan/reviewplan_test.go, internal/stagemodel/resolver_test.go Read-only sandbox; I inspected code and tests statically and did not run the test suite.; Structural review only; I focused on durable command/pipeline boundaries, resume identity, and guardrail enforceability rather than exhaustive behavior.
policies:conventions complete_broad docs/architecture.md, docs/development.md, docs/init-ux-contract.md, docs/llm-task-artifacts.md, internal/cmd/configcmd/configcmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gitprovider/github/review_authority.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go unavailable Shared cli-common and .github source-of-truth docs were not present in the local review context, so findings are limited to repo-local conventions visible in this checkout.

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 7m 12s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, structure:repo-health, policies:conventions
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 7m 12s wall · 11m 30s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 31s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 58s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 5m 38s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 06s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 16s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

postingIdentity := postingKey(req.PostingIdentity)
var best ledger.Run
found := false
for _, run := range runs {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

findIncompleteDryRun will resume any incomplete dry-run that matches PR/head/base/profile/posting identity, but it does not distinguish full-review dry-runs from the new threadrespond dry-runs. threadrespond allocates runs in the same durable namespace and only protects itself with thread-response-run.json; review dry-runs never check for a review-specific marker or run kind before reusing an artifact root. That means an interrupted cr respond --dry-run can be picked up by cr review --dry-run, mixing response-only state and review planning in the same run. Fix by persisting a review run-kind discriminator (marker file or ledger field) and require it during review resume, with a test that a response dry-run is not resumable from pipeline.DryRun.

Reply inline to this comment.

if !ok || selector.Sel.Name != "ResolveModelTier" {
return true
}
ident, ok := selector.X.(*ast.Ident)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new architecture guardrail only catches direct calls when the imported package identifier is literally config. Importing internal/config under an alias (for example cfg) bypasses this test, so the repo can drift away from stagemodel.ResolveStageModel without CI noticing. Reuse the import-alias detection pattern from llm_lifecycle_test.go so any alias of the config package is blocked.

Reply inline to this comment.

Comment thread internal/cmd/reviewcmd/reviewcmd.go Outdated
cmd.Flags().BoolVar(&flags.allowSelfApprove, "allow-self-approve", false, "Allow approval when posting identity is the PR author")
cmd.Flags().BoolVar(&flags.noResolveThreads, "no-resolve-threads", false, "Do not plan thread-resolution actions")
rootCmd.AddCommand(cmd)
rootCmd.AddCommand(newRespondCommand(opts, factory))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a second top-level root command from reviewcmd, which drifts from the repo-local command layout where each top-level cr surface lives in its own internal/cmd/*cmd package (configcmd, datacmd, mecmd, sessionscmd, etc.). Keeping respond registered here also leaves its CLI contract and growing test surface coupled to the review command package even though docs/development.md still describes command glue as internal/cmd/*. The smallest policy-aligned fix is to move respond into its own internal/cmd/respondcmd package and have root wiring register that package separately, while sharing any common runtime helpers below the command layer.

Reply inline to this comment.

@@ -30,13 +30,17 @@ import (
"github.com/open-cli-collective/codereview-cli/internal/gitprovider"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File-level note: internal/pipeline/pipeline.go

SelectionOnly no longer enforces the posting identity that the new thread-lifecycle selector logic depends on. prepareSelectionContext only calls threadcontext.Normalize when PostingIdentity is non-empty, so a selection-only caller without an identity silently drops normalized thread context and ThreadActions. The benchmark selection path currently invokes SelectionOnly without setting PostingIdentity, so it is no longer exercising the same selector inputs as real review runs. Fix by requiring SelectionRequest.PostingIdentity here (mirroring full Request validation) or by plumbing an explicit fallback identity through selection-only callers, and add a focused test for the empty-identity case.

Reply inline to this comment.

Comment thread internal/pipeline/pipeline.go Outdated
}
var resumedDryRun *ledger.Run
if !mode.live && !req.Rerun {
run, ok, err := findIncompleteDryRun(ctx, opts.Store, req, reviewCtx.pr)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dry-run resume is keyed off reviewCtx.pr here, not the resolved review premises. When a caller pins --review-base-sha/--review-head-sha, findIncompleteDryRun will still match against the live PR head/base instead of the pinned review SHAs, so a rerun can resume the wrong artifact tree or miss the correct pinned run entirely. That breaks the durable identity of a pinned review. Pass reviewCtx.reviewPR (or the resolved base/head explicitly) into resume lookup and add a regression test that pinned dry runs only resume runs created with the same pinned SHAs.

Reply inline to this comment.

@rianjs-bot

rianjs-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: 7e1a92290384
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 2
structure:repo-health 2
policies:conventions 0
go:implementation-tests (2 findings)

Major - internal/outbox/outbox.go:123

When BundleInlineOnSubmit is enabled, Post marks every pending inline action as posted as soon as it sees a posted submit-review action (reconcilePostedBundledInline/markBundledInlinePosted), without verifying that those inline comments actually exist in thread state. That means a resumed/conflict-reconciled run can silently drop required inline findings if the review body was posted but the bundled comments were missing, rejected, or simply not the same set of pending inline actions anymore. The new tests codify this by treating a matching review body alone as sufficient. Reconcile bundled inline actions against the actual thread comments/markers before marking them posted, or restrict this shortcut to the same-process successful submit path where you already know which inline IDs were included.

Major - internal/cmd/reviewcmd/runtime_provider.go:73

runtimeProvider.Capabilities now ORs the read-provider and write-provider capability flags, but these flags drive posting/planning behavior (ThreadResolution, NativeFileLevelComments, BundleInlineOnSubmit) and must reflect the posting credential, not whichever side happens to advertise more support. With split credentials, a read-only provider that says BundleInlineOnSubmit=true or ThreadResolution=true can make the live path plan writes the posting provider cannot actually perform, and the current tests only verify call routing, not a read/write capability mismatch. Return write-side capabilities here (or model read-vs-write caps separately if you need both later), and add a test with divergent fake capabilities so the planner/outbox path follows the write provider.

structure:repo-health (2 findings)

Major - internal/threadrespond/threadrespond.go:442

threadrespond is supposed to be the reusable response-side lifecycle entrypoint, but this block couples it back to internal/pipeline for artifact layout (pipeline.ArtifactPaths*) and introduces a second private run-classification protocol (thread-response-run.json) with its own resume scan. That means review and respond now have two independent marker/resume implementations to keep in sync, which is exactly the kind of drift that will compound as retention, resume, or artifact layout changes land. Extract the artifact-path + run-kind marker/read/validate helpers into a neutral run-artifacts package (or move them under a shared statepaths/ledger-owned seam) and have both pipeline and threadrespond use the same implementation instead of copying the protocol.

Minor - docs/architecture.md:60

This PR documents Git Provider Writes and Inline Thread Lifecycle as stable architecture boundaries, but unlike the LLM lifecycle and stage-model sections there is no corresponding enforceable guardrail in the diff. Right now those two invariants live only in prose, so future commands can quietly reintroduce direct provider writes or bypass threadcontext/threadanalysis without any test catching it. Add architecture tests similar to internal/architecture/llm_lifecycle_test.go and internal/architecture/model_resolution_test.go that restrict provider-write calls to the posting path and verify thread-lifecycle packages stay on the intended domain/lifecycle seams.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests incomplete_skipped internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/cmd/reviewcmd/runtime_provider.go, internal/gitprovider/github/rest_writes.go, internal/gitprovider/github/rest_writes_test.go, internal/ledger/ledger.go, internal/ledger/ledger_test.go, internal/llm/adapter.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle.go, internal/outbox/outbox.go, internal/outbox/outbox_test.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/threadanalysis/threadanalysis.go, internal/threadrespond/threadrespond.go, internal/threadrespond/threadrespond_test.go internal/approvaloverride/approvaloverride.go, internal/approvaloverride/approvaloverride_test.go, internal/pipeline/pipeline.go, internal/pipeline/pipeline_test.go, internal/plannedactions/plannedactions.go, internal/plannedactions/plannedactions_test.go, internal/review/review.go, internal/review/review_test.go, internal/stagemodel/resolver.go, internal/stagemodel/resolver_test.go, internal/threadcontext/threadcontext.go, internal/threadcontext/threadcontext_test.go Review scope was limited to the assigned Go implementation and behavioral tests; I did not inspect unrelated command/docs changes.; Workspace was read-only, so findings are based on static inspection only and I could not add or run targeted repro tests.; assigned files were neither inspected nor skipped: internal/llmlifecycle/lifecycle_test.go, internal/threadanalysis/threadanalysis_test.go
structure:repo-health complete_broad docs/architecture.md, docs/development.md, docs/llm-task-artifacts.md, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/cmd/reviewcmd/reviewcmd.go, internal/ledger/ledger.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go, internal/threadrespond/threadrespond.go unavailable Review scoped to the assigned files and long-term structural risks; I did not inspect unrelated packages such as internal/gateio, internal/outbox, or data-lifecycle code beyond references needed to validate the new boundaries.
policies:conventions complete_broad docs/architecture.md, docs/development.md, docs/init-ux-contract.md, docs/llm-task-artifacts.md, internal/cmd/reviewcmd/adapter_progress.go, internal/cmd/reviewcmd/pipeline_progress.go, internal/cmd/reviewcmd/provider_progress.go, internal/cmd/reviewcmd/reviewcmd.go, internal/ledger/ledger.go, internal/reviewplan/summary.go unavailable Review limited to the assigned files plus repo-local guidance visible in this worktree.; The canonical shared cli-common and .github source-of-truth repositories were referenced by local guidance, but no local convenience copy was present and network access was unavailable, so shared-standard checks were limited to repo-local conventions and visible context.
unassigned incomplete_unassigned unavailable internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/benchmarkcmd/select.go, internal/cmd/cmderr/cmderr.go, internal/cmd/configcmd/configcmd.go, internal/cmd/configcmd/configcmd_test.go, internal/cmd/credentialcmd/credentialcmd_test.go, internal/cmd/datacmd/datacmd.go, internal/cmd/mecmd/mecmd.go, internal/cmd/mecmd/mecmd_test.go, internal/cmd/noleak/noleak_test.go, internal/gateio/gateio.go, internal/gitprovider/errors.go, internal/gitprovider/fake.go, internal/gitprovider/fake_test.go, internal/gitprovider/github/client.go, internal/gitprovider/github/client_test.go, internal/gitprovider/github/review_authority.go, internal/gitprovider/operation.go, internal/gitprovider/provider.go, internal/gitprovider/types.go, internal/gitprovider/types_test.go changed files were not assigned to a selected reviewer

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 4m 54s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, structure:repo-health, policies:conventions
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 4m 54s wall · 9m 45s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 26s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 39s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 33s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 1m 53s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 13s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

Comment thread docs/architecture.md
resolver implementation itself. The architecture guardrail is enforced by
`internal/architecture/model_resolution_test.go`.

## Git Provider Writes

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR documents Git Provider Writes and Inline Thread Lifecycle as stable architecture boundaries, but unlike the LLM lifecycle and stage-model sections there is no corresponding enforceable guardrail in the diff. Right now those two invariants live only in prose, so future commands can quietly reintroduce direct provider writes or bypass threadcontext/threadanalysis without any test catching it. Add architecture tests similar to internal/architecture/llm_lifecycle_test.go and internal/architecture/model_resolution_test.go that restrict provider-write calls to the posting path and verify thread-lifecycle packages stay on the intended domain/lifecycle seams.

Reply inline to this comment.

Comment thread internal/outbox/outbox.go Outdated
return Result{ExitCode: exitFailed}, err
}
actions = sortActions(actions)
if opts.Provider.Capabilities().BundleInlineOnSubmit {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When BundleInlineOnSubmit is enabled, Post marks every pending inline action as posted as soon as it sees a posted submit-review action (reconcilePostedBundledInline/markBundledInlinePosted), without verifying that those inline comments actually exist in thread state. That means a resumed/conflict-reconciled run can silently drop required inline findings if the review body was posted but the bundled comments were missing, rejected, or simply not the same set of pending inline actions anymore. The new tests codify this by treating a matching review body alone as sufficient. Reconcile bundled inline actions against the actual thread comments/markers before marking them posted, or restrict this shortcut to the same-process successful submit path where you already know which inline IDs were included.

Reply inline to this comment.

return p.write.SubmitReview(ctx, ref, r)
}

func (p runtimeProvider) Capabilities() gitprovider.ProviderCaps {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runtimeProvider.Capabilities now ORs the read-provider and write-provider capability flags, but these flags drive posting/planning behavior (ThreadResolution, NativeFileLevelComments, BundleInlineOnSubmit) and must reflect the posting credential, not whichever side happens to advertise more support. With split credentials, a read-only provider that says BundleInlineOnSubmit=true or ThreadResolution=true can make the live path plan writes the posting provider cannot actually perform, and the current tests only verify call routing, not a read/write capability mismatch. Return write-side capabilities here (or model read-vs-write caps separately if you need both later), and add a test with divergent fake capabilities so the planner/outbox path follows the write provider.

Reply inline to this comment.

Comment thread internal/threadrespond/threadrespond.go Outdated
return threadanalysis.ResponseActions(analyses)
}

func allocateOrResumeRun(ctx context.Context, opts Options, req Request, pr gitprovider.PR, prKey string, mode ledger.PostMode) (ledger.Run, pipeline.ArtifactPaths, error) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

threadrespond is supposed to be the reusable response-side lifecycle entrypoint, but this block couples it back to internal/pipeline for artifact layout (pipeline.ArtifactPaths*) and introduces a second private run-classification protocol (thread-response-run.json) with its own resume scan. That means review and respond now have two independent marker/resume implementations to keep in sync, which is exactly the kind of drift that will compound as retention, resume, or artifact layout changes land. Extract the artifact-path + run-kind marker/read/validate helpers into a neutral run-artifacts package (or move them under a shared statepaths/ledger-owned seam) and have both pipeline and threadrespond use the same implementation instead of copying the protocol.

Reply inline to this comment.

@rianjs-bot

rianjs-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: 82bb27b84777
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 1
structure:repo-health 2
policies:conventions 2
go:implementation-tests (1 finding)

Major - internal/pipeline/pipeline.go:786

InsertPlanningResult now persists findings and planned actions before writeArtifacts and, for dry-runs, before CompleteRun. If either later step fails, the deferred outcome leaves the run incomplete, so the next dry-run resumes the same run in findIncompleteDryRun and reaches this insert again with the same finding/action IDs. That turns a resumable run into a duplicate-row failure instead of finishing the already-planned work. The current resume tests only cover failure before planning persistence (for example a blocking rollup task), so this regression would pass. Fix by making resumed runs detect and reuse already-persisted planning rows, or by moving the resumable boundary so a run is not reused after planning has been durably inserted; add a focused test that forces failure after InsertPlanningResult and verifies the resumed dry-run completes without duplicate insert errors.

structure:repo-health (2 findings)

Major - internal/runartifact/runartifact.go:198

The new run-kind marker is treated as a durable resume boundary, but HasMarker only checks that a file with the expected name exists. Both cr review and cr respond now trust that boolean when selecting incomplete runs, so a stale/copied artifact tree or wrong-kind marker file can be misclassified as resumable state without validating schema_version, kind, or run_id. That weakens the new artifact contract at exactly the boundary the docs say must fail closed. Replace the existence probe with a real marker reader/validator, and have resume code verify the decoded marker matches the expected kind and the ledger run ID before reusing the artifact root.

Minor - internal/cmd/reviewcmd/reviewcmd.go:190

cr respond is introduced as a second top-level command from inside reviewcmd, even though the repo’s command surface is otherwise discoverable by one package per top-level command family under internal/cmd/*. Leaving a separate user-facing command embedded here makes ownership, tests, and future docs harder to find, and it invites more respond-specific runtime and flag handling to accumulate in the review package. Move the command registration and CLI glue into a dedicated internal/cmd/respondcmd package, then share the common runtime wiring through a smaller helper package if needed.

policies:conventions (2 findings)

Major - docs/architecture.md:62

This section states provider writes have one durable path and that commands should not mutate provider state directly, but the new architecture guardrail test does not actually enforce that repo-wide rule: internal/architecture/thread_lifecycle_test.go explicitly exempts all of internal/cmd/reviewcmd from the write-method scan. That means future direct PostIssueComment/ReplyToThread/SubmitReview calls added in command glue would still pass the advertised guardrail. Narrow the allowlist to the specific boundary shim/outbox code, or soften this prose so it reads as an intended design boundary rather than an enforced convention.

Minor - docs/development.md:13

The repo-local project overview still summarizes the current CLI surface without mentioning the new cr respond command, even though this PR adds it as a durable root command and the rest of the docs now refer to it. Update the overview so the development guide reflects the public command surface future contributors are expected to preserve.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests complete_broad internal/cmd/reviewcmd/reviewcmd.go, internal/ledger/ledger.go, internal/llm/adapter.go, internal/llmlifecycle/lifecycle.go, internal/outbox/outbox.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go, internal/threadrespond/threadrespond.go unavailable Review scope was limited to the assigned Go implementation/test files and nearby repo-local tests/docs needed to verify changed behavior.
structure:repo-health complete_broad docs/architecture.md, docs/llm-task-artifacts.md, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/runtime_provider.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/runartifact/runartifact.go, internal/stagemodel/resolver.go, internal/threadrespond/threadrespond.go unavailable Review scope was intentionally limited to the assigned changed files and long-term structural risks; I did not inspect unrelated files from the wider change map.
policies:conventions complete_broad docs/architecture.md, docs/development.md, docs/init-ux-contract.md, internal/cmd/configcmd/configcmd.go, internal/cmd/mecmd/mecmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/runtime_provider.go, internal/gitprovider/provider.go, internal/gitprovider/types.go, internal/pipeline/pipeline.go unavailable Shared Open CLI Collective source-of-truth docs (../cli-common/docs, ../.github) were not present in this checkout, so findings are limited to repo-local conventions and guardrails visible here.
unassigned incomplete_unassigned unavailable internal/approvaloverride/approvaloverride.go, internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/architecture/thread_lifecycle_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/benchmarkcmd/select.go, internal/cmd/cmderr/cmderr.go, internal/cmd/configcmd/configcmd_test.go, internal/cmd/credentialcmd/credentialcmd_test.go, internal/cmd/datacmd/datacmd.go, internal/cmd/mecmd/mecmd_test.go, internal/cmd/noleak/noleak_test.go, internal/cmd/reviewcmd/adapter_progress.go, internal/cmd/reviewcmd/pipeline_progress.go, internal/cmd/reviewcmd/provider_progress.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gateio/gateio.go, internal/gitprovider/errors.go, internal/gitprovider/fake.go, internal/gitprovider/fake_test.go, internal/gitprovider/github/client.go, internal/gitprovider/github/client_test.go, internal/gitprovider/github/rest_writes.go, internal/gitprovider/github/rest_writes_test.go, internal/gitprovider/github/review_authority.go, internal/gitprovider/operation.go, internal/gitprovider/types_test.go, internal/ledger/ledger_test.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle_test.go, internal/outbox/outbox_test.go, internal/pipeline/pipeline_test.go, internal/plannedactions/plannedactions.go, internal/plannedactions/plannedactions_test.go, internal/review/review.go, internal/review/review_test.go, internal/reviewplan/reviewplan_test.go, internal/reviewplan/summary.go, internal/stagemodel/resolver_test.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext_test.go, internal/threadrespond/threadrespond_test.go changed files were not assigned to a selected reviewer

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 5m 40s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, structure:repo-health, policies:conventions
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 5m 40s wall · 10m 07s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 20s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 4m 35s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 42s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 21s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 7s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

Comment thread docs/development.md
@@ -13,8 +13,13 @@ session management, and local data lifecycle commands.
The current Go code is a Cobra command tree in `internal/cmd/*` with a thin

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File-level note: docs/development.md

The repo-local project overview still summarizes the current CLI surface without mentioning the new cr respond command, even though this PR adds it as a durable root command and the rest of the docs now refer to it. Update the overview so the development guide reflects the public command surface future contributors are expected to preserve.

Reply inline to this comment.

Comment thread docs/architecture.md

## Git Provider Writes

Provider writes have one durable path: planned actions in the ledger followed

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section states provider writes have one durable path and that commands should not mutate provider state directly, but the new architecture guardrail test does not actually enforce that repo-wide rule: internal/architecture/thread_lifecycle_test.go explicitly exempts all of internal/cmd/reviewcmd from the write-method scan. That means future direct PostIssueComment/ReplyToThread/SubmitReview calls added in command glue would still pass the advertised guardrail. Narrow the allowlist to the specific boundary shim/outbox code, or soften this prose so it reads as an intended design boundary rather than an enforced convention.

Reply inline to this comment.

Comment thread internal/pipeline/pipeline.go Outdated
result.PlannedActions = append(result.PlannedActions, planned)
plannedActions = append(plannedActions, planned)
}
if err := opts.Store.InsertPlanningResult(ctx, ledgerFindings, plannedActions); err != nil {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InsertPlanningResult now persists findings and planned actions before writeArtifacts and, for dry-runs, before CompleteRun. If either later step fails, the deferred outcome leaves the run incomplete, so the next dry-run resumes the same run in findIncompleteDryRun and reaches this insert again with the same finding/action IDs. That turns a resumable run into a duplicate-row failure instead of finishing the already-planned work. The current resume tests only cover failure before planning persistence (for example a blocking rollup task), so this regression would pass. Fix by making resumed runs detect and reuse already-persisted planning rows, or by moving the resumable boundary so a run is not reused after planning has been durably inserted; add a focused test that forces failure after InsertPlanningResult and verifies the resumed dry-run completes without duplicate insert errors.

Reply inline to this comment.

Comment thread internal/runartifact/runartifact.go Outdated
return llmlifecycle.WriteFileAtomic(MarkerPath(artifactPath, kind), append(data, '\n'))
}

// HasMarker reports whether an artifact root carries the requested run-kind

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new run-kind marker is treated as a durable resume boundary, but HasMarker only checks that a file with the expected name exists. Both cr review and cr respond now trust that boolean when selecting incomplete runs, so a stale/copied artifact tree or wrong-kind marker file can be misclassified as resumable state without validating schema_version, kind, or run_id. That weakens the new artifact contract at exactly the boundary the docs say must fail closed. Replace the existence probe with a real marker reader/validator, and have resume code verify the decoded marker matches the expected kind and the ledger run ID before reusing the artifact root.

Reply inline to this comment.

Comment thread internal/cmd/reviewcmd/reviewcmd.go Outdated
cmd.Flags().BoolVar(&flags.allowSelfApprove, "allow-self-approve", false, "Allow approval when posting identity is the PR author")
cmd.Flags().BoolVar(&flags.noResolveThreads, "no-resolve-threads", false, "Do not plan thread-resolution actions")
rootCmd.AddCommand(cmd)
rootCmd.AddCommand(newRespondCommand(opts, factory))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cr respond is introduced as a second top-level command from inside reviewcmd, even though the repo’s command surface is otherwise discoverable by one package per top-level command family under internal/cmd/*. Leaving a separate user-facing command embedded here makes ownership, tests, and future docs harder to find, and it invites more respond-specific runtime and flag handling to accumulate in the review package. Move the command registration and CLI glue into a dedicated internal/cmd/respondcmd package, then share the common runtime wiring through a smaller helper package if needed.

Reply inline to this comment.

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

Comment thread docs/development.md
@@ -13,8 +13,13 @@ session management, and local data lifecycle commands.
The current Go code is a Cobra command tree in `internal/cmd/*` with a thin

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File-level note: docs/development.md

The repo-local project overview still summarizes the current CLI surface without mentioning the new cr respond command, even though this PR adds it as a durable root command and the rest of the docs now refer to it. Update the overview so the development guide reflects the public command surface future contributors are expected to preserve.

Reply inline to this comment.

Comment thread docs/architecture.md

## Git Provider Writes

Provider writes have one durable path: planned actions in the ledger followed

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section states provider writes have one durable path and that commands should not mutate provider state directly, but the new architecture guardrail test does not actually enforce that repo-wide rule: internal/architecture/thread_lifecycle_test.go explicitly exempts all of internal/cmd/reviewcmd from the write-method scan. That means future direct PostIssueComment/ReplyToThread/SubmitReview calls added in command glue would still pass the advertised guardrail. Narrow the allowlist to the specific boundary shim/outbox code, or soften this prose so it reads as an intended design boundary rather than an enforced convention.

Reply inline to this comment.

Comment thread internal/pipeline/pipeline.go Outdated
result.PlannedActions = append(result.PlannedActions, planned)
plannedActions = append(plannedActions, planned)
}
if err := opts.Store.InsertPlanningResult(ctx, ledgerFindings, plannedActions); err != nil {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InsertPlanningResult now persists findings and planned actions before writeArtifacts and, for dry-runs, before CompleteRun. If either later step fails, the deferred outcome leaves the run incomplete, so the next dry-run resumes the same run in findIncompleteDryRun and reaches this insert again with the same finding/action IDs. That turns a resumable run into a duplicate-row failure instead of finishing the already-planned work. The current resume tests only cover failure before planning persistence (for example a blocking rollup task), so this regression would pass. Fix by making resumed runs detect and reuse already-persisted planning rows, or by moving the resumable boundary so a run is not reused after planning has been durably inserted; add a focused test that forces failure after InsertPlanningResult and verifies the resumed dry-run completes without duplicate insert errors.

Reply inline to this comment.

Comment thread internal/runartifact/runartifact.go Outdated
return llmlifecycle.WriteFileAtomic(MarkerPath(artifactPath, kind), append(data, '\n'))
}

// HasMarker reports whether an artifact root carries the requested run-kind

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new run-kind marker is treated as a durable resume boundary, but HasMarker only checks that a file with the expected name exists. Both cr review and cr respond now trust that boolean when selecting incomplete runs, so a stale/copied artifact tree or wrong-kind marker file can be misclassified as resumable state without validating schema_version, kind, or run_id. That weakens the new artifact contract at exactly the boundary the docs say must fail closed. Replace the existence probe with a real marker reader/validator, and have resume code verify the decoded marker matches the expected kind and the ledger run ID before reusing the artifact root.

Reply inline to this comment.

Comment thread internal/cmd/reviewcmd/reviewcmd.go Outdated
cmd.Flags().BoolVar(&flags.allowSelfApprove, "allow-self-approve", false, "Allow approval when posting identity is the PR author")
cmd.Flags().BoolVar(&flags.noResolveThreads, "no-resolve-threads", false, "Do not plan thread-resolution actions")
rootCmd.AddCommand(cmd)
rootCmd.AddCommand(newRespondCommand(opts, factory))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cr respond is introduced as a second top-level command from inside reviewcmd, even though the repo’s command surface is otherwise discoverable by one package per top-level command family under internal/cmd/*. Leaving a separate user-facing command embedded here makes ownership, tests, and future docs harder to find, and it invites more respond-specific runtime and flag handling to accumulate in the review package. Move the command registration and CLI glue into a dedicated internal/cmd/respondcmd package, then share the common runtime wiring through a smaller helper package if needed.

Reply inline to this comment.

@rianjs-bot

rianjs-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: ad443c68317d
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 1
structure:repo-health 2
policies:conventions 1
go:implementation-tests (1 finding)

Major - internal/threadrespond/threadrespond.go:169

When Run resumes an incomplete response run and finds persisted planned actions, it reuses them immediately without proving the current inline-thread state still matches the analysis that produced those actions. That means an interrupted live run can later post a stale reply or resolve after a human has already replied, edited context, or resolved the thread, because the only freshness check here is PR head/base SHA. The nearby tests cover stale-input detection only before actions are persisted, and TestRunResumesExistingActionsWithoutReplanning hard-codes the opposite behavior. Persist and validate a thread-context fingerprint (or re-run normalization/analysis whenever thread data changes) before accepting stored actions, and add a test that mutates a thread after actions are saved to prove resume aborts or replans instead of posting stale actions.

structure:repo-health (2 findings)

Major - internal/cmd/respondcmd/respondcmd.go:16

Top-level command packages should stay thin and depend on neutral shared seams, not on sibling command packages. respondcmd now imports internal/cmd/reviewcmd and relies on reviewcmd.NewRuntime, RuntimeFactory, RuntimeOptions, ConfigPath, RetentionPolicyFromConfig, and MapRunError, while reviewcmd had to export those symbols to make respond work. That turns reviewcmd into a de facto shared runtime layer, so every future runtime/bootstrap change for any command will keep accreting in the review command package and blur ownership boundaries across internal/cmd/*. Extract the reusable runtime/bootstrap and shared error/config helpers into a neutral package (for example internal/cmd/runtime or internal/reviewruntime) and keep reviewcmd and respondcmd limited to their own Cobra surface.

Major - docs/architecture.md:22

Versioned architecture guidance should not claim a boundary is enforced more broadly than the guardrails actually check. This doc says the LLM and stage-model boundaries are enforced by internal/architecture/llm_lifecycle_test.go and internal/architecture/model_resolution_test.go, but those tests only catch direct selector calls to llm.RunStructured* and config.ResolveModelTier; they do not enforce the documented ban on provider-adapter usage, hard-coded model IDs, or easy bypasses such as wrapper/re-export paths. Because docs/development.md now points contributors here as the architecture guardrail source, this overstatement creates false confidence and lets structural drift merge silently. Either narrow the prose to the exact invariants the AST tests really cover, or strengthen the tests until they enforce the documented boundary before treating this doc as source-of-truth guidance.

policies:conventions (1 finding)

Minor - docs/architecture.md:16

This commit says the final task marker lives at llm-tasks/<task>/metadata.json, but the companion artifact contract in docs/llm-task-artifacts.md documents task directories as llm-tasks/<encoded-task-id>/. The architecture doc should use the encoded form as well, otherwise it teaches humans to look for a path shape the implementation does not use.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests complete_broad internal/cmd/respondcmd/respondcmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/runtime_provider.go, internal/ledger/ledger.go, internal/llmlifecycle/lifecycle.go, internal/outbox/outbox.go, internal/pipeline/pipeline.go, internal/reviewplan/reviewplan.go, internal/runartifact/runartifact.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go, internal/threadrespond/threadrespond.go unavailable Review scope was limited to the assigned Go implementation files plus nearby package tests and docs/development.md guidance.; Workspace was read-only, so findings are based on static inspection rather than running or modifying tests.
structure:repo-health complete_broad docs/architecture.md, docs/development.md, internal/cmd/respondcmd/respondcmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/review/review.go, internal/reviewplan/reviewplan.go, internal/runartifact/runartifact.go, internal/threadrespond/threadrespond.go unavailable Focused on the assigned files plus directly related architecture tests and docs; this was a read-only structural review, not an exhaustive repo-wide audit or test run.
policies:conventions complete_broad cmd/cr/main.go, docs/architecture.md, docs/development.md, docs/init-ux-contract.md, docs/llm-task-artifacts.md, internal/architecture/model_resolution_test.go, internal/cmd/configcmd/configcmd.go, internal/cmd/respondcmd/respondcmd.go, internal/cmd/reviewcmd/reviewcmd.go unavailable Shared Open CLI Collective source-of-truth docs referenced in the prompt were not present in this checkout, so this review is limited to repo-local conventions and the assigned diff context.
unassigned incomplete_unassigned unavailable internal/approvaloverride/approvaloverride.go, internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/thread_lifecycle_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/benchmarkcmd/select.go, internal/cmd/cmderr/cmderr.go, internal/cmd/configcmd/configcmd_test.go, internal/cmd/credentialcmd/credentialcmd_test.go, internal/cmd/datacmd/datacmd.go, internal/cmd/mecmd/mecmd.go, internal/cmd/mecmd/mecmd_test.go, internal/cmd/noleak/noleak_test.go, internal/cmd/respondcmd/respondcmd_test.go, internal/cmd/reviewcmd/adapter_progress.go, internal/cmd/reviewcmd/pipeline_progress.go, internal/cmd/reviewcmd/provider_progress.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gateio/gateio.go, internal/gitprovider/errors.go, internal/gitprovider/fake.go, internal/gitprovider/fake_test.go, internal/gitprovider/github/client.go, internal/gitprovider/github/client_test.go, internal/gitprovider/github/rest_writes.go, internal/gitprovider/github/rest_writes_test.go, internal/gitprovider/github/review_authority.go, internal/gitprovider/operation.go, internal/gitprovider/provider.go, internal/gitprovider/types.go, internal/gitprovider/types_test.go, internal/ledger/ledger_test.go, internal/llm/adapter.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle_test.go, internal/outbox/outbox_test.go, internal/pipeline/pipeline_test.go, internal/plannedactions/plannedactions.go, internal/plannedactions/plannedactions_test.go, internal/review/review_test.go, internal/reviewplan/reviewplan_test.go, internal/reviewplan/summary.go, internal/runartifact/runartifact_test.go, internal/stagemodel/resolver.go, internal/stagemodel/resolver_test.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext_test.go, internal/threadrespond/threadrespond_test.go changed files were not assigned to a selected reviewer

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 4m 40s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, structure:repo-health, policies:conventions
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 4m 40s wall · 9m 33s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 20s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 33s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 00s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 2m 25s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 12s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

Comment thread docs/architecture.md Outdated
code must trust only the final metadata path, never temporary files or partial
payloads.

New LLM-backed components must not call provider adapters or `internal/llm`

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Versioned architecture guidance should not claim a boundary is enforced more broadly than the guardrails actually check. This doc says the LLM and stage-model boundaries are enforced by internal/architecture/llm_lifecycle_test.go and internal/architecture/model_resolution_test.go, but those tests only catch direct selector calls to llm.RunStructured* and config.ResolveModelTier; they do not enforce the documented ban on provider-adapter usage, hard-coded model IDs, or easy bypasses such as wrapper/re-export paths. Because docs/development.md now points contributors here as the architecture guardrail source, this overstatement creates false confidence and lets structural drift merge silently. Either narrow the prose to the exact invariants the AST tests really cover, or strengthen the tests until they enforce the documented boundary before treating this doc as source-of-truth guidance.

Reply inline to this comment.

Comment thread docs/architecture.md Outdated
artifacts, session persistence, progress breadcrumbs, and failure
classification.

The final commit marker for a task is `llm-tasks/<task>/metadata.json`. Writers

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit says the final task marker lives at llm-tasks/<task>/metadata.json, but the companion artifact contract in docs/llm-task-artifacts.md documents task directories as llm-tasks/<encoded-task-id>/. The architecture doc should use the encoded form as well, otherwise it teaches humans to look for a path shape the implementation does not use.

Reply inline to this comment.


"github.com/open-cli-collective/codereview-cli/internal/cmd/cmderr"
"github.com/open-cli-collective/codereview-cli/internal/cmd/exitcode"
"github.com/open-cli-collective/codereview-cli/internal/cmd/reviewcmd"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Top-level command packages should stay thin and depend on neutral shared seams, not on sibling command packages. respondcmd now imports internal/cmd/reviewcmd and relies on reviewcmd.NewRuntime, RuntimeFactory, RuntimeOptions, ConfigPath, RetentionPolicyFromConfig, and MapRunError, while reviewcmd had to export those symbols to make respond work. That turns reviewcmd into a de facto shared runtime layer, so every future runtime/bootstrap change for any command will keep accreting in the review command package and blur ownership boundaries across internal/cmd/*. Extract the reusable runtime/bootstrap and shared error/config helpers into a neutral package (for example internal/cmd/runtime or internal/reviewruntime) and keep reviewcmd and respondcmd limited to their own Cobra surface.

Reply inline to this comment.

if err != nil {
return result, err
}
if len(existingActions) > 0 {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When Run resumes an incomplete response run and finds persisted planned actions, it reuses them immediately without proving the current inline-thread state still matches the analysis that produced those actions. That means an interrupted live run can later post a stale reply or resolve after a human has already replied, edited context, or resolved the thread, because the only freshness check here is PR head/base SHA. The nearby tests cover stale-input detection only before actions are persisted, and TestRunResumesExistingActionsWithoutReplanning hard-codes the opposite behavior. Persist and validate a thread-context fingerprint (or re-run normalization/analysis whenever thread data changes) before accepting stored actions, and add a test that mutates a thread after actions are saved to prove resume aborts or replans instead of posting stale actions.

Reply inline to this comment.

@rianjs-bot

rianjs-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: 5c22e83b2755
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 2
policies:conventions 2
structure:repo-health 2
go:implementation-tests (2 findings)

Major - internal/threadrespond/threadrespond.go:169

When a response run resumes from persisted planned actions, it only revalidates the cached thread fingerprints before reusing those actions. The current request flags and provider capabilities are not reapplied, so a run that originally planned resolve_thread can later be resumed with --no-resolve-threads (or against a provider that no longer advertises thread resolution) and still resolve the thread anyway. That violates the command contract and can post a destructive action the user explicitly disabled. Recheck cached actions against effectiveCaps(opts.Provider.Capabilities(), req) before reuse, or force a fresh replan when current options no longer permit a cached action; add a resume test that flips NoResolveThreads between attempts.

Minor - internal/threadrespond/threadrespond.go:260

Live threadrespond.Result objects return stale planned-action state. After outbox.Post, the code reloads only the run row, so result.PlannedActions still contains the pre-post slice; the retry path has the same problem and also calls resetFailedTerminalResponseActions on struct copies before assigning them back. The store is updated correctly, but callers of the new threadrespond API can still see pending or failed_terminal statuses after a successful retry/post, and the current tests only verify the ledger rows so this regression passes. Reload planned actions from the store before returning live/retry results and add assertions on result.PlannedActions in the live/retry tests.

policies:conventions (2 findings)

Minor - cmd/cr/main.go:57

cr respond is the only top-level command in this diff that cannot register itself through the normal Register hook, so the root builder now has to special-case its default runtime wiring with an inline closure. That drifts from the repo's existing command layout, where cmd/cr stays thin and each command package owns its own default registration (reviewcmd.Register, mecmd.Register, agentscmd.Register, etc.). Add respondcmd.Register(rootCmd, opts) that binds the default factory internally, then pass respondcmd.Register to root.RegisterAll so future root builders and tests do not need command-specific wiring knowledge.

Minor - docs/init-ux-contract.md:286

This fallback label example now misclassifies the open-cli-collective-rianjs-bot reviewer as a PAT reviewer, but the surrounding contract and repo tests treat that identity as the GitHub App example. Leaving it as-is teaches the wrong chooser copy for a durable UX contract. Change the example back to open-cli-collective-rianjs-bot (GitHub App reviewer) or swap in a PAT-shaped credential name instead.

structure:repo-health (2 findings)

Major - internal/threadrespond/threadrespond.go:503

Run allocation/resume plus artifact-marker validation is now a durable lifecycle contract, so it should have one implementation. threadrespond.allocateOrResumeRun/findIncompleteRun duplicates the same responsibilities already implemented in internal/pipeline (findIncompleteDryRun, AllocateRun, runartifact.WriteMarker, marker checks, newest-attempt selection), but with a different query path and separate branching. That duplication will drift as rerun/resume/retention rules evolve, leaving review and respond with subtly different durable-state behavior even though the architecture doc presents one shared lifecycle model. Extract a shared run-lifecycle helper that owns run lookup, artifact-path derivation, marker write/validation, and attempt selection for both review and thread-response kinds, then call it from both packages.

Major - internal/cmd/reviewcmd/reviewcmd.go:120

Reusable runtime wiring should live behind a neutral seam, not be owned by a sibling command package. This PR exports reviewcmd.NewRuntime, builds both the review and respond runners inside reviewcmd, and then injects that factory into respondcmd from cmd/cr, so cr respond now depends on reviewcmd as the owner of shared runtime construction. That makes future command work funnel through a review-specific package and weakens the repo's command/package ownership map. Move newRuntime, runtimeProvider, and the shared runner construction into internal/cmd/cmdruntime or a dedicated non-command runtime package, then have both reviewcmd and respondcmd consume that shared factory directly.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests incomplete_skipped cmd/cr/main.go, internal/cmd/cmdruntime/cmdruntime.go, internal/cmd/respondcmd/respondcmd.go, internal/cmd/respondcmd/respondcmd_test.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/cmd/reviewcmd/runtime_provider.go, internal/gitprovider/provider.go, internal/gitprovider/types.go, internal/ledger/ledger.go, internal/outbox/outbox.go, internal/pipeline/pipeline.go, internal/pipeline/pipeline_test.go, internal/reviewplan/reviewplan.go, internal/reviewplan/reviewplan_test.go, internal/threadrespond/threadrespond.go, internal/threadrespond/threadrespond_test.go internal/approvaloverride/approvaloverride.go, internal/approvaloverride/approvaloverride_test.go, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/architecture/thread_lifecycle_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/benchmarkcmd/select.go, internal/cmd/cmderr/cmderr.go, internal/cmd/configcmd/configcmd.go, internal/cmd/configcmd/configcmd_test.go, internal/cmd/credentialcmd/credentialcmd_test.go, internal/cmd/datacmd/datacmd.go, internal/cmd/mecmd/mecmd.go, internal/cmd/mecmd/mecmd_test.go, internal/cmd/noleak/noleak_test.go, internal/cmd/reviewcmd/adapter_progress.go, internal/cmd/reviewcmd/pipeline_progress.go, internal/cmd/reviewcmd/provider_progress.go, internal/cmd/reviewcmd/runtime_selection.go, internal/gateio/gateio.go, internal/gitprovider/errors.go, internal/gitprovider/fake.go, internal/gitprovider/fake_test.go, internal/gitprovider/github/client.go, internal/gitprovider/github/client_test.go, internal/gitprovider/github/rest_writes.go, internal/gitprovider/github/rest_writes_test.go, internal/gitprovider/github/review_authority.go, internal/gitprovider/operation.go, internal/ledger/ledger_test.go, internal/llm/adapter.go, internal/llm/adapter_test.go, internal/llmlifecycle/lifecycle.go, internal/llmlifecycle/lifecycle_test.go, internal/outbox/outbox_test.go, internal/plannedactions/plannedactions.go, internal/plannedactions/plannedactions_test.go, internal/review/review.go, internal/review/review_test.go, internal/reviewplan/summary.go, internal/runartifact/runartifact.go, internal/runartifact/runartifact_test.go, internal/stagemodel/resolver.go, internal/stagemodel/resolver_test.go, internal/threadanalysis/threadanalysis.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext.go, internal/threadcontext/threadcontext_test.go Review scope was limited to Go implementation/test behavior in the changed command, pipeline, runtime, ledger/outbox, reviewplan, and thread-response paths.; Workspace was read-only, so findings are based on static inspection of the diff and nearby tests rather than running or modifying code.; assigned files were neither inspected nor skipped: internal/gitprovider/types_test.go
policies:conventions incomplete_skipped cmd/cr/main.go, docs/architecture.md, docs/development.md, docs/init-ux-contract.md, docs/llm-task-artifacts.md, internal/approvaloverride/approvaloverride.go, internal/cmd/cmdruntime/cmdruntime.go, internal/cmd/respondcmd/respondcmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/runtime_provider.go, internal/cmd/reviewcmd/runtime_selection.go, internal/runartifact/runartifact.go, internal/threadrespond/threadrespond.go internal/approvaloverride/approvaloverride_test.go, internal/cmd/configcmd/configcmd.go, internal/cmd/configcmd/configcmd_test.go, internal/cmd/datacmd/datacmd.go, internal/cmd/respondcmd/respondcmd_test.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/gateio/gateio.go, internal/ledger/ledger.go, internal/outbox/outbox.go, internal/reviewplan/reviewplan.go, internal/reviewplan/summary.go Read-only sandbox; I inspected code and docs only and did not run validation commands.; Review scoped to repo-local and visible shared-convention drift in the assigned diff; I did not have a checked-out local copy of ../cli-common/docs or ../.github, so findings are limited to conventions visible in this repository and assignment context.
structure:repo-health complete_broad cmd/cr/main.go, docs/architecture.md, docs/development.md, internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/architecture/thread_lifecycle_test.go, internal/cmd/cmdruntime/cmdruntime.go, internal/cmd/respondcmd/respondcmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/runtime_provider.go, internal/ledger/ledger.go, internal/llmlifecycle/lifecycle.go, internal/outbox/outbox.go, internal/pipeline/pipeline.go, internal/review/review.go, internal/reviewplan/reviewplan.go, internal/runartifact/runartifact.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go, internal/threadrespond/threadrespond.go unavailable Scope was intentionally limited to structural/package-boundary, durable-state, and enforceability risks in the assigned diff.; Workspace is read-only in this run, so I did not execute or modify tests; findings are based on code and doc inspection.

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 7m 55s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, policies:conventions, structure:repo-health
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 7m 55s wall · 14m 29s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 33s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 6m 20s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 09s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 4m 04s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 20s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

Comment thread internal/threadrespond/threadrespond.go
Comment thread cmd/cr/main.go Outdated
mecmd.Register,
agentscmd.Register,
reviewcmd.Register,
func(cmd *cobra.Command, opts *root.Options) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cr respond is the only top-level command in this diff that cannot register itself through the normal Register hook, so the root builder now has to special-case its default runtime wiring with an inline closure. That drifts from the repo's existing command layout, where cmd/cr stays thin and each command package owns its own default registration (reviewcmd.Register, mecmd.Register, agentscmd.Register, etc.). Add respondcmd.Register(rootCmd, opts) that binds the default factory internally, then pass respondcmd.Register to root.RegisterAll so future root builders and tests do not need command-specific wiring knowledge.

Reply inline to this comment.

RegisterWithFactory(rootCmd, opts, newRuntime)
}

// NewRuntime builds the concrete runtime used by review lifecycle commands.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reusable runtime wiring should live behind a neutral seam, not be owned by a sibling command package. This PR exports reviewcmd.NewRuntime, builds both the review and respond runners inside reviewcmd, and then injects that factory into respondcmd from cmd/cr, so cr respond now depends on reviewcmd as the owner of shared runtime construction. That makes future command work funnel through a review-specific package and weakens the repo's command/package ownership map. Move newRuntime, runtimeProvider, and the shared runner construction into internal/cmd/cmdruntime or a dedicated non-command runtime package, then have both reviewcmd and respondcmd consume that shared factory directly.

Reply inline to this comment.

return threadanalysis.ResponseActions(analyses)
}

func allocateOrResumeRun(ctx context.Context, opts Options, req Request, pr gitprovider.PR, prKey string, mode ledger.PostMode) (ledger.Run, runartifact.Paths, error) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run allocation/resume plus artifact-marker validation is now a durable lifecycle contract, so it should have one implementation. threadrespond.allocateOrResumeRun/findIncompleteRun duplicates the same responsibilities already implemented in internal/pipeline (findIncompleteDryRun, AllocateRun, runartifact.WriteMarker, marker checks, newest-attempt selection), but with a different query path and separate branching. That duplication will drift as rerun/resume/retention rules evolve, leaving review and respond with subtly different durable-state behavior even though the architecture doc presents one shared lifecycle model. Extract a shared run-lifecycle helper that owns run lookup, artifact-path derivation, marker write/validation, and attempt selection for both review and thread-response kinds, then call it from both packages.

Reply inline to this comment.

Comment thread docs/init-ux-contract.md Outdated
profile context, for example:

- **open-cli-collective-rianjs-bot (GitHub App reviewer)**
- **open-cli-collective-rianjs-bot (PAT reviewer)**

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fallback label example now misclassifies the open-cli-collective-rianjs-bot reviewer as a PAT reviewer, but the surrounding contract and repo tests treat that identity as the GitHub App example. Leaving it as-is teaches the wrong chooser copy for a durable UX contract. Change the example back to open-cli-collective-rianjs-bot (GitHub App reviewer) or swap in a PAT-shaped credential name instead.

Reply inline to this comment.

return result, fmt.Errorf("threadrespond: outbox limiter is required")
}

postResult, err := outbox.Post(ctx, outbox.Options{

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Live threadrespond.Result objects return stale planned-action state. After outbox.Post, the code reloads only the run row, so result.PlannedActions still contains the pre-post slice; the retry path has the same problem and also calls resetFailedTerminalResponseActions on struct copies before assigning them back. The store is updated correctly, but callers of the new threadrespond API can still see pending or failed_terminal statuses after a successful retry/post, and the current tests only verify the ledger rows so this regression passes. Reload planned actions from the store before returning live/retry results and add assertions on result.PlannedActions in the live/retry tests.

Reply inline to this comment.

@rianjs-bot

rianjs-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown

Automated PR Review

Reviewed commit: 43ea71b6d18a
Profile: codex-rianjs-bot - Posting as: rianjs-bot[bot]

Summary

Reviewer Findings
go:implementation-tests 0
structure:repo-health 2
policies:conventions 0
structure:repo-health (2 findings)

Major - internal/cmd/respondcmd/respondcmd.go:17

The reusable lifecycle runtime is still owned by a sibling command package: respondcmd imports reviewcmd solely to call reviewcmd.NewRuntime. That violates the command-package boundary this refactor is trying to create, because every future lifecycle command now has to depend on reviewcmd internals or duplicate its setup logic. The impact is structural drift around command ownership, with reviewcmd becoming a de facto shared service package. Move NewRuntime/newRuntime and the related provider/runtime construction helpers into internal/cmd/cmdruntime or another non-command package, then have both reviewcmd and respondcmd depend on that shared constructor.

Major - internal/pipeline/pipeline.go:811

Dry-run resume is still implemented as a command-local full-table scan even though this refactor introduced a scoped ledger seam for exactly this lifecycle problem. findIncompleteDryRun calls store.ListRuns() and reimplements the PR/head/profile/posting-identity filtering itself, while cr respond now uses ledger.ListRunsForHeadScope. That leaves review and respond on parallel resume selectors that can drift as run-kind, marker, or retry rules evolve, and it scales resume cost with the entire local run history. Extend the pipeline store interface to use ListRunsForHeadScope, and ideally extract a shared incomplete-run selector so both commands apply the same resume criteria and marker validation from one place.

Reviewer Coverage

Reviewer Status Inspected Skipped Constraints
go:implementation-tests incomplete_skipped internal/architecture/llm_lifecycle_test.go, internal/architecture/model_resolution_test.go, internal/architecture/thread_lifecycle_test.go, internal/cmd/respondcmd/respondcmd.go, internal/cmd/respondcmd/respondcmd_test.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/cmd/reviewcmd/runtime_provider.go, internal/gitprovider/github/review_authority.go, internal/gitprovider/operation.go, internal/gitprovider/provider.go, internal/gitprovider/types.go, internal/ledger/ledger.go, internal/ledger/ledger_test.go, internal/outbox/outbox.go, internal/outbox/outbox_test.go, internal/pipeline/pipeline.go, internal/pipeline/pipeline_test.go, internal/reviewplan/reviewplan.go, internal/runartifact/runartifact.go, internal/stagemodel/resolver.go, internal/threadanalysis/threadanalysis.go, internal/threadanalysis/threadanalysis_test.go, internal/threadcontext/threadcontext.go, internal/threadcontext/threadcontext_test.go, internal/threadrespond/threadrespond.go, internal/threadrespond/threadrespond_test.go internal/gitprovider/github/rest_writes.go, internal/llmlifecycle/lifecycle.go, internal/llmlifecycle/lifecycle_test.go, internal/review/review.go, internal/runartifact/runartifact_test.go Review limited to assigned Go implementation and test files in the artifact checkout.; Workspace was read-only, so verification was limited to static inspection of the changed code and tests.
structure:repo-health complete_broad cmd/cr/main.go, docs/architecture.md, docs/development.md, docs/llm-task-artifacts.md, internal/cmd/cmdruntime/cmdruntime.go, internal/cmd/respondcmd/respondcmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/runtime_provider.go, internal/ledger/ledger.go, internal/llmlifecycle/lifecycle.go, internal/pipeline/pipeline.go, internal/runartifact/runartifact.go, internal/threadanalysis/threadanalysis.go, internal/threadcontext/threadcontext.go, internal/threadrespond/threadrespond.go unavailable Review was static only; I did not execute tests or commands that mutate the workspace because the environment is read-only.; Scope was intentionally limited to the assigned structural surfaces and their immediate enforcement tests.
policies:conventions complete_broad cmd/cr/main.go, docs/architecture.md, docs/development.md, docs/init-ux-contract.md, docs/llm-task-artifacts.md, internal/approvaloverride/approvaloverride.go, internal/cmd/respondcmd/respondcmd.go, internal/cmd/reviewcmd/reviewcmd.go, internal/cmd/reviewcmd/runtime_provider.go, internal/gitprovider/github/review_authority.go, internal/ledger/ledger.go, internal/pipeline/pipeline.go, internal/runartifact/runartifact.go unavailable Review limited to the assigned files and repo-local conventions visible in this checkout.; Shared Open CLI Collective source-of-truth docs were not inspectable in this environment: outbound network is restricted and the local sibling copy ../cli-common/docs was not present, so I did not raise concerns that depend on unseen shared-policy text.
unassigned incomplete_unassigned unavailable internal/approvaloverride/approvaloverride_test.go, internal/cmd/benchmarkcmd/benchmarkcmd_test.go, internal/cmd/benchmarkcmd/select.go, internal/cmd/cmderr/cmderr.go, internal/cmd/configcmd/configcmd.go, internal/cmd/configcmd/configcmd_test.go, internal/cmd/credentialcmd/credentialcmd_test.go, internal/cmd/datacmd/datacmd.go, internal/cmd/mecmd/mecmd.go, internal/cmd/mecmd/mecmd_test.go, internal/cmd/noleak/noleak_test.go, internal/cmd/respondcmd/respondcmd_test.go, internal/cmd/reviewcmd/adapter_progress.go, internal/cmd/reviewcmd/pipeline_progress.go, internal/cmd/reviewcmd/provider_progress.go, internal/cmd/reviewcmd/reviewcmd_test.go, internal/cmd/reviewcmd/runtime_selection.go, internal/gateio/gateio.go, internal/gitprovider/errors.go, internal/gitprovider/fake.go, internal/gitprovider/fake_test.go, internal/gitprovider/github/client.go, internal/gitprovider/github/client_test.go, internal/gitprovider/github/rest_writes_test.go, internal/gitprovider/types_test.go, internal/llm/adapter.go, internal/llm/adapter_test.go, internal/plannedactions/plannedactions.go, internal/plannedactions/plannedactions_test.go, internal/review/review_test.go, internal/reviewplan/reviewplan_test.go, internal/reviewplan/summary.go, internal/stagemodel/resolver_test.go changed files were not assigned to a selected reviewer

0 PR discussion threads considered. 0 summarized; 0 resolved.


Completed in 9m 30s | unavailable | gpt-5.4 | cr dev
Field Value
Model gpt-5.4
Reviewers go:implementation-tests, structure:repo-health, policies:conventions
Engine codex_cli · gpt-5.4
Reviewed by cr · rianjs-bot[bot]
Duration 9m 30s wall · 14m 06s compute
Cost unavailable
Tokens unavailable

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost Duration
orchestrator-selection gpt-5.4 unavailable unavailable unavailable unavailable unavailable 23s
go:implementation-tests gpt-5.4 unavailable unavailable unavailable unavailable unavailable 7m 58s
structure:repo-health gpt-5.4 unavailable unavailable unavailable unavailable unavailable 3m 46s
policies:conventions gpt-5.4 unavailable unavailable unavailable unavailable unavailable 1m 50s
orchestrator-rollup gpt-5.4 unavailable unavailable unavailable unavailable unavailable 7s

@rianjs-bot rianjs-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR review completed with outcome: request_changes.

Comment thread internal/pipeline/pipeline.go
"github.com/open-cli-collective/codereview-cli/internal/cmd/cmderr"
"github.com/open-cli-collective/codereview-cli/internal/cmd/cmdruntime"
"github.com/open-cli-collective/codereview-cli/internal/cmd/exitcode"
"github.com/open-cli-collective/codereview-cli/internal/cmd/reviewcmd"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reusable lifecycle runtime is still owned by a sibling command package: respondcmd imports reviewcmd solely to call reviewcmd.NewRuntime. That violates the command-package boundary this refactor is trying to create, because every future lifecycle command now has to depend on reviewcmd internals or duplicate its setup logic. The impact is structural drift around command ownership, with reviewcmd becoming a de facto shared service package. Move NewRuntime/newRuntime and the related provider/runtime construction helpers into internal/cmd/cmdruntime or another non-command package, then have both reviewcmd and respondcmd depend on that shared constructor.

Reply inline to this comment.

@rianjs rianjs marked this pull request as ready for review June 27, 2026 13:18
@rianjs rianjs merged commit 53db143 into main Jun 27, 2026
10 checks passed
@rianjs rianjs deleted the feat/thread-lifecycle-architecture branch June 27, 2026 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant