diff --git a/cmd/cr/main.go b/cmd/cr/main.go index d28d8a56..bd0ec582 100644 --- a/cmd/cr/main.go +++ b/cmd/cr/main.go @@ -18,6 +18,7 @@ import ( "github.com/open-cli-collective/codereview-cli/internal/cmd/datacmd" "github.com/open-cli-collective/codereview-cli/internal/cmd/exitcode" "github.com/open-cli-collective/codereview-cli/internal/cmd/mecmd" + "github.com/open-cli-collective/codereview-cli/internal/cmd/respondcmd" "github.com/open-cli-collective/codereview-cli/internal/cmd/reviewcmd" "github.com/open-cli-collective/codereview-cli/internal/cmd/root" "github.com/open-cli-collective/codereview-cli/internal/cmd/sessionscmd" @@ -47,6 +48,16 @@ func buildRootCommand(stdin io.Reader, stdout, stderr io.Writer) (*cobra.Command Stdout: stdout, Stderr: stderr, }) - root.RegisterAll(cmd, opts, configcmd.Register, credentialcmd.Register, mecmd.Register, agentscmd.Register, reviewcmd.Register, sessionscmd.Register, datacmd.Register, benchmarkcmd.Register) + root.RegisterAll(cmd, opts, + configcmd.Register, + credentialcmd.Register, + mecmd.Register, + agentscmd.Register, + reviewcmd.Register, + respondcmd.Register, + sessionscmd.Register, + datacmd.Register, + benchmarkcmd.Register, + ) return cmd, opts } diff --git a/docs/architecture.md b/docs/architecture.md new file mode 100644 index 00000000..4275f470 --- /dev/null +++ b/docs/architecture.md @@ -0,0 +1,118 @@ +# Architecture Decisions + +This document records review-pipeline boundaries that are intended to stay +stable as the implementation evolves. + +## Durable LLM Execution Boundary + +All production structured LLM actions must flow through +`internal/llmlifecycle`. Callers describe the task ID, phase, prompt input, +structured output contract, model/effort, artifact paths, and run/session +scope. The lifecycle runner owns provider invocation, structured-output retry, +provider-session resume, pre-flight reuse, task metadata, accepted-output +artifacts, session persistence, progress breadcrumbs, and failure +classification. + +The final commit marker for a task is +`llm-tasks//metadata.json`. Writers must publish validated +output or failed-attempt payloads first, persist the ledger session row when +the task is run-owned, and write metadata last. Resume code must trust only the +final metadata path, never temporary files or partial payloads. + +New LLM-backed components must not call `internal/llm` structured helpers +directly. They should call `llmlifecycle` through explicit, fakeable +dependencies in unit tests and should return domain results rather than +posting comments or mutating provider state. +`internal/architecture/llm_lifecycle_test.go` enforces that direct structured +helper calls stay inside `internal/llm` and `internal/llmlifecycle`. Direct +provider-adapter calls should also stay behind `llmlifecycle` for production +structured tasks; code review owns that broader boundary until a stronger +static guardrail exists. + +Most lifecycle tasks are run-owned and must have a matching ledger session row +when a provider session is available. Caller-owned no-run tasks are allowed only +where no review run exists yet, such as `SelectionOnly` and the pre-run approval +override classifier. Those tasks may reuse artifact metadata without a ledger +session row, but they still use the same metadata schema and lifecycle runner. + +## Stage Model Resolution + +Runtime model choice must be resolved through `internal/stagemodel`. Code that +executes an LLM stage must not hard-code model IDs and must not call +`config.ResolveModelTier` directly. + +`stagemodel.ResolveStageModel` is the single runtime path from profile +preferences and command overrides to a concrete model and effort. The request +must include the named stage, requested tier, default effort, and any explicit +operator override. The resolver applies user profile `llm.model_map` values, +built-in provider defaults, and configured tier floors before returning the +concrete runtime choice. + +This boundary exists so model catalog data, provider capabilities, token costs, +and profile-level tier floors can be added without touching individual review +stages. Runtime hard-coding bypasses user preference and is a bug. + +Reviewer `agent.model_id` is an exact provider-specific model override. It must +still enter runtime execution through `stagemodel.ResolveStageModel` as a model +override rather than bypassing the resolver, but it intentionally bypasses the +tier map because the agent author selected a concrete model. + +The direct `config.ResolveModelTier` exception is config inspection and the +resolver implementation itself. +`internal/architecture/model_resolution_test.go` enforces that direct +`config.ResolveModelTier` calls stay inside approved packages. Hard-coded +runtime model IDs remain a code-review concern until model-catalog guardrails +exist. + +## Git Provider Writes + +Provider writes have one durable path: planned actions in the ledger followed +by outbox execution. Commands and domain analyzers should not post comments, +reply to review threads, resolve threads, submit reviews, or mutate provider +state directly. + +This keeps markers, retries, reconciliation, idempotency, and resume behavior in +one place. New commands such as `cr respond` should produce planned thread +actions and let the reviewplan/ledger/outbox flow perform provider writes. + +## Inline Thread Lifecycle + +Inline PR discussion threads are domain input, not provider-specific prompt +data. The intended decomposition is: + +- `internal/threadcontext` normalizes `gitprovider.InlineThread`, detects + codereview-authored finding threads, detects latest human replies, strips + shared markers, collapses resolved threads to the latest sanitized comment, + and produces file-scoped reviewer context. +- `internal/threadanalysis` accepts normalized thread context and returns + reusable domain decisions: thread ID, decision, reply body, summary, resolve + flag, and rationale. + +Resolved inline threads should not be reprocessed as full conversations on +every review. Their durable context is the latest sanitized comment on the +resolved thread, with marker metadata retained when the comment contains a +codereview thread-summary marker. Reviewer prompts should receive compact +file-scoped summaries so agents avoid re-raising issues that have already been +discussed and resolved. + +`cr review` and `cr respond` should share the same normalization, filtering, +model resolution, LLM execution, and action-planning components. `cr respond` +is a command-shaped reuse of the thread-response portion of the review +pipeline, not a separate posting system. + +## Retention And Cleanup + +Durable run-owned LLM tasks, thread-analysis results, and artifacts must be +owned by a run and must be safe to delete through the existing data lifecycle +commands. Database rows should reference `runs(run_id)` with cascade delete +semantics, and large artifacts should live under the run artifact directory. + +When the retention window elapses, normal prune/GC commands should remove these +results along with the rest of the run. If a user deletes retained state, a +future review may need to spend time and tokens recreating it; that is the +expected tradeoff for user-controlled local data retention. + +Caller-owned no-run task artifacts must live under the configured data root or +an explicit caller artifact root. If no run is eventually allocated for that +artifact root, the directory is treated as orphaned local data and must be safe +for `cr data purge` to remove. diff --git a/docs/development.md b/docs/development.md index 2bbf9a2a..c2817562 100644 --- a/docs/development.md +++ b/docs/development.md @@ -7,14 +7,20 @@ Collective standards and automation remain canonical in their own repositories. codereview-cli is the Open CLI Collective code-review CLI and ships the `cr` binary. It provides configuration and credential commands, trusted-agent -inspection, dry-run and live pull-request review orchestration, named LLM -session management, and local data lifecycle commands. +inspection, dry-run and live pull-request review orchestration, inline thread +response handling through `cr respond`, named LLM session management, and local +data lifecycle commands. The current Go code is a Cobra command tree in `internal/cmd/*` with a thin `cmd/cr` entrypoint, shared exit-code mapping in `internal/cmd/exitcode`, and version plumbing in `internal/version`. Review orchestration is split across -`internal/pipeline`, `internal/reviewrun`, `internal/reviewplan`, -`internal/outbox`, `internal/gate`, and `internal/gateio`. +`internal/pipeline`, `internal/reviewrun`, `internal/threadrespond`, +`internal/reviewplan`, `internal/outbox`, `internal/gate`, and +`internal/gateio`. + +Architecture guardrails for LLM execution, model resolution, Git provider +writes, inline thread lifecycle, and retention live in +[`docs/architecture.md`](architecture.md). Within `internal/pipeline`, the public entry points are `DryRun`, `Live`, and `SelectionOnly`. `DryRun` and `Live` execute the full review pipeline, while @@ -80,7 +86,8 @@ make clean # remove build artifacts state/config adapters in `internal/config`, `internal/ledger`, and `internal/statepaths`, provider/LLM adapters in their owning packages, and review posting/gating in `internal/outbox`, `internal/gate`, and - `internal/gateio`. + `internal/gateio`, and response-only inline discussion handling in + `internal/threadrespond`. ## Interactive Init Notes diff --git a/docs/init-ux-contract.md b/docs/init-ux-contract.md index 2ac3d200..7627b89c 100644 --- a/docs/init-ux-contract.md +++ b/docs/init-ux-contract.md @@ -29,6 +29,10 @@ Interactive `init` should use these primary user-facing terms: and GitHub Enterprise hosts such as `github.mycompany.com`. - **Reviewer entity**: the actor that posts `COMMENT`, `APPROVE`, or `REQUEST_CHANGES` on pull requests. + On GitHub, a reviewer entity must resolve to a repository-authorized + identity for `APPROVE` or `REQUEST_CHANGES` to count toward PR state. Live + review warns and continues when the selected reviewer can write a review + object but GitHub may not treat it as an opinionated review. - **LLM runtime**: the way reviewer agents run and authenticate, such as Claude CLI subscription auth, Codex CLI subscription auth, Pi local runtime, or a direct API-key-backed provider path. @@ -198,8 +202,8 @@ follow-up credential work without leaking values. All credential-bearing init flows should show equivalent non-secret destination context before collecting secret values. This includes repository-access Git -credentials, reviewer PAT/GitHub App credentials, and LLM API keys handled by -the shared credential collector. +credentials, reviewer PAT credentials, reviewer GitHub App private keys, and +LLM API keys handled by the shared credential collector. Destination summaries should include: @@ -251,7 +255,7 @@ contextual variants of the same fallback choice, such as: - **Post as rianjs (GitHub PAT)** - **Post as acme-review-bot (GitHub App)** -- **Post using this profile's Git account (GitHub PAT)** +- **Post using this profile's Git account (GitHub PAT or GitHub App)** This means: @@ -310,6 +314,9 @@ and saved config must stay stable: profile selects it with `profiles..reviewer.kind: entity` and `profiles..reviewer.entity`. The Git-account fallback maps to `profiles..reviewer.kind: git_identity`. + A GitHub reviewer entity is not just posting credentials: the resolved + identity must also have repository authority for GitHub to count blocking or + approving reviews toward the PR decision. - **Review profile** maps to one saved entry under `profiles.`. This section is intentionally high level. The detailed field inventory and diff --git a/docs/llm-task-artifacts.md b/docs/llm-task-artifacts.md index e36735d0..0907c983 100644 --- a/docs/llm-task-artifacts.md +++ b/docs/llm-task-artifacts.md @@ -5,7 +5,7 @@ reviewer, and rollup calls must be isolated from each other so one failed task does not erase successful upstream work or force unrelated LLM sessions to run again. -Task artifacts live under a run artifact directory: +Task artifacts usually live under a run artifact directory: ```text llm-tasks// @@ -18,6 +18,8 @@ llm-tasks// Raw failed-attempt files are named `