
Add regression coverage for worker reload contracts #123

Merged
leynos merged 99 commits into main from issue-16-terminal-job-state-persistence
Apr 15, 2026

Conversation

@leynos
Owner

@leynos leynos commented Apr 7, 2026

Replace the SIGHUP scaffolding with executable integration tests and add worker-orchestrator contract checks plus terminal state persistence characterisation tests so these boundaries stay aligned.

Summary by Sourcery

Add regression tests to pin down worker–orchestrator HTTP contracts and terminal job-state persistence, and replace the previous SIGHUP scaffolding test with concrete webhook reload integration tests.

Enhancements:

  • Refactor worker test setup to share common dependency construction helpers across tests.
  • Introduce a capturing in-memory database implementation to observe job persistence behavior without a real backend.

Tests:

  • Add characterisation tests ensuring terminal job-state transitions correctly persist status and result events to the job store, including double-transition rejection.
  • Replace placeholder SIGHUP reload test scaffolding with real integration tests that exercise webhook server address changes, secret hot-swapping, and rollback on bind failures via the in-process HTTP server.
  • Add worker–orchestrator contract tests asserting path/method alignment, auth behavior, JSON payload round-trips, and finish-reason compatibility between worker client types and orchestrator API routes.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 7, 2026

Note

Reviews paused

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Consolidate worker terminal-state transitions into a single durable persistence call with in-memory rollback on failure; add test harnesses, null and capturing DB doubles, and new tests; introduce a listener-based WebhookServer lifecycle; extract a delegate-based agent dispatcher with phased tool execution; decompose thread-ops into focused modules.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Worker Terminal & Context | src/worker/job.rs, src/context/state.rs | Centralise terminal transitions via transition_terminal_state; perform one atomic DB write with persist_terminal_result_and_status; expose job/context accessors and add set_state_rollback for in-memory rollback when persistence fails; refactor terminal mark helpers and update tests. |
| DB Traits & Backends | src/db/traits/database.rs, src/db/traits/mod.rs, src/db/mod.rs, src/db/forwarders.rs, src/db/libsql/..., src/db/postgres/..., src/history/store/jobs.rs | Add TerminalJobPersistence and persist_terminal_result_and_status to trait surface and dyn forwarder; implement transactional atomic persistence in libSQL/Postgres/history backends with source/row-count validation and unit/integration tests. |
| Null / Capturing Test DB | src/testing/null_db/mod.rs, src/testing/null_db/null_database/*, src/testing/null_db/capturing_store/* | Add NullDatabase (no-op, synthetic stable UUIDs) and CapturingStore recording status/event calls; implement delegation layer that captures terminal persistence/status updates for assertions. |
| Worker Test Harness & DTO Tests | src/testing/worker_harness.rs, src/testing/mod.rs, tests/worker_orchestrator_contract.rs, tests/worker_orchestrator_json_shapes.rs | Add StubLlm, worker builders (make_worker*), capturing-store variants, terminal-persistence assertion helpers (including insta snapshots), and contract/JSON-shape tests validating route helpers and DTO symmetry. |
| Webhook Server Listener API & Tests | src/channels/webhook_server.rs, tests/webhook_server.rs, tests/infrastructure/sighup_reload.rs, tests/support/webhook_helpers.rs, tests/support/mod.rs, tests/infrastructure.rs | Add start_with_listener/restart_with_listener, spawn_on_listener and swap_listener restart kernel with restore-on-failure semantics; derive config.addr from listener; convert SIGHUP-style tests into in-process Tokio tests with shared fixtures. |
| Agent Dispatcher: Delegate & Tool Exec | src/agent/dispatcher/delegate/..., src/agent/dispatcher/mod.rs, src/agent/dispatcher/delegate/llm_hooks.rs | Extract ChatDelegate, add LLM lifecycle hooks, split tool execution into preflight/execution/postflight/recording phases; add ToolCallSpec and parallel/batched execution with JoinSet; update call sites and re-exports. |
| Thread Ops Decomposition | src/agent/thread_ops.rs, src/agent/thread_ops/* | Decompose thread operations into modules (control, hydration, dispatch, turn_preparation, turn_execution, compaction/checkpointing, finalisation, approval, persistence); add TurnPersistContext, UserTurnRequest, prepare/validate/compact/checkpoint flows and persistence helpers. |
| Session / Turn Tool Result APIs | src/agent/session.rs, src/agent/session/tests/record_tool_result_content.rs | Add indexed tool-result APIs and JSON-aware parsing for tool output; provide convenience helpers to record parsed content and add unit tests for parsing/recording. |
| Types, Defaults & Dev-deps | src/worker/api/types.rs, src/llm/provider.rs, Cargo.toml | Derive Serialize for PromptResponse; add Default for CompletionResponse/ToolCompletionResponse and FinishReason::Stop as default; adjust dev-deps (insta json feature, proptest pinning, add delegate). |
| Docs & Test Support | docs/testing-abstractions.md, docs/developers-guide.md, docs/webhook-server-design.md, tests/support/* | Add testing-abstractions guide and webhook listener lifecycle docs; add webhook test helpers and wiring; update test support module declarations and signature-pinning helpers. |

Sequence Diagram(s)

sequenceDiagram
  participant Worker as Worker
  participant Context as ContextManager
  participant DB as Database
  participant Broad as Broadcaster

  Worker->>Context: transition_terminal_state(job_id) — capture prior state, set in-memory terminal
  alt prior state non-terminal
    Worker->>DB: persist_terminal_result_and_status(TerminalJobPersistence)
    alt persistence success
      DB-->>Worker: Ok
      Worker->>Broad: broadcast SSE "result" event (after durable write)
      Worker-->>Worker: return Ok
    else persistence failure
      DB-->>Worker: Err
      Worker->>Context: set_state_rollback(previous)
      Worker-->>Worker: return persistence error
    end
  else prior state already terminal
    Worker-->>Worker: reject duplicate terminal transition (no DB call)
  end
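
The flow in the diagram above can be sketched in plain Rust. This is a minimal, synchronous stand-in and not the code in src/worker/job.rs: the `Worker`, `TerminalStore`, and `FlakyStore` types, the `u64` job ids, and the `broadcasts` vector are assumptions made for illustration. Only the ordering (capture the prior state, set the terminal state in memory, persist once, roll back on failure, broadcast after the durable write) is taken from this PR.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobState { Pending, InProgress, Completed, Failed, Stuck }

impl JobState {
    fn is_terminal(self) -> bool {
        matches!(self, JobState::Completed | JobState::Failed | JobState::Stuck)
    }
}

#[derive(Debug, PartialEq, Eq)]
enum TransitionError {
    UnknownJob,
    AlreadyTerminal(JobState),
    Persistence(String),
}

/// Hypothetical stand-in for the durable job store.
trait TerminalStore {
    fn persist_terminal(&mut self, job_id: u64, state: JobState) -> Result<(), String>;
}

struct Worker<S: TerminalStore> {
    states: HashMap<u64, JobState>,   // in-memory job contexts
    store: S,                         // durable persistence
    broadcasts: Vec<(u64, JobState)>, // stands in for the SSE broadcaster
}

impl<S: TerminalStore> Worker<S> {
    fn apply_terminal_transition(&mut self, job_id: u64, target: JobState) -> Result<(), TransitionError> {
        debug_assert!(target.is_terminal());
        // Capture the prior state and reject duplicate terminal transitions
        // before touching the store.
        let previous = *self.states.get(&job_id).ok_or(TransitionError::UnknownJob)?;
        if previous.is_terminal() {
            return Err(TransitionError::AlreadyTerminal(previous));
        }
        // Set the terminal state in memory, then attempt the single durable write.
        self.states.insert(job_id, target);
        if let Err(reason) = self.store.persist_terminal(job_id, target) {
            // Roll the in-memory state back so it never runs ahead of the store.
            self.states.insert(job_id, previous);
            return Err(TransitionError::Persistence(reason));
        }
        // Broadcast only after the durable write has succeeded.
        self.broadcasts.push((job_id, target));
        Ok(())
    }
}

/// Test double that can be told to fail its next writes.
struct FlakyStore { fail: bool, writes: Vec<(u64, JobState)> }

impl TerminalStore for FlakyStore {
    fn persist_terminal(&mut self, job_id: u64, state: JobState) -> Result<(), String> {
        if self.fail {
            return Err("disk full".to_string());
        }
        self.writes.push((job_id, state));
        Ok(())
    }
}
```

With `FlakyStore { fail: true, .. }` a failed transition leaves the job in its previous state and emits no broadcast; a second terminal transition on an already terminal job is rejected without a store call.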

Suggested labels

Issue

Poem

🔧 Persist and roll back, keep state in line,
🧪 Capturing doubles log each verdict fine,
🌐 Hand over listeners, restart with care,
🤖 Delegate the chat, split tools everywhere,
🔁 Threads hydrate, checkpoint — tests declare.

@sourcery-ai
Contributor

sourcery-ai Bot commented Apr 7, 2026

Reviewer's Guide

Adds regression-style tests to lock in worker/orchestrator HTTP contracts and worker terminal state persistence, refactors worker test setup to share dependencies, and replaces placeholder SIGHUP reload tests with executable integration-style tests over WebhookServer/HttpChannel.

Sequence diagram for worker terminal state persistence characterisation

sequenceDiagram
    actor Test
    participant Worker
    participant ContextManager
    participant CapturingStore

    Test->>Worker: make_worker_with_capturing_store()
    Worker-->>Test: Worker, CapturingStore

    Test->>Worker: transition_to_in_progress()
    Worker->>ContextManager: update_context(job_id, set state InProgress)
    ContextManager-->>Worker: JobContext(InProgress)
    Worker-->>Test: ok

    Test->>Worker: mark_failed("budget exceeded")
    Worker->>ContextManager: update_context(job_id, transition_to Failed)
    ContextManager-->>Worker: JobContext(Failed)
    Worker->>CapturingStore: update_job_status(job_id, Failed, "budget exceeded")
    CapturingStore-->>Worker: ok
    Worker->>CapturingStore: save_job_event(job_id, result, {status: failed})
    CapturingStore-->>Worker: ok
    Worker-->>Test: ok

    Test->>ContextManager: get_context(job_id)
    ContextManager-->>Test: JobContext(Failed)

    Test->>CapturingStore: captured_calls()
    CapturingStore-->>Test: [UpdateJobStatus, SaveJobEvent]
    Test->>Test: assert exactly one status update and one result event
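
The capturing test double in the diagram above might look roughly like the following. This is a simplified sketch under stated assumptions: the real CapturingStore implements the project's Database traits with Uuid job ids and async methods, whereas the `u64` ids, string payloads, and synchronous method signatures here are hypothetical simplifications.

```rust
use std::sync::Mutex;

/// One recorded store invocation (field shapes are illustrative assumptions).
#[derive(Debug, Clone, PartialEq, Eq)]
enum CapturedCall {
    UpdateJobStatus { job_id: u64, status: String, failure_reason: Option<String> },
    SaveJobEvent { job_id: u64, event_type: String, data: String },
}

/// In-memory double that records calls instead of writing to a backend.
#[derive(Default)]
struct CapturingStore {
    calls: Mutex<Vec<CapturedCall>>,
}

impl CapturingStore {
    fn update_job_status(&self, job_id: u64, status: &str, failure_reason: Option<&str>) {
        self.calls.lock().expect("capture lock poisoned").push(CapturedCall::UpdateJobStatus {
            job_id,
            status: status.to_string(),
            failure_reason: failure_reason.map(str::to_string),
        });
    }

    fn save_job_event(&self, job_id: u64, event_type: &str, data: &str) {
        self.calls.lock().expect("capture lock poisoned").push(CapturedCall::SaveJobEvent {
            job_id,
            event_type: event_type.to_string(),
            data: data.to_string(),
        });
    }

    /// Snapshot of everything recorded so far, in call order.
    fn captured_calls(&self) -> Vec<CapturedCall> {
        self.calls.lock().expect("capture lock poisoned").clone()
    }
}
```

A test can then drive the worker, call `captured_calls()`, and count the `UpdateJobStatus` and `SaveJobEvent` entries to assert that exactly one of each was recorded.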

Class diagram for worker terminal state persistence test harness

classDiagram
    class Worker {
        +Uuid job_id
        +WorkerDeps deps
        +mark_completed() Result
        +mark_failed(reason)
        +mark_stuck(reason)
        +context_manager() ContextManager
    }

    class WorkerDeps {
        +ContextManager context_manager
        +StubLlm llm
        +SafetyLayer safety
        +ToolRegistry tools
        +Database store
        +HookRegistry hooks
        +Duration timeout
        +bool use_planning
        +Sender sse_tx
        +ApprovalContext approval_context
        +HttpInterceptor http_interceptor
    }

    class ContextManager {
        +create_job(user, description) Result~Uuid~
        +get_context(job_id) Result~JobContext~
        +update_context(job_id, updater) Result~JobContext~
    }

    class JobContext {
        +JobState state
        +transition_to(state, reason) Result
    }

    class JobState {
        <<enum>>
        Pending
        InProgress
        Completed
        Failed
        Stuck
    }

    class CapturedCall {
        <<enum>>
        UpdateJobStatus
        SaveJobEvent
    }

    class CapturingStore {
        +Mutex~Vec~ calls
        +new() CapturingStore
        +captured_calls() Vec~CapturedCall~
        +doc_not_found(doc_type) WorkspaceError
    }

    class Database {
        <<interface>>
    }

    class NativeDatabase {
        <<interface>>
        +run_migrations() Result
    }

    class NativeJobStore {
        <<interface>>
        +save_job(ctx) Result
        +get_job(id) Result~JobContext~
        +update_job_status(id, status, failure_reason) Result
        +mark_job_stuck(id) Result
        +get_stuck_jobs() Result~Vec~
        +save_action(job_id, action) Result
        +get_job_actions(job_id) Result~Vec~
        +record_llm_call(record) Result~Uuid~
        +save_estimation_snapshot(params) Result~Uuid~
        +update_estimation_actuals(params) Result
    }

    class NativeSandboxStore {
        <<interface>>
        +save_sandbox_job(job) Result
        +get_sandbox_job(id) Result
        +list_sandbox_jobs() Result
        +update_sandbox_job_status(params) Result
        +cleanup_stale_sandbox_jobs() Result
        +sandbox_job_summary() Result
        +list_sandbox_jobs_for_user(user_id) Result
        +sandbox_job_summary_for_user(user_id) Result
        +sandbox_job_belongs_to_user(job_id, user_id) Result
        +update_sandbox_job_mode(id, mode) Result
        +get_sandbox_job_mode(id) Result
        +save_job_event(job_id, event_type, data) Result
        +list_job_events(job_id, before_id, limit) Result
    }

    class NativeConversationStore {
        <<interface>>
    }

    class NativeRoutineStore {
        <<interface>>
    }

    class NativeToolFailureStore {
        <<interface>>
    }

    class NativeSettingsStore {
        <<interface>>
    }

    class NativeWorkspaceStore {
        <<interface>>
    }

    class ToolRegistry {
        +new() ToolRegistry
        +register(tool)
    }

    class Tool {
        <<interface>>
    }

    class StubLlm {
    }

    class SafetyLayer {
        +new(config) SafetyLayer
    }

    class SafetyConfig {
        +u64 max_output_length
        +bool injection_check_enabled
    }

    class HookRegistry {
        +new() HookRegistry
    }

    class ApprovalContext {
    }

    class HttpInterceptor {
    }

    class Uuid {
    }

    class Duration {
    }

    Worker --> WorkerDeps
    WorkerDeps --> ContextManager
    WorkerDeps --> StubLlm
    WorkerDeps --> SafetyLayer
    WorkerDeps --> ToolRegistry
    WorkerDeps --> Database
    WorkerDeps --> HookRegistry
    WorkerDeps --> Duration
    WorkerDeps --> ApprovalContext
    WorkerDeps --> HttpInterceptor

    ContextManager --> JobContext
    JobContext --> JobState

    CapturingStore ..|> Database
    CapturingStore ..|> NativeDatabase
    CapturingStore ..|> NativeJobStore
    CapturingStore ..|> NativeSandboxStore
    CapturingStore ..|> NativeConversationStore
    CapturingStore ..|> NativeRoutineStore
    CapturingStore ..|> NativeToolFailureStore
    CapturingStore ..|> NativeSettingsStore
    CapturingStore ..|> NativeWorkspaceStore

    CapturingStore --> CapturedCall
    CapturedCall --> JobState

    ToolRegistry --> Tool
    SafetyLayer --> SafetyConfig
    HookRegistry --> WorkerDeps

    class Mutex {
    }

    class Vec {
    }

    CapturingStore --> Mutex
    Mutex --> Vec

    class WorkspaceError {
    }

    CapturingStore --> WorkspaceError
    NativeJobStore --> Uuid
    NativeSandboxStore --> Uuid
    NativeDatabase --> WorkspaceError
    NativeWorkspaceStore --> WorkspaceError

File-Level Changes

Change Details Files
Refactor worker test helpers to share dependency construction and support pluggable stores.
  • Extract ToolRegistry creation into build_registry helper and share across tests.
  • Introduce base_deps helper to construct WorkerDeps with configurable ToolRegistry, optional Database store, and optional approval context.
  • Update existing make_worker, make_worker_with_db, and make_worker_with_approval_context helpers to use the shared helpers and Arc-wrapped ToolRegistry instances.
src/worker/job.rs
Introduce a capturing in-memory Database implementation and characterisation tests for terminal job-state persistence and event emission.
  • Add CapturedCall enum and CapturingStore struct that implements Database-related traits, recording update_job_status and save_job_event invocations.
  • Provide a make_worker_with_capturing_store helper wiring Worker to CapturingStore via base_deps.
  • Add async helpers to drive job to InProgress and to assert that terminal transitions persist expected JobState, failure reason, and result job events.
  • Add tests that exercise mark_completed, mark_failed, mark_stuck, and double-completion scenarios, verifying ContextManager state and captured DB/event calls.
src/worker/job.rs
Replace ignored SIGHUP reload scaffolding with concrete integration tests driving WebhookServer and HttpChannel reload behaviour.
  • Switch tests to construct WebhookServer instances directly, binding to ephemeral addresses and adding minimal health routes.
  • Add helper to create ephemeral addresses and a health_server constructor plus a post_webhook helper for sending webhook payloads.
  • Implement address-change test that verifies restart_with_addr moves the listener to a new address and the old address stops responding.
  • Implement zero-downtime secret update test by wiring HttpChannel routes into WebhookServer, hot-swapping secrets via shared_state, and asserting old secret rejection/new secret acceptance without restart.
  • Implement rollback-on-bind-failure test that forces restart_with_addr to fail via an occupied port and asserts original listener continues serving and current_addr is unchanged.
tests/infrastructure/sighup_reload.rs
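
The rollback-on-bind-failure behaviour described above can be illustrated with the standard library alone. This is a sketch, assuming a blocking `std::net::TcpListener` in place of the Tokio-based WebhookServer; the `Server` type and its methods are hypothetical, and only the "bind the new address first, keep the old listener if that fails" idea is taken from the PR.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};

/// Hypothetical stand-in for a server that owns one listening socket.
struct Server {
    listener: TcpListener,
}

impl Server {
    fn bind(addr: &str) -> io::Result<Self> {
        Ok(Self { listener: TcpListener::bind(addr)? })
    }

    fn current_addr(&self) -> io::Result<SocketAddr> {
        self.listener.local_addr()
    }

    /// Bind the replacement listener before giving up the current one.
    /// If the bind fails, `self.listener` is untouched and keeps serving.
    fn restart_with_addr(&mut self, addr: SocketAddr) -> io::Result<()> {
        let replacement = TcpListener::bind(addr)?;
        // Assigning drops (closes) the old listener only after the new one exists.
        self.listener = replacement;
        Ok(())
    }
}
```

A test in this style binds the server to an ephemeral port, occupies a second port with another listener, asks the server to move there, and then asserts that the call failed and `current_addr()` is unchanged.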
Add worker–orchestrator contract tests covering route paths, HTTP methods, auth conventions, JSON shapes, and finish-reason handling.
  • Introduce a StubLlm implementing NativeLlmProvider and a make_state helper to build an OrchestratorState with ContainerJobManager and ToolRegistry for testing OrchestratorApi::router.
  • Add tests ensuring worker job_scoped_path and worker_job_url utilities produce paths matching OrchestratorApi route constants.
  • Define a ROUTE_METHOD_TABLE and rstest-based tests that assert wrong HTTP methods yield 405 and that health vs worker routes obey expected auth behaviour.
  • Add helpers to exercise all authenticated routes and tests that 401 is returned for missing auth header, wrong bearer token, or valid token bound to a different job id.
  • Add JSON round-trip/fixture tests for StatusUpdate, JobEventPayload, CompletionReport, RemoteToolExecutionRequest, ProxyToolCompletionRequest, ProxyCompletionResponse, JobDescription, RemoteToolCatalogResponse, CredentialResponse, and PromptResponse to lock in payload shapes.
  • Add tests for ProxyFinishReason deserialization aliases, including mapping "tool_calls" to ToolUse and unknown strings to Unknown.
tests/worker_orchestrator_contract.rs
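
As an illustration of the finish-reason compatibility check, the alias mapping can be sketched without any dependencies. This is an assumption-laden stand-in: the real ProxyFinishReason is presumably deserialized through serde, and both the variant set shown here and the "stop" and "tool_use" spellings are hypothetical. Only the "tool_calls" to ToolUse mapping and the Unknown fallback come from the description above.

```rust
/// Simplified stand-in for the proxy finish reason (variant set is assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProxyFinishReason {
    Stop,
    ToolUse,
    Unknown,
}

impl From<&str> for ProxyFinishReason {
    fn from(raw: &str) -> Self {
        match raw {
            "stop" => ProxyFinishReason::Stop,
            // "tool_calls" is accepted as an alias for ToolUse.
            "tool_use" | "tool_calls" => ProxyFinishReason::ToolUse,
            // Unrecognised strings degrade to Unknown instead of failing.
            _ => ProxyFinishReason::Unknown,
        }
    }
}
```

Falling back to Unknown keeps a worker able to parse responses that carry a finish reason it does not recognise.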

Assessment against linked issues

Issue Objective Addressed Explanation
#16 Replace the existing SIGHUP integration scaffolding with real, executable tests that exercise the webhook hot-reload path end-to-end.
#16 Introduce worker–orchestrator contract tests that enforce parity for routes, HTTP methods, auth conventions, and JSON payloads between worker client APIs and orchestrator HTTP endpoints.
#16 Add characterisation tests for terminal job-state persistence in the worker (covering persist_status, log_event, and terminal state transitions in src/worker/job.rs).


codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Apr 7, 2026

@coderabbitai please create a new GitHub issue for the following:

src/worker/job.rs

Comment on file

    };
    use crate::safety::SafetyLayer;
    use crate::tools::{NativeTool, Tool, ToolError as ToolExecError, ToolOutput};
    use tokio::sync::Mutex;

❌ Getting worse: Low Cohesion
The number of different responsibilities increases from 8 to 74, threshold = 3

@leynos
Owner Author

leynos commented Apr 7, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response.

src/worker/job.rs

Comment on file

    };
    use crate::safety::SafetyLayer;
    use crate::tools::{NativeTool, Tool, ToolError as ToolExecError, ToolOutput};
    use tokio::sync::Mutex;

❌ Getting worse: Code Duplication
introduced similar code in: tests.get_document_by_id,tests.get_document_by_path,tests.get_or_create_document_by_path,tests.make_worker_with_capturing_store and 3 more functions

@leynos
Owner Author

leynos commented Apr 7, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response.

tests/worker_orchestrator_contract.rs

Comment on file

    async fn complete(
        &self,
        _req: CompletionRequest,
    ) -> Result<CompletionResponse, ironclaw::error::LlmError> {
        Ok(CompletionResponse {
            content: String::new(),
            input_tokens: 0,
            output_tokens: 0,
            finish_reason: FinishReason::Stop,
            cache_read_input_tokens: 0,
            cache_creation_input_tokens: 0,
        })
    }

❌ New issue: Code Duplication
The module contains 5 functions with similar structure: NativeLlmProvider.complete,NativeLlmProvider.complete_with_tools,no_auth_header_yields_unauthorized,valid_token_wrong_job_yields_unauthorized and 1 more functions

@coderabbitai
Contributor

coderabbitai Bot commented Apr 7, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh issue create \
  --repo leynos/axinite \
  --title "Refactor src/worker/job.rs to reduce Low Cohesion (responsibilities 74 → ≤3)" \
  --assignee leynos \
  --body "## Low Cohesion in \`src/worker/job.rs\`

**CodeScene** has flagged a significant worsening of the **Low Cohesion** biomarker in [\`src/worker/job.rs\`](https://github.com/leynos/axinite/blob/main/src/worker/job.rs):

> The number of different responsibilities increases from **8 to 74**, threshold = **3**

This was identified during PR \`#123\` (Add regression coverage for worker reload contracts).

### Why this matters

High numbers of distinct responsibilities in a single module indicate that the file is doing too much. This makes the code harder to understand, test, and maintain, and increases the risk of unintended coupling between concerns.

### Suggested next steps

- Audit the responsibilities now present in \`src/worker/job.rs\` and group related ones.
- Extract cohesive subsets into dedicated modules (e.g. separate tool invocation, safety checks, state management, and event emission).
- Review imports such as \`SafetyLayer\`, \`NativeTool\`/\`Tool\`/\`ToolError\`/\`ToolOutput\`, and \`tokio::sync::Mutex\` to determine whether they belong in a thinner coordination layer or in dedicated sub-modules.

### References

- PR: \`#123\`
- CodeScene delta report: https://codescene.io/projects/77987/delta?repo-id=1501128&review-id=123&biomarker=Low+Cohesion&filename=src%2Fworker%2Fjob.rs&method=

/cc \`@leynos\`"

Length of output: 396


@leynos: The issue has been created and assigned to you:

#124 — Refactor src/worker/job.rs to reduce Low Cohesion (responsibilities 74 → ≤3)

It documents the CodeScene finding, explains the impact, and outlines suggested next steps for decomposing the module into more cohesive sub-modules.

@github-actions github-actions Bot added the scope: worker Container worker label Apr 7, 2026
@coderabbitai

This comment was marked as resolved.

@coderabbitai

This comment was marked as resolved.

@github-actions github-actions Bot added size: XL 500+ changed lines risk: medium Business logic, config, or moderate-risk modules contributor: core 20+ merged PRs labels Apr 7, 2026
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Apr 7, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response. Ensure that this is validated against the current version of the codegraph.

If further refinement to address this finding would be deleterious, please supply a clear explanatory one to two paragraph markdown message I can paste into the CodeScene web ui's diagnostic suppression function so this diagnostic can be silenced.

src/worker/job.rs

Comment on file

    };
    use crate::safety::SafetyLayer;
    use crate::tools::{NativeTool, Tool, ToolError as ToolExecError, ToolOutput};
    use tokio::sync::Mutex;

❌ Getting worse: Code Duplication
introduced similar code in: tests.get_document_by_id,tests.get_document_by_path,tests.get_or_create_document_by_path,tests.test_double_completed_transition_rejected and 3 more functions

@coderabbitai

This comment was marked as resolved.

codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Apr 7, 2026

@coderabbitai please create a new GitHub issue for the following:

src/worker/job.rs

Comment on file

    };
    use crate::safety::SafetyLayer;
    use crate::tools::{NativeTool, Tool, ToolError as ToolExecError, ToolOutput};
    use tokio::sync::Mutex;

❌ Getting worse: Low Cohesion
The number of different responsibilities increases from 8 to 75, threshold = 3

@coderabbitai
Contributor

coderabbitai Bot commented Apr 7, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh issue create \
  --repo leynos/axinite \
  --title "Refactor src/worker/job.rs to reduce Low Cohesion (responsibilities 75 → ≤3)" \
  --assignee leynos \
  --body "## Low Cohesion in \`src/worker/job.rs\`

**CodeScene** has flagged a continued worsening of the **Low Cohesion** biomarker in [\`src/worker/job.rs\`](https://github.com/leynos/axinite/blob/main/src/worker/job.rs):

> The number of different responsibilities increases from **8 to 75**, threshold = **3**

This was identified during PR \`#123\` (Add regression coverage for worker reload contracts). A prior issue (\`#124\`) was raised when the count reached 74; it has since increased by a further responsibility.

### Why this matters

High numbers of distinct responsibilities in a single module indicate that the file is doing too much. This makes the code harder to understand, test, and maintain, and increases the risk of unintended coupling between concerns.

### Suggested next steps

- Audit the responsibilities now present in \`src/worker/job.rs\` and group related ones.
- Extract cohesive subsets into dedicated modules (e.g. separate tool invocation, safety checks, state management, and event emission).
- Review imports such as \`SafetyLayer\`, \`NativeTool\`/\`Tool\`/\`ToolError\`/\`ToolOutput\`, and \`tokio::sync::Mutex\` to determine whether they belong in a thinner coordination layer or in dedicated sub-modules.
- Address the parallel Code Duplication findings raised in the same PR (already documented with fix suggestions in the PR comment thread).

### References

- PR: \`#123\`
- Related issue: \`#124\` (Low Cohesion at 74 responsibilities)
- CodeScene delta report: https://codescene.io/projects/77987/delta?repo-id=1501128&review-id=123&biomarker=Low+Cohesion&filename=src%2Fworker%2Fjob.rs&method=

/cc \`@leynos\`"

Length of output: 44


@leynos: The issue has been created and assigned to you:

#127 — Refactor src/worker/job.rs to reduce Low Cohesion (responsibilities 75 → ≤3)

It documents the updated CodeScene finding (responsibilities now at 75, up from 74 in #124), cross-references the earlier issue and PR #123, and outlines the same suggested decomposition steps.

@leynos leynos marked this pull request as ready for review April 7, 2026 23:00
sourcery-ai[bot]

This comment was marked as resolved.

@coderabbitai coderabbitai Bot added the Roadmap label Apr 7, 2026
chatgpt-codex-connector[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

@github-actions github-actions Bot added the scope: dependencies Dependency updates label Apr 8, 2026
codescene-delta-analysis[bot]

This comment was marked as outdated.

@coderabbitai coderabbitai Bot removed the Roadmap label Apr 8, 2026
Split call_llm into dedicated guardrail, retry, and cost-recording helpers so the main delegate hook stays below the target size and each phase is easier to follow. This preserves the existing call flow and retry semantics while narrowing call_llm to the orchestration steps.
codescene-delta-analysis[bot]

This comment was marked as outdated.

leynos added 2 commits April 15, 2026 12:49
Factor the shared terminal transition, atomic persistence, and rollback-on-error flow into a private Worker helper and delegate the three terminal state methods to it. This preserves the existing payload strings and rollback behaviour while removing duplicated terminal update logic.
Replace the semicolon before “and” in the listener-based lifecycle bullet list with a comma, as requested. No other documentation content changed.
codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Apr 15, 2026

@coderabbitai Have the following now been resolved?

Please address the following concerns and ensure all commit gates succeed:

⚠️ Outside diff range comments (1)

src/worker/job.rs (1)

1001-1051: 🛠️ Refactor suggestion | 🟠 Major
Extract one terminal-transition helper.
Collapse the shared snapshot → transition → result-event persist →
status-persist → rollback flow into one helper parameterized by target state and
payload. Keep this logic in one place so the rollback fix and persistence
contract cannot drift across three copies.
As per coding guidelines, "Adhere to the Single Responsibility Principle:
ensure each function and method has one clear, well-defined responsibility;
decompose functions that handle multiple distinct logical blocks immediately"
and "Apply Extract Method refactoring to identify and extract logical chunks
from complex functions; encapsulate each chunk into its own well-named method
or function."
Also applies to: 1089-1178

🤖 Prompt for AI Agents

@coderabbitai
Contributor

coderabbitai Bot commented Apr 15, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Ensure we are on the PR branch already (runner clones PR branch by default).
echo "== Search for an extracted helper =="
rg -nP --context 2 'async\s+fn\s+apply_terminal_transition' src/worker/job.rs || true

echo
echo "== Show terminal methods =="
rg -nP --context 4 'async\s+fn\s+mark_(completed|failed|stuck)\b' src/worker/job.rs || true

echo
echo "== Grep for transition_terminal_state and persist_terminal_result_and_status usage near terminal methods =="
rg -nP -C2 'transition_terminal_state\(|persist_terminal_result_and_status\(' src/worker/job.rs || true

Length of output: 2708


@leynos Yes — resolved.

Evidence from the current branch:

  • Helper extracted: apply_terminal_transition is defined (Line 1096).
  • Delegation in terminal methods:
    • mark_completed calls self.apply_terminal_transition(..., "completed", ...). (Lines 1050–1056)
    • mark_failed calls self.apply_terminal_transition(..., "failed", ...). (Lines 1133–1137)
    • mark_stuck calls self.apply_terminal_transition(..., "stuck", ...). (Lines 1154–1158)
  • The helper orchestrates the full flow:
    • Captures previous via transition_terminal_state(...) (Line 1108),
    • Builds the “result” event JSON (Lines 1109–1113),
    • Persists atomically via persist_terminal_result_and_status(...) (Lines 1115–1117),
    • Rolls back on error (immediately following those lines).

This collapses the shared snapshot → transition → atomic persist → rollback pattern into a single, well-named function and removes the drift risk across the three mark_* methods. No further changes needed for this item.
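
The snapshot → transition → atomic persist → rollback flow described above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the code in src/worker/job.rs: `JobState`, `TransitionError`, `JobStore`, `MemoryStore`, the `u64` job id, and the plain-string event are all assumptions made for the example. The real helper is async and builds a JSON event.

```rust
/// Hypothetical job states; the real enum in src/worker/job.rs may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobState {
    Running,
    Completed,
    Failed,
    Stuck,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum TransitionError {
    AlreadyTerminal(JobState),
    Persist(String),
}

/// Hypothetical stand-in for the job store; one call writes event and status together.
pub trait JobStore {
    fn persist_terminal_result_and_status(
        &mut self,
        job_id: u64,
        status: &str,
        event: &str,
    ) -> Result<(), String>;
}

/// In-memory store that can be told to fail, to exercise the rollback path.
#[derive(Default)]
pub struct MemoryStore {
    pub fail: bool,
    pub rows: Vec<(u64, String, String)>,
}

impl JobStore for MemoryStore {
    fn persist_terminal_result_and_status(
        &mut self,
        job_id: u64,
        status: &str,
        event: &str,
    ) -> Result<(), String> {
        if self.fail {
            return Err("store unavailable".to_owned());
        }
        self.rows.push((job_id, status.to_owned(), event.to_owned()));
        Ok(())
    }
}

pub struct Job {
    pub id: u64,
    pub state: JobState,
}

impl Job {
    /// Moves to `next` and returns the previous state; rejects a second terminal transition.
    fn transition_terminal_state(&mut self, next: JobState) -> Result<JobState, TransitionError> {
        if self.state != JobState::Running {
            return Err(TransitionError::AlreadyTerminal(self.state));
        }
        Ok(std::mem::replace(&mut self.state, next))
    }

    /// Single home for the shared flow: transition, build event, persist, roll back on error.
    fn apply_terminal_transition(
        &mut self,
        store: &mut dyn JobStore,
        next: JobState,
        status: &str,
        message: &str,
    ) -> Result<(), TransitionError> {
        let previous = self.transition_terminal_state(next)?;
        // Plain-string stand-in for the "result" event JSON.
        let event = format!("{status}: {message}");
        if let Err(err) = store.persist_terminal_result_and_status(self.id, status, &event) {
            self.state = previous; // restore the in-memory state if persistence failed
            return Err(TransitionError::Persist(err));
        }
        Ok(())
    }

    pub fn mark_completed(&mut self, store: &mut dyn JobStore, message: &str) -> Result<(), TransitionError> {
        self.apply_terminal_transition(store, JobState::Completed, "completed", message)
    }

    pub fn mark_failed(&mut self, store: &mut dyn JobStore, message: &str) -> Result<(), TransitionError> {
        self.apply_terminal_transition(store, JobState::Failed, "failed", message)
    }

    pub fn mark_stuck(&mut self, store: &mut dyn JobStore, message: &str) -> Result<(), TransitionError> {
        self.apply_terminal_transition(store, JobState::Stuck, "stuck", message)
    }
}
```

With this shape, each `mark_*` method is a one-line delegation, so a change to the rollback or persistence contract only has to be made in `apply_terminal_transition`.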

Address the remaining verified review findings across dispatcher, session, persistence, context, null-db, test support, and docs.

This batch adds compaction property coverage, restores safety and error propagation in tool and session paths, consolidates duplicated rollback tests, documents atomic terminal persistence, and checks in the new e2e/proptest artefacts needed to keep the revised tests stable.
codescene-delta-analysis[bot]

This comment was marked as outdated.

Align the public ToolCallIndexError message with the requested contract and add regression tests for out-of-bounds indexed tool result and error writes.

The indexed Turn helpers and production call-site handling were already in place, so this commit closes the remaining API and test gaps without changing runtime behaviour.

Update the module-level test documentation to use the requested JSON-aware wording while keeping the file layout unchanged.

Align the Postgres rollback regression with the requested two-case rstest shape and explicit invalid-job scenarios while preserving the existing rollback assertions and cleanup flow.

Change the null database UUID/cache helpers to return DatabaseError instead of silently recovering from poisoned locks, and thread the new Result through the null-db conversation, job, and workspace helpers.

The requested null-db library test remains blocked by unrelated pre-existing compile failures in src/testing/mod.rs.
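
As a rough illustration of the change this commit describes, a cache helper can turn a poisoned lock into an error instead of recovering the guard. The sketch below is an assumption-laden stand-in, not the project's code: the `DatabaseError::LockPoisoned` variant, the `u64` ids (in place of UUIDs), and this signature for `get_or_create_in_cache` are all hypothetical.

```rust
use std::collections::HashMap;
use std::fmt;
use std::sync::Mutex;

/// Hypothetical error type; the real DatabaseError has its own variants.
#[derive(Debug, PartialEq, Eq)]
pub enum DatabaseError {
    LockPoisoned(String),
}

impl fmt::Display for DatabaseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DatabaseError::LockPoisoned(msg) => write!(f, "cache lock poisoned: {msg}"),
        }
    }
}

impl std::error::Error for DatabaseError {}

/// Returns the cached id for `key`, creating it with `make_id` on first use.
/// A poisoned lock is reported to the caller rather than silently recovered.
pub fn get_or_create_in_cache(
    cache: &Mutex<HashMap<String, u64>>,
    key: &str,
    make_id: impl FnOnce() -> u64,
) -> Result<u64, DatabaseError> {
    let mut guard = cache
        .lock()
        .map_err(|poisoned| DatabaseError::LockPoisoned(poisoned.to_string()))?;
    Ok(*guard.entry(key.to_owned()).or_insert_with(make_id))
}
```

Callers then propagate the failure with `?`, which is what threading the new Result through the conversation, job, and workspace helpers amounts to.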

@leynos
Owner Author

leynos commented Apr 15, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response. Ensure that this is validated against the current version of the codegraph.

If further refinement to address this finding would be deleterious, please supply a clear explanatory one to two paragraph markdown message I can paste into the CodeScene web ui's diagnostic suppression function so this diagnostic can be silenced.

Code Duplication

src/agent/session.rs:

What led to degradation?

The module contains 4 functions with similar structure: Thread.new, Thread.with_id, Turn.record_tool_error_at, Turn.record_tool_result_at

Why does this problem occur?

Duplicated code often leads to code that's harder to change since the same logical change has to be done in multiple functions. More duplication gives lower code health.

How to fix it?

A certain degree of duplicated code might be acceptable. The problems start when it is the same behavior that is duplicated across the functions in the module, i.e. a violation of the Don't Repeat Yourself (DRY) principle. DRY violations lead to code that is changed together in predictable patterns, which is both expensive and risky. DRY violations can be identified using CodeScene's X-Ray analysis to detect clusters of change-coupled functions with high code similarity.
Once you have identified the similarities across functions, look to extract and encapsulate the concept that varies into its own function(s). These shared abstractions can then be re-used, which minimizes the amount of duplication and simplifies change.

Remove stale test-only imports and gate harness tests behind the same libsql plus test-helpers feature combination that provides the TestHarnessBuilder::build path.

This resolves the unused-import lints and the lib-test type errors in src/testing while keeping the existing test logic unchanged.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 15, 2026

@leynos

File: src/agent/session.rs
Symbols: Thread::new, Thread::with_id, Turn::record_tool_result_at, Turn::record_tool_error_at

Proposed fix

  • Remove the duplication by introducing two private helpers and delegating the public functions to them. This keeps the public API and semantics unchanged and centralises the mirrored logic.
  1. Deduplicate Thread constructors
    Add one private constructor that does the full struct initialisation given a session_id and a specific thread_id; call it from both Thread::new (generates a fresh UUID) and Thread::with_id (uses the provided UUID).

Rust (shape-only)

impl Thread {
    // Private: single place for initialising a Thread
    fn init(session_id: Uuid, thread_id: Uuid) -> Self {
        Thread {
            id: thread_id,
            session_id,
            // ... all the other default fields exactly as used today ...
            // e.g., turns: Vec::new(), state: ThreadState::Idle, updated_at: Utc::now(), etc.
        }
    }

    pub fn new(session_id: Uuid) -> Self {
        let thread_id = Uuid::new_v4();
        Self::init(session_id, thread_id)
    }

    pub fn with_id(thread_id: Uuid, session_id: Uuid) -> Self {
        Self::init(session_id, thread_id)
    }
}
  2. Deduplicate Turn outcome writers
    Add one private helper that sets a result or an error at a given index (clearing the other), and have both record_tool_result_at and record_tool_error_at call it. This preserves the mutability model and indexing behaviour of your current code.

Rust (shape-only)

impl Turn {
    // Private: updates a single tool_call outcome at `idx`
    fn set_tool_outcome_at(
        &mut self,
        idx: usize,
        result: Option<serde_json::Value>,
        error: Option<String>,
    ) {
        if let Some(tc) = self.tool_calls.get_mut(idx) {
            tc.result = result;
            tc.error  = error;
        } else {
            // Preserve today’s behaviour: if you currently panic/expect on OOB,
            // replicate it here instead of silently ignoring.
            // e.g., panic!("tool call index {} out of bounds", idx);
        }
    }

    pub fn record_tool_result_at(&mut self, idx: usize, result: serde_json::Value) {
        self.set_tool_outcome_at(idx, Some(result), None);
    }

    pub fn record_tool_error_at(&mut self, idx: usize, error: impl Into<String>) {
        self.set_tool_outcome_at(idx, None, Some(error.into()));
    }
}

Why this resolves the finding

  • The two Thread constructors now share a single initialisation path, removing duplicated field blocks.
  • The two Turn writers now share one path that enforces the “result xor error” invariant, reducing risk of drift.
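
If the public writers return a Result rather than panicking on an out-of-bounds index, the shared helper can carry that too. The following is a minimal sketch under assumptions: the fields on `ToolCallIndexError`, the `ToolOutcome` enum, and the use of `String` in place of `serde_json::Value` are hypothetical and chosen only to keep the example dependency-free.

```rust
/// Hypothetical shape for the error; the real ToolCallIndexError may differ.
#[derive(Debug, PartialEq, Eq)]
pub struct ToolCallIndexError {
    pub index: usize,
    pub len: usize,
}

/// Simplified tool call; `String` stands in for `serde_json::Value`.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ToolCall {
    pub result: Option<String>,
    pub error: Option<String>,
}

#[derive(Debug, Default)]
pub struct Turn {
    pub tool_calls: Vec<ToolCall>,
}

/// Hypothetical private enum: a call ends with exactly one of these.
enum ToolOutcome {
    Success(String),
    Failure(String),
}

impl Turn {
    /// Private: writes one outcome at `idx`, clearing the other field,
    /// and reports an out-of-bounds index as an error.
    fn set_tool_outcome_at(
        &mut self,
        idx: usize,
        outcome: ToolOutcome,
    ) -> Result<(), ToolCallIndexError> {
        let len = self.tool_calls.len();
        let call = self
            .tool_calls
            .get_mut(idx)
            .ok_or(ToolCallIndexError { index: idx, len })?;
        match outcome {
            ToolOutcome::Success(value) => {
                call.result = Some(value);
                call.error = None;
            }
            ToolOutcome::Failure(message) => {
                call.result = None;
                call.error = Some(message);
            }
        }
        Ok(())
    }

    pub fn record_tool_result_at(
        &mut self,
        idx: usize,
        result: String,
    ) -> Result<(), ToolCallIndexError> {
        self.set_tool_outcome_at(idx, ToolOutcome::Success(result))
    }

    pub fn record_tool_error_at(
        &mut self,
        idx: usize,
        error: impl Into<String>,
    ) -> Result<(), ToolCallIndexError> {
        self.set_tool_outcome_at(idx, ToolOutcome::Failure(error.into()))
    }
}
```

Passing an enum instead of two `Option` arguments makes the "result xor error" invariant unrepresentable in its broken form, at the cost of one extra private type.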

AI coding‑agent prompt

You are modifying a Rust crate.

Goal
Eliminate the Code Duplication flagged in src/agent/session.rs for:
- Thread::new and Thread::with_id
- Turn::record_tool_result_at and Turn::record_tool_error_at

Edits
1) In src/agent/session.rs, inside impl Thread:
   - Add a private fn init(session_id: Uuid, thread_id: Uuid) -> Self that builds a Thread with exactly the same field initialisation the constructors currently use.
   - Rewrite:
     pub fn new(session_id: Uuid) -> Self {
       let thread_id = Uuid::new_v4();
       Self::init(session_id, thread_id)
     }
     pub fn with_id(thread_id: Uuid, session_id: Uuid) -> Self {
       Self::init(session_id, thread_id)
     }

2) In the same file, inside impl Turn:
   - Add a private fn set_tool_outcome_at(&mut self, idx: usize, result: Option<serde_json::Value>, error: Option<String>).
     Body: mutate self.tool_calls[idx], set tc.result = result; tc.error = error. Mirror today’s OOB behaviour (panic/expect or ignore) so semantics remain unchanged.
   - Rewrite:
     pub fn record_tool_result_at(&mut self, idx: usize, result: serde_json::Value) {
       self.set_tool_outcome_at(idx, Some(result), None);
     }
     pub fn record_tool_error_at(&mut self, idx: usize, error: impl Into<String>) {
       self.set_tool_outcome_at(idx, None, Some(error.into()));
     }

Constraints
- Do not change public function signatures or error behaviour.
- Keep all field defaults and invariant enforcement exactly as they are today.
- If existing code panics on out-of-bounds idx, retain that panic message in set_tool_outcome_at for compatibility.
- Run cargo check and the session-related tests. No test changes should be required.

Validation checklist
- rg -n "fn init\\(|set_tool_outcome_at" shows both helpers present.
- cargo check passes.
- Existing tests that use record_tool_result_at/record_tool_error_at still pass.
- No functional diffs in Thread construction logic (same values and timestamps set).

If further refinement is deleterious
The two pairs flagged here are “constructor mirroring” and “dual writer” patterns, which naturally share structure. We have limited the refactor to private helpers that keep public APIs, semantics, and panics unchanged; this is safe and reduces drift. Pushing further (e.g., replacing the public result/error writers with a single Result<Value, String>-based API) would be broader than the diagnostic requires and risks unnecessary churn across call sites. Therefore, stop at the private helpers to satisfy DRY without altering external contracts.

leynos added 2 commits April 15, 2026 14:52
Remove the explicit lifetime from get_or_create_in_cache to satisfy clippy::needless_lifetimes without changing the helper's behaviour or call sites.
Apply rustfmt to the rollback test after the earlier rstest refactor so CI's formatting check matches the branch.

Remove the duplicated Thread construction and indexed tool-outcome mutation logic in src/agent/session.rs while preserving the current Result-based public API and behaviour.

The refactor is covered by the existing session test suite, which now passes unchanged.

@codescene-delta-analysis codescene-delta-analysis Bot left a comment

Code Health Improved (3 files improve in Code Health)

Gates Failed
Enforce critical code health rules (1 file with Low Cohesion)

Gates Passed
4 Quality Gates Passed

See analysis details in CodeScene

Reason for failure

| Enforce critical code health rules | Violations | Code Health Impact |
| --- | --- | --- |
| mod.rs | 1 critical rule | 5.06 → 6.37 |

View Improvements

| File | Code Health Impact | Categories Improved |
| --- | --- | --- |
| job.rs | 5.57 → 5.91 | Low Cohesion, Code Duplication |
| thread_ops.rs | 5.72 → 9.69 | Low Cohesion, Code Duplication, Complex Method, Overall Code Complexity, Excess Number of Function Arguments |
| mod.rs | 5.06 → 6.37 | Complex Method, Bumpy Road Ahead, Deep, Nested Complexity, Excess Number of Function Arguments |

Absence of Expected Change Pattern

  • axinite/src/db/mod.rs is usually changed with: axinite/src/db/postgres.rs

Quality Gate Profile: Pay Down Tech Debt

@leynos leynos merged commit 6322896 into main Apr 15, 2026
28 of 29 checks passed

Labels

contributor: core 20+ merged PRs Issue risk: medium Business logic, config, or moderate-risk modules scope: agent Agent core (agent loop, router, scheduler) scope: channel/web Web gateway channel scope: db Database trait / abstraction scope: dependencies Dependency updates scope: docs Documentation scope: llm LLM integration scope: safety Prompt injection defense scope: worker Container worker size: XL 500+ changed lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Missing tests: SIGHUP scaffolding, worker-orchestrator contract parity, and terminal job-state characterisation

1 participant