
refactor: port finding_verifier.py to Claude Agent SDK native tools #37

Merged
joshbouncesecurity merged 1 commit into master from refactor/verifier-sdk-native
Apr 19, 2026


@joshbouncesecurity
Owner

Summary

Ports utilities/finding_verifier.py from the anthropic SDK to the Claude Agent SDK. This is Step 3 of the SDK migration tracked in issue #35.

The manual tool-dispatch loop (search_usages, search_definitions, read_function, list_functions, finish) is replaced with a single call to utilities.llm_client.run_native_verification, which runs a multi-turn SDK session with native Read/Grep/Glob/Bash tools and returns structured JSON via VERIFICATION_JSON_SCHEMA. Rate-limit handling is now centralised in utilities.llm_client._run_query (via sdk_errors.RateLimitError and GlobalRateLimiter), so the old except anthropic.RateLimitError blocks are gone.
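
To illustrate the centralised rate-limit path described above, here is a minimal sketch. It is not the code in utilities.llm_client: `TransportError`, `run_query`, `notify`, and `seconds_remaining` are hypothetical names, and the `RateLimitError` and `GlobalRateLimiter` classes are stand-ins whose real shape may differ. The idea shown is only that one wrapper notifies the shared limiter and then raises a single typed exception, so callers no longer need their own except blocks.

```python
import threading
import time


class TransportError(Exception):
    """Hypothetical low-level failure carrying an HTTP-like status code."""

    def __init__(self, status, retry_after=None):
        super().__init__(f"transport failed with status {status}")
        self.status = status
        self.retry_after = retry_after


class RateLimitError(Exception):
    """Stand-in for sdk_errors.RateLimitError; the real class may differ."""


class GlobalRateLimiter:
    """Stand-in shared limiter that remembers a back-off deadline."""

    def __init__(self):
        self._lock = threading.Lock()
        self._blocked_until = 0.0

    def notify(self, retry_after):
        # Extend (never shorten) the shared back-off window.
        with self._lock:
            deadline = time.monotonic() + retry_after
            self._blocked_until = max(self._blocked_until, deadline)

    def seconds_remaining(self):
        with self._lock:
            return max(0.0, self._blocked_until - time.monotonic())


LIMITER = GlobalRateLimiter()


def run_query(send, default_retry_after=30.0):
    """Hypothetical central wrapper: translate 429s into one typed error."""
    try:
        return send()
    except TransportError as exc:
        if exc.status != 429:
            raise  # not a rate limit, leave it untouched
        wait = exc.retry_after if exc.retry_after is not None else default_retry_after
        LIMITER.notify(wait)  # notify the shared limiter before raising
        raise RateLimitError(f"rate limited, retry in {wait:.0f}s") from exc
```

With a wrapper of this kind, a caller such as the verifier only has to let `RateLimitError` propagate.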

This reapplies PR #25's Step 3 work, adapted for the current post-merge state (upstream PR #23 re-introduced the anthropic path and added logger/app_context/verbose/checkpoint/workers APIs that must be preserved).

Key changes

libs/openant-core/utilities/finding_verifier.py

  • Drop client: anthropic.Anthropic | None constructor param and the whole while iterations < MAX_ITERATIONS: self.client.messages.create(tools=VERIFICATION_TOOLS, ...) loop.
  • Drop imports of anthropic, ToolExecutor, and the VERIFICATION_TOOLS schema.
  • verify_result now calls run_native_verification with the SDK-aware get_native_claude_verification_prompt and VERIFICATION_JSON_SCHEMA. Process-level SDK failures return a conservative "agree" verdict; rate-limit errors surface as sdk_errors.RateLimitError and are handled by the centralised wrapper.
  • Consistency cross-check (_resolve_inconsistency) uses a dedicated AnthropicClient single-turn call instead of raw client.messages.create.
  • Add output_dir constructor param + _save_explanation helper (already expected by core/verifier.py).
  • VERIFIER_MODEL now pulls from utilities.model_config.MODEL_PRIMARY.
  • Preserve the APIs from upstream PR #23 (feat: auto-detect dependency changes and reinstall openant): verify_batch(..., workers=10, checkpoint=None, restored_callback=None), _verify_batch_sequential/_parallel, _verify_one, _check_consistency, _has_conclusive_exploit_path, _group_by_pattern. The verify_result public signature and VerificationResult return type are unchanged.
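
The failure handling in the verify_result bullet above can be sketched as follows. This is an illustration under stated assumptions, not the merged code: `SDKProcessError` is a hypothetical exception name, the two-field `VerificationResult` is a simplified stand-in for the real return type, and `run_verification` stands in for run_native_verification.

```python
from dataclasses import dataclass


class RateLimitError(Exception):
    """Stand-in for sdk_errors.RateLimitError."""


class SDKProcessError(Exception):
    """Hypothetical name for a process-level SDK failure."""


@dataclass
class VerificationResult:
    """Simplified stand-in; the real type has its own fields."""

    verdict: str
    explanation: str


def verify_result(finding, run_verification):
    """Delegate to the SDK session and fall back conservatively on failure."""
    try:
        payload = run_verification(finding)
    except RateLimitError:
        # Rate limits are not swallowed here; they surface to the caller.
        raise
    except SDKProcessError as exc:
        # Conservative default: keep the original finding rather than drop it.
        return VerificationResult("agree", f"verification unavailable: {exc}")
    return VerificationResult(payload["verdict"], payload.get("explanation", ""))
```

Defaulting to "agree" means an infrastructure failure never silently dismisses a finding.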

libs/openant-core/tests/test_local_claude.py

  • Constructor change (client= is gone) means the old TestVerifyWithNativeClaude tests — which mocked mock_client.messages.create — cannot work as written. Rewrote the class to mock utilities.finding_verifier.run_native_verification instead. 6 new tests cover: structured JSON, JSON-in-code-block, free-text verdict fallback, unparseable text, SDK subprocess failure, and missing repo_path.
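
As a rough illustration of the response shapes those tests cover (structured JSON, JSON in a code block, free-text verdict, unparseable text), a parser could look like the sketch below. The `verdict` key, the regular expressions, and the `parse_verdict` name are assumptions for illustration; the PR does not show how the real parsing is implemented.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example can sit inside a fence
_CODE_BLOCK = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)
_FREE_TEXT = re.compile(r"\bverdict\s*[:=]\s*(agree|disagree)\b", re.IGNORECASE)


def parse_verdict(text):
    """Return 'agree', 'disagree', or None when nothing can be recovered."""
    # 1. Try the whole response as JSON, then any fenced JSON object.
    candidates = [text]
    block = _CODE_BLOCK.search(text)
    if block:
        candidates.append(block.group(1))
    for candidate in candidates:
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict) and data.get("verdict") in ("agree", "disagree"):
            return data["verdict"]
    # 2. Fall back to a "verdict: ..." phrase in free text.
    phrase = _FREE_TEXT.search(text)
    return phrase.group(1).lower() if phrase else None
```

Returning `None` for unparseable text leaves the choice of default verdict to the caller.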

Files not touched

Per the task brief: pyproject.toml, llm_client.py, agentic_enhancer/agent.py, context_enhancer.py are all untouched.

Test plan

  • python -c "from utilities.finding_verifier import FindingVerifier" imports cleanly.
  • pytest tests/test_local_claude.py — 31/31 passing.
  • pytest tests/test_resume_stage2.py — 15/15 passing (covers verify_batch checkpoint/resume path, consistency cross-check still fires).
  • pytest tests/test_declared_dependencies.py — 8/8 passing (no undeclared imports; anthropic still declared in pyproject.toml and is still imported by other files — that's Steps 4-7's job).
  • grep -n anthropic libs/openant-core/utilities/finding_verifier.py — only the historical comment at line 8 remains; no import.
  • End-to-end smoke: openant verify on a real finding. Deferred to Step 8 of the migration plan, covered by the issue-35 E2E checklist.
  • Live rate-limit: verified centrally in _run_query by PR #33 (feat: surface SDK API errors as typed sdk_errors exceptions); not re-tested here.

Notes

  • The client constructor keyword is removed. Any external caller that still passes FindingVerifier(client=...) will hit TypeError. All in-tree callers (experiment.py:695, core/verifier.py:138, tests/test_resume_stage2.py) already construct without a client kwarg, so this only affected the TestVerifyWithNativeClaude test class (updated in this PR).
  • The batch API from PR #25 (feat: migrate all LLM calls to Claude Agent SDK), i.e. checkpoint_path, concurrency, run_parallel, is NOT restored here, because core/verifier.py depends on the upstream workers/checkpoint/restored_callback surface. The upstream ThreadPoolExecutor-based batch code is kept verbatim.
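
For external callers, the effect of the removed keyword can be reproduced with a toy class. The `FindingVerifier` below is a stand-in with a single parameter, not the real constructor.

```python
class FindingVerifier:
    """Toy stand-in; the real constructor takes more parameters."""

    def __init__(self, output_dir=None):
        self.output_dir = output_dir


try:
    # Passing the removed keyword fails at construction time.
    FindingVerifier(client=object())
except TypeError as exc:
    error = str(exc)
else:
    error = None

print(error)
```

The fix for such callers is to drop the `client=` argument.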

Relates to #35.

Replace the manual anthropic-SDK tool-dispatch loop in FindingVerifier
with a single call to utilities.llm_client.run_native_verification.
The SDK drives a multi-turn session with native Read/Grep/Glob/Bash
tools, uses VERIFICATION_JSON_SCHEMA for structured output, and routes
rate-limit handling through the centralised _run_query wrapper
(sdk_errors.RateLimitError + GlobalRateLimiter notification).

Changes:
- Drop `client` constructor param and the whole while-MAX_ITERATIONS
  self.client.messages.create(tools=VERIFICATION_TOOLS, ...) loop.
- Drop imports of `anthropic`, `ToolExecutor`, and VERIFICATION_TOOLS
  definition (now unused).
- Add `output_dir` constructor param + `_save_explanation` helper
  (already expected by core/verifier.py, previously a no-op).
- Consistency cross-check uses a dedicated single-turn AnthropicClient
  (via utilities.llm_client.AnthropicClient) instead of raw
  client.messages.create.
- VERIFIER_MODEL now comes from utilities.model_config.MODEL_PRIMARY.
- Preserve upstream PR #23 verify_batch API (workers, checkpoint,
  restored_callback) — no change to the verify_result public signature
  or VerificationResult return shape.
- Rate-limit exceptions no longer caught locally: centralised in
  utilities.llm_client._run_query.

Tests:
- Rewrite tests/test_local_claude.py::TestVerifyWithNativeClaude to
  mock utilities.finding_verifier.run_native_verification instead of
  the old anthropic-client messages.create path. 6 new tests cover
  structured JSON, code-block JSON, free-text verdict fallback,
  unparseable text, SDK subprocess failure, and missing repo_path.
- test_resume_stage2.py and test_declared_dependencies.py unchanged
  and passing.

Part of issue #35 / SDK_MIGRATION_COMPLETION_PLAN Step 3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joshbouncesecurity joshbouncesecurity merged commit 09073d7 into master Apr 19, 2026
7 checks passed
@joshbouncesecurity joshbouncesecurity deleted the refactor/verifier-sdk-native branch April 19, 2026 11:01
joshbouncesecurity added a commit that referenced this pull request Apr 19, 2026
Final step of the SDK migration tracked in issue #35. Removes
"anthropic>=0.40.0" from pyproject.toml now that no Python code under
libs/openant-core/ still imports it.

Cleanup alongside the dep drop:

- `utilities/context_enhancer.py`: remove the now-orphaned `import anthropic`.
  PR #34 took it out of `_build_error_info`; PR #36 removed `shared_client`.
  The import line was kept alive across both as a staging measure.

- `openant/cli.py` (cmd_report_data): replace the last `anthropic.Anthropic()`
  instantiation — used for the HTML report's remediation-guidance LLM call —
  with `AnthropicClient(model=MODEL_AUXILIARY).analyze_sync(...)`. Usage
  tracking is now automatic via the global TokenTracker; cost display pulls
  from `client.get_last_call()`. Neither PR #36 nor #37 touched this site
  because it was outside their scope; the dep-drift test (PR #30) surfaced
  it when pyproject.toml's dependency list shrank.

- `utilities/rate_limiter.py`: update the module docstring's example. The
  pre-migration example showed `except anthropic.RateLimitError as e:
  retry_after = e.response.headers.get(...)`. That code path no longer
  exists — rate-limit detection is centralised in `llm_client._run_query`,
  which raises `utilities.sdk_errors.RateLimitError` after notifying the
  global limiter. Example updated to match.

Verification:
- `grep -rn '^import anthropic\|^from anthropic' libs/openant-core/` returns
  zero hits.
- `grep -rn 'anthropic\.' libs/openant-core/` returns only a historical
  docstring reference in `sdk_errors.py`.
- `tests/test_declared_dependencies.py` passes — the regression guard from
  PR #30 now enforces that no undeclared imports exist with anthropic gone.
- `tests/test_sdk_errors.py` (12) + `tests/test_sdk_error_surfacing.py` (9)
  all pass.
- `import openant, core, utilities, parsers, prompts, context, report` all
  succeed.
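
The grep checks above match text line by line. A syntax-aware variant of the same check can be sketched with the standard `ast` module, which ignores comments and docstrings. Whether `tests/test_declared_dependencies.py` works this way is not stated here, and `imported_top_level_modules` is a hypothetical helper name.

```python
import ast


def imported_top_level_modules(source):
    """Return the set of top-level module names imported by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import x"
            names.add(node.module.split(".")[0])
    return names
```

Running this over every file in a package and checking that `"anthropic"` is absent from the union gives the same guarantee as the first grep.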

End state: zero `anthropic` Python dep, all LLM traffic routes through the
Claude Agent SDK via `utilities.llm_client`. Step 8 (end-to-end verification
with a live API key) is the only remaining non-user-action item in the plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>