Skip to content

feat: classify plugin SDK impact labels#268

Open
jalehman wants to merge 2 commits into
mainfrom
feat/plugin-sdk-impact-labels
Open

feat: classify plugin SDK impact labels#268
jalehman wants to merge 2 commits into
mainfrom
feat/plugin-sdk-impact-labels

Conversation

@jalehman

@jalehman jalehman commented Jun 8, 2026

Copy link
Copy Markdown

Summary

  • Adds OpenClaw Plugin SDK impact labels and deterministic classification.
  • Syncs one plugin-sdk:* label from current complete reports, preserving higher maintainer/ClawSweeper labels.
  • Publishes a Plugin SDK impact gate Check Run on the target PR head with pass/fail status and maintainer-facing instructions.
  • Emits exact-head/base/path-hash report metadata for the OpenClaw CI gate.

Linked context

Required GitHub App permissions

This PR uses the existing ClawSweeper app and target write token. The target repository installation needs these additional permissions for the new check-run gate:

  • Checks: write so ClawSweeper can create or update the target PR head check run.
  • Members: read so ClawSweeper can verify that an approving reviewer is an active openclaw/maintainer member.

The existing target token permissions for Contents: write, Issues: write, and Pull requests: write remain required for the apply lane’s existing report, comment, and label mutations.

Verification

  • pnpm run format
  • pnpm run build:all
  • node --test --test-name-pattern "Plugin SDK impact|sweep target write tokens|apply-decisions publishes Plugin SDK|label sync loses authentication" test/*.test.ts test/repair/*.test.ts dist/repair/*.test.js
  • pnpm run check
  • autoreview --mode branch --base origin/main

@clawsweeper

clawsweeper Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed June 7, 2026, 10:44 PM ET / 02:44 UTC.

Summary
The branch adds Plugin SDK impact classification, report metadata and markers, apply-time label/check-run sync, workflow app permissions for Checks/Members, and unit coverage.

Reproducibility: yes. for the review finding: source inspection shows mergedOpenClawRfcPulls calls ghJson for each RFC link without catching lookup failures, and the apply sync wrapper only handles authentication errors. Real behavior proof for the full PR remains absent.

Review metrics: 2 noteworthy metrics.

  • Patch surface: 3 files changed, +1,563/-20. The PR changes core automation, workflow permissions, and tests, so review should cover both code behavior and deployment effects.
  • Target token permissions: 2 permissions added in 3 token blocks. Checks write and Members read require GitHub App installation readiness before the new gate can work in apply.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🦐 gold shrimp
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Catch missing/inaccessible RFC PR lookups and add a regression test that the Plugin SDK gate blocks without aborting apply.
  • [P1] Add redacted terminal output, logs, or a linked artifact showing the branch produces the Plugin SDK metadata/marker and target check run in a real review/apply flow.
  • Coordinate maintainer acceptance of the new plugin-sdk taxonomy, Checks/Members app permissions, and the linked OpenClaw gate.

Proof guidance:

  • [P1] Needs real behavior proof before merge: No after-fix real behavior proof is attached; the PR body lists commands but not observed output or artifacts from autoreview/apply showing the Plugin SDK metadata, label, and check-run path. Redact private data before posting proof; updating the PR body should trigger a fresh ClawSweeper review, or a maintainer can comment @clawsweeper re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • [P1] A malformed, deleted, or inaccessible openclaw/rfcs link can make the new Plugin SDK impact check sync throw during apply instead of leaving the gate blocked for maintainer action.
  • [P1] The target repository GitHub App installation must have Checks: write and Members: read before the new gate can operate, so merge requires operator readiness beyond normal CI.
  • [P1] The linked companion OpenClaw gate at ci: add plugin SDK impact gate openclaw#91290 is still open, so the producer and consumer sides should be reviewed together before relying on this contract.
  • [P1] The PR still needs redacted real behavior proof showing the branch behavior after the fix, not only command names.

Maintainer options:

  1. Fail closed on bad RFC links (recommended)
    Catch missing or inaccessible openclaw/rfcs lookups and treat them as unsatisfied RFC evidence, with a regression test that proves apply continues and publishes a blocked gate.
  2. Accept the app-permission rollout
    Maintainers can intentionally merge the Checks/Members permission expansion after confirming the OpenClaw installation and companion gate are ready.
  3. Pause until the companion lands
    If the OpenClaw-side gate is not ready, keep this PR open until ci: add plugin SDK impact gate openclaw#91290 is reviewed or replaced.

Next step before merge

  • [P1] Human review is needed because the PR lacks real behavior proof and changes a maintainer-visible taxonomy/app-permission contract; the RFC lookup defect should be fixed before merge.

Security
Cleared: No unrelated supply-chain or secret-handling change was found; the intentional Checks/Members permission expansion is merge-risk tracked for maintainer approval.

Review findings

  • [P1] Fail closed when RFC links cannot be resolved — src/clawsweeper.ts:14264-14267
Review details

Best possible solution:

Land the Plugin SDK gate after RFC lookups fail closed, the companion OpenClaw gate and app permissions are accepted, and redacted runtime proof shows the label/check metadata working on a real branch review.

Do we have a high-confidence way to reproduce the issue?

Yes for the review finding: source inspection shows mergedOpenClawRfcPulls calls ghJson for each RFC link without catching lookup failures, and the apply sync wrapper only handles authentication errors. Real behavior proof for the full PR remains absent.

Is this the best way to solve the issue?

No: the taxonomy and check-run direction may be reasonable, but the implementation should treat invalid RFC links as an unsatisfied gate instead of an apply failure. It also needs maintainer acceptance of the permission/taxonomy contract and redacted runtime proof before merge.

Full review comments:

  • [P1] Fail closed when RFC links cannot be resolved — src/clawsweeper.ts:14264-14267
    For breaking or architecture classifications, this code dereferences every openclaw/rfcs link with ghJson. If the PR body contains a mistyped, deleted, or inaccessible RFC PR, that 404 will throw; the apply wrapper around syncPluginSdkImpactCheckRun only handles auth errors, so one bad link can abort the apply run instead of publishing a blocked Plugin SDK gate. Catch lookup failures and treat that RFC as unsatisfied, with a regression test.
    Confidence: 0.88

Overall correctness: patch is incorrect
Overall confidence: 0.87

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 413ef258a1ef.

Label changes

Label changes:

  • add merge-risk: 🚨 compatibility: The workflow now requires additional target GitHub App permissions, so existing installations may need operator action before the gate works.

Label justifications:

  • P2: This is a normal-priority automation feature with a concrete apply-lane defect and limited repository blast radius.
  • merge-risk: 🚨 automation: The diff changes label sync, check-run publishing, and apply-lane behavior that can break ClawSweeper automation after merge.
  • merge-risk: 🚨 compatibility: The workflow now requires additional target GitHub App permissions, so existing installations may need operator action before the gate works.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🦐 gold shrimp.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: No after-fix real behavior proof is attached; the PR body lists commands but not observed output or artifacts from autoreview/apply showing the Plugin SDK metadata, label, and check-run path. Redact private data before posting proof; updating the PR body should trigger a fresh ClawSweeper review, or a maintainer can comment @clawsweeper re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

What I checked:

  • AGENTS policy: AGENTS.md was read fully; its automation-safety guidance applies because this PR changes both src/clawsweeper.ts and .github/workflows/sweep.yml. (AGENTS.md:1, 413ef258a1ef)
  • RFC lookup can throw during apply: The PR branch dereferences every openclaw/rfcs PR number through ghJson inside mergedOpenClawRfcPulls without catching 404 or access failures. (src/clawsweeper.ts:14264, 8b117f854c8c)
  • Apply wrapper only handles auth errors: The new check-run sync is called inside the existing label-sync try block, which rethrows non-authentication errors; a bad RFC lookup would abort the apply path rather than publish a blocked gate. (src/clawsweeper.ts:16767, 8b117f854c8c)
  • Permission surface changed: The workflow adds permission-checks: write and permission-members: read to all three target write token blocks. (.github/workflows/sweep.yml:237, 8b117f854c8c)
  • Tests cover happy paths, not bad RFC references: The added tests cover publishing a successful Plugin SDK impact check, taxonomy labels, classification, and token permission assertions, but the inspected test slice does not include a nonexistent or inaccessible openclaw/rfcs link case. (test/clawsweeper.test.ts:6007, 8b117f854c8c)
  • Proof still missing: The PR body lists verification commands but does not include observed terminal output, logs, screenshots, recordings, or linked artifacts for the new runtime check/label behavior. (8b117f854c8c)

Likely related people:

  • Tak Hoffman: Blame shows commit 1ff140f introduced the current apply command, label sync helpers, and target write token workflow blocks that this PR extends. (role: introduced behavior; confidence: high; commits: 1ff140ff45c6; files: src/clawsweeper.ts, .github/workflows/sweep.yml)
  • Jesse Merhi: Blame shows commit 2af278e recently modified PR label-sync state in the same apply path that this PR extends for Plugin SDK impact labels and checks. (role: recent adjacent contributor; confidence: medium; commits: 2af278ebe246; files: src/clawsweeper.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 8, 2026
@clawsweeper clawsweeper Bot added P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. labels Jun 8, 2026
@jalehman jalehman requested a review from a team as a code owner June 8, 2026 02:38
@clawsweeper clawsweeper Bot added the merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. P2 Normal priority bug or improvement with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant