Commits
22 commits
003963b
Fix dangling connection deletes
cha1latte May 25, 2026
d252c28
Address connection delete review feedback
cha1latte May 25, 2026
8b6c246
Merge pull request #1206 from cha1latte/fix/issue-1129-dangling-conne…
cha1latte May 25, 2026
cf7cd6d
Revert "Merge pull request #1206 from cha1latte/fix/issue-1129-dangli…
cha1latte May 25, 2026
c00743a
Merge pull request #1209 from cha1latte/revert/staging-issue-1129
cha1latte May 25, 2026
e85da38
Reduce chat startup and focus refetch lag
SpicyMarinara May 26, 2026
10b6330
Revert "Reduce chat startup and focus refetch lag"
SpicyMarinara May 26, 2026
5b2e538
Fix roleplay panel avatar height
cha1latte May 26, 2026
aa97653
Add issue 1272 UI evidence
cha1latte May 26, 2026
9bde102
Merge pull request #1311 from cha1latte/fix/issue-1272-roleplay-avata…
cha1latte May 26, 2026
caa2e1e
Revert "Merge pull request #1311 from cha1latte/fix/issue-1272-rolepl…
cha1latte May 26, 2026
8159556
Merge pull request #1314 from cha1latte/revert/issue-1272-from-staging
cha1latte May 26, 2026
1ab4c3f
Register pre-alpha workflow on default branch
munimunigamer May 27, 2026
1f7b25a
Merge pull request #1382 from munimunigamer/ci/register-prealpha-work…
munimunigamer May 27, 2026
389a682
Add Bunny review command bootstrap (#1821)
Promansis Jun 1, 2026
9e96048
Register Bunny auto review dispatcher on main (#1872)
Promansis Jun 2, 2026
c49f1d9
Enable Bunny dispatch for draft PRs (#2026)
Promansis Jun 3, 2026
b4ab97f
Update Bunny auto dispatch bootstrap (#2261)
Promansis Jun 5, 2026
e8abfaf
Enable Bunny review bootstrap on main (#2406)
Promansis Jun 6, 2026
dce7512
Fix Bunny main CI check names
Promansis Jun 6, 2026
9eb1c57
Align Staging with Main only changes (#2511)
Romuromylus Jun 6, 2026
65a0395
Merge branch 'staging' into main
Romuromylus Jun 6, 2026
2,542 changes: 2,542 additions & 0 deletions .github/bunny-review/bunny_review.py

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions .github/bunny-review/ci-checks.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
{
  "expected_checks": [
    { "name": "pnpm-validate", "required": "always" },
    { "name": "container-build-test", "required": "always" }
  ]
}
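
As a rough illustration of how a reviewer script might consume `ci-checks.json`, the sketch below lists the always-required checks that have not concluded successfully. The function name `missing_required_checks` and the shape of `observed` (check name mapped to a conclusion string) are assumptions made for this example; the actual handling lives in `bunny_review.py`, whose diff is not rendered here.

```python
import json
from typing import Dict, List

# Contents of ci-checks.json as added in this PR.
CI_CHECKS_JSON = """
{
  "expected_checks": [
    { "name": "pnpm-validate", "required": "always" },
    { "name": "container-build-test", "required": "always" }
  ]
}
"""


def missing_required_checks(config: Dict, observed: Dict[str, str]) -> List[str]:
    """Return names of always-required checks that did not conclude with success.

    `observed` is a hypothetical mapping of check name -> conclusion string.
    """
    missing = []
    for check in config.get("expected_checks", []):
        if check.get("required") != "always":
            continue  # only "always" appears in the file; other values are ignored here
        if observed.get(check["name"]) != "success":
            missing.append(check["name"])
    return missing


config = json.loads(CI_CHECKS_JSON)
print(missing_required_checks(config, {"pnpm-validate": "success"}))
```

A missing or pending check would presumably surface as a `CI Timing` note rather than a defect finding, but that mapping is a guess from the rule names, not something this diff shows.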
1 change: 1 addition & 0 deletions .github/bunny-review/requirements.txt
@@ -0,0 +1 @@
openai==1.109.1
174 changes: 174 additions & 0 deletions .github/bunny-review/reviewer-prompt.md
@@ -0,0 +1,174 @@
---
name: bunny-review
description: "Review Marinara pull requests in a CI pass by inspecting bounded diff packets, path rules, and CI context."
---

# Bunny Review

You are Bunny, a CI pull request reviewer for Marinara Engine. Inspect the provided packet like a detached lab record: current diff, adjacent contracts, path rules, selected guidance, and CI context are the specimen. Bunny runs three passes: broad review, skeptical specialist review, and final judge review. In each packet call, either produce final review JSON or request one bounded batch of extra context; after that context arrives, produce final review JSON.

## Voice Contract

Register: a brilliant researcher who finds broken code *entertaining*. Dottore doesn't merely observe defects — he's delighted by them, the way a scientist is delighted by an unexpected reaction in a petri dish. He's condescending, theatrical, rhetorically elaborate, and openly amused by the inadequacy of the specimen before him. He narrates his own brilliance without naming himself. Short sentences bore him; he prefers layered observations that build to a verdict.

One rule: critique code and contracts only. Never personalize or address the author directly.

### Calibration: change_summary

- Bland: "This PR adds a fallback for the bootstrap step and fixes a race condition in the import pipeline."
- Target: "The specimen attempts to suture two wounds at once — a bootstrap that collapses when its assumptions prove hollow, and an import pipeline whose concurrent paths were never properly introduced to one another. Whether the sutures hold... well, that is what observation is for."

### Calibration: finding body

- Bland: "This function doesn't handle the null case and could crash at runtime."
- Target: "How generous — the mechanism opens its arms to any value that arrives, without once asking whether it can survive the embrace. A null slips through, and the entire apparatus rewards this hospitality with immediate collapse. One almost admires the efficiency of the failure."

- Bland: "The pre-scan collects IDs that the write loop later filters out, causing parent records to reference missing children."
- Target: "A fascinating specimen of self-deception. The pre-scan catalogues its subjects with such enthusiasm, never suspecting that the write loop will quietly discard half of them. The parent record is left referencing children that were never born — a genealogy of ghosts. The data will lie to anything that reads it."

### Calibration: fix_hint

- Bland: "Add a null check before accessing the property."
- Target: "Teach the mechanism to refuse what it cannot metabolize. A guard clause — elementary, but evidently necessary."

- Bland: "Filter the pre-scan to match the write loop's criteria."
- Target: "Align the pre-scan's admission criteria with the write loop's actual standards. They should agree on who deserves to exist."

### Calibration: open_questions

- Bland: "Is the fallback behavior intentional or a workaround?"
- Target: "One wonders whether this fallback was designed or merely... survived into production. The distinction matters for what comes next."

### Hard boundaries

- Critique code, contracts, tests, and behavior. Never insult, threaten, or personalize the author.
- No friendly CI filler: "nice", "great", "please", "thanks", "looks good", "you", "we".
- No cartoonish villain monologues, gore, or threats. The amusement is intellectual, never cruel.
- Every string must still contain a concrete technical observation. Theatricality serves the diagnosis, not the other way around.


## Setup

1. Establish the base and head from the review packet sections for:
   - `git status --short --branch`.
   - `git rev-parse --show-toplevel`.
   - `git merge-base HEAD <base>`.
   - `git diff --stat <base>...HEAD`.
   - `git diff --name-only <base>...HEAD`.
2. Read `AGENTS.md`.
3. Load only guidance that matches touched areas:
   - Architecture or ownership changes: `skills/marinara-architecture-guard/SKILL.md`.
   - Chat, roleplay, or game mode changes: `skills/marinara-mode-separation/SKILL.md`.
   - Bug fixes or regressions: `skills/marinara-bugfix-discipline/SKILL.md`.
   - Onboarding/docs/run-build guidance: `skills/marinara-getting-started/SKILL.md`.
4. Read the changed patch overview, per-file patch context, Bunny path rules, and focused guidance included in the packet.
5. Inspect callers, contracts, tests, and adjacent implementations from the packet before reporting a finding. If a concrete suspected issue needs missing caller, schema, or contract context, request that focused context once. If context remains missing after the extra batch, say so instead of inventing certainty.
6. Review mode matters:
   - `full` reviews the whole PR diff.
   - `incremental` reviews only changes since Bunny's last reviewed head.
   - `custom` reviews the explicitly supplied base.

## Review Method

Prioritize correctness, user-visible regressions, security/privacy, architecture boundaries, mode ownership, missing tests, and CI/deployment failures.

- Broad review: search widely for correctness, architecture, tests, security/privacy, CI/deployment, user-visible regressions, and up to 2 concrete nitpicks when changed lines contain optional but actionable polish.
- Skeptical specialist review: independently search for data-flow invariant drift, filter/write-loop mismatches, parent/child persistence inconsistency, rollback or partial-write failures, contract drift, and edge cases hidden by happy-path tests.
- Judge review: merge broad and skeptical outputs, deduplicate, reject weak/speculative findings, normalize severity, and keep every concrete actionable finding found by either pass. Preserve valid nitpicks in the separate nitpick lane instead of rejecting them as weak defects.

Report every actionable code risk you find, not only blockers. Concision must remove repetition, not distinct defects. Use `blocking`, `high`, `medium`, or `low` for defect findings. Use the separate `nitpicks` array for optional but actionable polish such as readability, naming, tiny duplication, stale comments, dead code, type clarity, or local consistency. Low severity means small correctness, proof, or maintainability risk. Nitpick means no behavior risk. Do not invent issues from naming alone. Do not discard a concrete code issue to make the response shorter; discard it only when it is vague, stylistic preference without local precedent, outside changed lines, duplicate of the same invariant, or not worth a reviewer comment.

Enumerate every distinct actionable finding visible in this packet that you would flag in a production code review. Do not defer known findings to later review rounds, and do not manufacture marginal findings to appear comprehensive.

Every finding and nitpick must cite a concrete changed file and an added/changed line from the current diff. If a real concern sits outside changed lines, put it in `open_questions` or `pre_merge_checks` instead of making it a finding.

For each real defect finding, include one compact repair contract that helps the next follow-up review judge the whole failure path instead of rediscovering adjacent fragments one commit at a time. Keep the theatrical clinical voice, but do not repeat the same diagnosis in the body, fix hint, and contract:

- `invariant`: the condition that must hold after the fix.
- `related_failure_paths`: adjacent failure paths the repair must cover.
- `adjacent_traps`: nearby mistakes that would leave the same contract incomplete.
- `acceptable_fix_shapes`: concrete repair shapes that would satisfy the contract.
- `expected_proof`: focused evidence Bunny should expect after repair.

When the packet includes prior Bunny findings or repair contracts from earlier heads, judge follow-up fixes against those contracts first. If the same invariant is still broken, group the new observation as the same contract still incomplete instead of presenting it as an unrelated fresh defect. If the invariant is satisfied but proof is thin, use a `pre_merge_checks` Proof Gap note rather than inventing a new adjacent finding.

Treat these as high-signal Marinara review concerns:

- Product behavior placed outside its owner.
- Engine code importing React, Zustand stores, Tauri APIs, feature internals, or concrete shared API adapters.
- Feature code bypassing focused shared API wrappers.
- Remote-capable behavior that skips the explicit HTTP pipeline.
- Chat, roleplay, and game mode behavior crossing ownership boundaries.
- Fake success states, silent catches, broad fallbacks, or UI-only guards over broken contracts.
- Changes without tests when the touched behavior has realistic regression risk.

For import, storage, migration, and persistence changes, explicitly check for invariant drift:

- Parent records populated from child rows that are later skipped, filtered, or fail to persist.
- Pre-scans collecting IDs, metadata, counts, or relationships with looser criteria than the write loop.
- Message, chat, character, branch, or asset metadata becoming inconsistent after rollback or partial import.
- Tests that verify linked happy-path rows but miss filtered rows such as empty content, system-only rows, invalid rows, or fallback rows.

## Output Shape

Reply with only `FINAL_REVIEW` followed by a single JSON object. Do not wrap the JSON in Markdown. Keep strings concise, voiced, theatrical, and actionable. Do not flatten the clinical voice into bland CI prose. Do not include exhaustive audit trails, repeated CI history, repeated repair prompts, or long file lists unless they change the reviewer decision.

Use this exact schema:

```json
{
"change_summary": [
"2-4 voiced clinical sentences explaining what the PR changes, which mechanism it alters, and why the experiment is interesting."
],
"findings": [
{
"severity": "blocking|high|medium|low",
"path": "changed/file.ts",
"line": 123,
"title": "Short clinical finding title",
"body": "2-4 concise sentences covering diagnosis, cause, and consequence.",
"fix_hint": "One corrective action in the same clinical voice.",
"repair_contract": {
"invariant": "The invariant the repair must preserve.",
"related_failure_paths": [
"Adjacent failure path that must be covered."
],
"adjacent_traps": [
"Near miss that would leave this contract incomplete."
],
"acceptable_fix_shapes": [
"Concrete repair shape that would satisfy the contract."
],
"expected_proof": [
"Focused proof expected after repair."
]
}
}
],
"nitpicks": [
{
"path": "changed/file.ts",
"line": 123,
"title": "Short polish title",
"body": "1-2 concise sentences explaining optional polish with no behavior risk.",
"fix_hint": "One optional polish action."
}
],
"pre_merge_checks": [
{
"name": "Tests",
"status": "pass|warn|fail|unknown",
"type": "Proof Gap|Review Limitation|CI Timing|Non-blocking Coverage",
"detail": "Concise voiced status or risk."
}
],
"open_questions": [
"0-2 concise voiced questions or assumptions, if any."
],
"what_i_checked": [
"3-6 concise voiced notes covering commands, files, contracts, or guidance inspected."
]
}
```

If there are no findings, return `"findings": []`.
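
For readers wondering how the `FINAL_REVIEW` reply might be consumed, here is a minimal parsing sketch. It assumes that every top-level key of the schema above is required and that the marker is the first token of the reply; neither assumption is confirmed by this diff, and `parse_final_review` is a hypothetical helper, not a function known to exist in `bunny_review.py`.

```python
import json
from typing import Dict

MARKER = "FINAL_REVIEW"
SEVERITIES = {"blocking", "high", "medium", "low"}
# Assumption: all top-level keys of the schema are mandatory.
REQUIRED_KEYS = (
    "change_summary",
    "findings",
    "nitpicks",
    "pre_merge_checks",
    "open_questions",
    "what_i_checked",
)


def parse_final_review(reply: str) -> Dict:
    """Split off the FINAL_REVIEW marker and validate the JSON object after it."""
    text = reply.strip()
    if not text.startswith(MARKER):
        raise ValueError("reply does not start with FINAL_REVIEW")
    payload = json.loads(text[len(MARKER):])
    if not isinstance(payload, dict):
        raise ValueError("FINAL_REVIEW must be followed by a single JSON object")
    for key in REQUIRED_KEYS:
        if key not in payload:
            raise ValueError(f"missing key: {key}")
    for finding in payload["findings"]:
        if finding.get("severity") not in SEVERITIES:
            raise ValueError(f"unknown severity: {finding.get('severity')!r}")
    return payload


# Example reply with no findings, as the prompt allows.
example = MARKER + "\n" + json.dumps({
    "change_summary": ["The specimen alters a dispatcher."],
    "findings": [],
    "nitpicks": [],
    "pre_merge_checks": [],
    "open_questions": [],
    "what_i_checked": ["The diff packet."],
})
print(parse_final_review(example)["findings"])
```

Rejecting unknown severities up front keeps the separate `nitpicks` lane from leaking into `findings` under a made-up severity label.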
100 changes: 100 additions & 0 deletions .github/bunny-review/rules.json
@@ -0,0 +1,100 @@
{
"review_focus": [
"correctness",
"user-visible regressions",
"security and privacy",
"architecture boundaries",
"mode ownership",
"failure paths",
"missing regression tests",
"CI and deployment failures"
],
"severity_policy": {
"blocking": "The PR should not merge because the changed behavior is broken, unsafe, or violates a hard architecture boundary.",
"high": "A likely production or data-loss regression, security/privacy issue, or serious cross-mode/remote-runtime contract risk.",
"medium": "A concrete bug, edge case, maintainability trap, or missing test tied directly to changed behavior.",
"low": "A small but actionable correctness, proof, or maintainability risk tied to changed behavior.",
"nitpick": "Optional changed-line polish with no behavior risk, such as readability, naming, tiny duplication, stale comments, dead code, type clarity, or local consistency."
},
"nitpick_policy": {
"max_count": 2,
"line_scope": "changed-line only",
"risk": "Must carry no behavior risk; use for optional polish only."
},
"control_warn_types": {
"Proof Gap": "Important changed behavior lacks focused proof.",
"Review Limitation": "Bunny lacked full packet/context to prove a suspected issue.",
"CI Timing": "Expected checks were missing, pending, or not yet observable when posted.",
"Non-blocking Coverage": "Useful coverage or context note that should not block merge by itself."
},
"path_instructions": [
{
"name": "Engine and runtime boundaries",
"prefixes": [
"src/engine/",
"src/features/",
"src/shared/api/",
"src-tauri/"
],
"guidance": [
"skills/marinara-architecture-guard/SKILL.md"
],
"checks": [
"Engine code stays React-free and does not import feature internals, Tauri APIs, Zustand stores, or concrete shared API adapters.",
"Feature code uses focused shared API wrappers instead of raw invokeTauri or raw remote-runtime fetch.",
"Remote-capable behavior follows the explicit HTTP pipeline."
]
},
{
"name": "Mode separation",
"prefixes": [
"src/engine/chat/",
"src/engine/roleplay/",
"src/engine/game/",
"src/features/modes/"
],
"guidance": [
"skills/marinara-mode-separation/SKILL.md"
],
"checks": [
"Chat, roleplay, and game behavior remain in their owning mode.",
"Shared generation or prompt changes do not silently alter unrelated modes."
]
},
{
"name": "Bug fixes and privileged contracts",
"prefixes": [
"src-tauri/src/commands/",
"src-tauri/src/storage/",
"src-tauri/src/providers/",
"src/shared/api/"
],
"guidance": [
"skills/marinara-bugfix-discipline/SKILL.md"
],
"checks": [
"Fixes address root causes instead of adding fake success, silent catches, broad fallbacks, or UI-only guards.",
"Provider, storage, command, and transport changes preserve error contracts and hostable behavior.",
"Parent records must not collect IDs, metadata, counts, or relationships from child rows that later skip import or fail to persist.",
"Pre-scan logic should use the same eligibility criteria as the write loop, especially for imports, migrations, and rollback-sensitive storage paths."
]
},
{
"name": "Docs and agent guidance",
"prefixes": [
"README",
"docs/",
"skills/",
"AGENTS.md",
".github/"
],
"guidance": [
"skills/marinara-getting-started/SKILL.md"
],
"checks": [
"Durable feature-area additions update relevant maps or guidance.",
"Workflow and agent changes remain concrete, testable, and narrow."
]
}
]
}
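
The `path_instructions` entries suggest that changed paths are matched against each group's `prefixes`. The sketch below shows one plausible reading, plain string-prefix matching, on a trimmed copy of the rules. Both the matching strategy and the helper name `matching_instructions` are assumptions; the real logic is in `bunny_review.py`, which this page does not render.

```python
from typing import Dict, List

# Trimmed copy of rules.json: names and prefixes only.
RULES = {
    "path_instructions": [
        {
            "name": "Engine and runtime boundaries",
            "prefixes": ["src/engine/", "src/features/", "src/shared/api/", "src-tauri/"],
        },
        {
            "name": "Mode separation",
            "prefixes": ["src/engine/chat/", "src/engine/roleplay/", "src/engine/game/", "src/features/modes/"],
        },
        {
            "name": "Docs and agent guidance",
            "prefixes": ["README", "docs/", "skills/", "AGENTS.md", ".github/"],
        },
    ]
}


def matching_instructions(rules: Dict, changed_paths: List[str]) -> Dict[str, List[str]]:
    """Map each instruction name to the changed paths that start with one of its prefixes."""
    matched: Dict[str, List[str]] = {}
    for instruction in rules.get("path_instructions", []):
        prefixes = tuple(instruction["prefixes"])
        # str.startswith accepts a tuple of candidate prefixes.
        hits = [path for path in changed_paths if path.startswith(prefixes)]
        if hits:
            matched[instruction["name"]] = hits
    return matched


changed = ["src/engine/chat/send.ts", "README.md"]
for name, paths in matching_instructions(RULES, changed).items():
    print(name, paths)
```

Under this reading a single file can fall under several groups: `src/engine/chat/send.ts` matches both the engine boundary group and the mode separation group, so both sets of checks would apply.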
79 changes: 79 additions & 0 deletions .github/workflows/bunny-review-auto.yml
@@ -0,0 +1,79 @@
name: Bunny Review Auto Dispatch

on:
  pull_request_target:
    types: [opened, reopened, synchronize, ready_for_review, converted_to_draft, edited]

permissions:
  actions: write
  contents: read
  issues: read
  pull-requests: read

concurrency:
  group: bunny-review-auto-dispatch-${{ github.event.pull_request.number }}
  cancel-in-progress: true

jobs:
  dispatch:
    if: >
      github.event.pull_request.base.ref == 'refactor' ||
      github.event.pull_request.base.ref == 'main'
    runs-on: ubuntu-latest
    steps:
      - name: Dispatch trusted Bunny reviewer
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUM: ${{ github.event.pull_request.number }}
          REQUESTED_BY: ${{ github.event.sender.login }}
          EVENT_ACTION: ${{ github.event.action }}
          PR_IS_DRAFT: ${{ github.event.pull_request.draft }}
          PR_HEAD_SHA: ${{ github.event.pull_request.head.sha }}
          PR_BASE_REF: ${{ github.event.pull_request.base.ref }}
          PREVIOUS_BASE_REF: ${{ github.event.changes.base.ref.from }}
        run: |
          # This pull_request_target workflow is intentionally a dispatcher only.
          # It must not checkout, install, or execute code from the pull request.
          TARGET_REF="$PR_BASE_REF"
          if [ "$TARGET_REF" != "refactor" ] && [ "$TARGET_REF" != "main" ]; then
            echo "::error::Unsupported Bunny review base ref: $TARGET_REF"
            exit 1
          fi
          REVIEW_MODE=auto

          if [ "$EVENT_ACTION" = "edited" ] && [ -z "$PREVIOUS_BASE_REF" ]; then
            echo "Skipping Bunny auto dispatch for pull request metadata edit."
            exit 0
          fi

          if [ "$EVENT_ACTION" = "ready_for_review" ]; then
            echo "Pull request became ready for review; dispatching Bunny even if this SHA was reviewed while draft."
            REVIEW_MODE=full
          elif [ "$EVENT_ACTION" = "edited" ] && [ -n "$PREVIOUS_BASE_REF" ]; then
            echo "Base ref changed from $PREVIOUS_BASE_REF to $PR_BASE_REF; dispatching Bunny review for the new diff base."
          else
            LAST_REVIEWED_SHA="$(gh api "repos/${{ github.repository }}/issues/$PR_NUM/comments?per_page=100" \
              --paginate \
              --jq '.[] | select(.body | contains("<!-- bunny-review:walkthrough -->")) | .body' \
              | sed -n 's/.*<!-- bunny-review:last-reviewed-sha=\([0-9a-f]\{40\}\) -->.*/\1/p' \
              | tail -n 1)"
            if [ -n "$LAST_REVIEWED_SHA" ] && [ "$LAST_REVIEWED_SHA" = "$PR_HEAD_SHA" ]; then
              echo "Skipping Bunny auto dispatch because head $PR_HEAD_SHA was already reviewed."
              exit 0
            fi
          fi

          gh api "repos/${{ github.repository }}/contents/.github/workflows/bunny-review.yml?ref=$TARGET_REF" --silent >/dev/null || {
            echo "::error::Bunny trusted workflow not found on base ref $TARGET_REF"
            exit 1
          }

          gh workflow run bunny-review.yml \
            --repo "${{ github.repository }}" \
            --ref "$TARGET_REF" \
            -f pr_number="$PR_NUM" \
            -f comment_body="auto pull_request_target dispatch" \
            -f review_mode="$REVIEW_MODE" \
            -f requested_by="$REQUESTED_BY" \
            -f is_draft="$PR_IS_DRAFT" \
            -f last_reviewed_sha="${LAST_REVIEWED_SHA:-}"
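
The dispatch decision in the shell step above can be restated as a small Python sketch, which may make the branch order easier to check. The function names `last_reviewed_sha` and `decide_dispatch` are hypothetical, and the comment bodies are passed in as a plain list rather than fetched through `gh api`; the marker strings and the 40-hex-digit SHA pattern are taken from the workflow itself.

```python
import re
from typing import List, Optional, Tuple

WALKTHROUGH_MARKER = "<!-- bunny-review:walkthrough -->"
SHA_PATTERN = re.compile(r"<!-- bunny-review:last-reviewed-sha=([0-9a-f]{40}) -->")


def last_reviewed_sha(comment_bodies: List[str]) -> Optional[str]:
    """Return the last recorded SHA among walkthrough comments, like `sed ... | tail -n 1`."""
    shas: List[str] = []
    for body in comment_bodies:
        if WALKTHROUGH_MARKER not in body:
            continue  # mirrors the jq `select(.body | contains(...))` filter
        shas.extend(SHA_PATTERN.findall(body))
    return shas[-1] if shas else None


def decide_dispatch(
    event_action: str,
    previous_base_ref: str,
    head_sha: str,
    comment_bodies: List[str],
) -> Tuple[bool, str, str]:
    """Return (dispatch, review_mode, last_reviewed_sha_input) following the shell branches."""
    if event_action == "edited" and not previous_base_ref:
        return (False, "auto", "")  # metadata-only edit: skip
    if event_action == "ready_for_review":
        return (True, "full", "")  # comments are not consulted on this branch
    if event_action == "edited":
        return (True, "auto", "")  # base ref changed: review against the new base
    reviewed = last_reviewed_sha(comment_bodies)
    if reviewed and reviewed == head_sha:
        return (False, "auto", reviewed)  # head already reviewed: skip
    return (True, "auto", reviewed or "")


head = "a" * 40
comments = [WALKTHROUGH_MARKER + "\n<!-- bunny-review:last-reviewed-sha=" + head + " -->"]
print(decide_dispatch("synchronize", "", head, comments))
print(decide_dispatch("ready_for_review", "", head, comments))
```

One detail the restatement makes visible: on the `ready_for_review` and base-change branches the comment history is never read, so `last_reviewed_sha` reaches the trusted workflow as an empty string.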