bstack P20 — Cross-Model Adversarial Review Gate. The model that wrote the code cannot be the final judge of the code. Substantive PRs fire an adversarial review (different evaluator, anti-slop ≥7/10) before merge.
npx skills add broomva/cross-review

Before pushing a substantive PR (>200 LOC OR public API OR multi-file OR governance-class):

cross-review pre-push --diff-base origin/main

The script auto-detects the Codex CLI. If Codex is installed, it fires Strata A (true cross-vendor); otherwise it falls back to Strata B (fresh subagent). Strata C (composed adversarial-review skills) always runs in parallel.
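The "substantive PR" criteria above can be read as a simple OR over four conditions. The following is a minimal sketch of that predicate, not the logic shipped in the skill: the function name `is_substantive`, its argument order, and the reading of "multi-file" as "more than one file touched" are all assumptions made for illustration.

```shell
# Hypothetical helper (not part of cross-review): decide whether a PR
# counts as "substantive" under the criteria listed above.
# Arguments (assumed for this sketch):
#   $1 = changed lines of code
#   $2 = number of files touched
#   $3 = 1 if a public API changed, else 0
#   $4 = 1 if the change is governance-class, else 0
is_substantive() {
  local loc="$1" files="$2" public_api="$3" governance="$4"
  if [ "$loc" -gt 200 ] || [ "$files" -gt 1 ] \
     || [ "$public_api" -eq 1 ] || [ "$governance" -eq 1 ]; then
    return 0   # substantive: the gate applies
  fi
  return 1     # small, single-file, internal change
}

# Usage: a 40-line, single-file change with no public API or governance impact.
if is_substantive 40 1 0 0; then
  echo "gate: run cross-review before pushing"
else
  echo "gate: not required"
fi
```

Any one condition is enough to trigger the gate, so a two-file change of ten lines still qualifies under this reading.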
When the same AI model plans, implements, and reviews, it will not challenge its own assumptions. A different model, trained differently and carrying different biases, catches what the first one misses. This pattern was systematized by Dallionking/cross-model-agents (May 2026), a 31-agent bidirectional Claude↔Codex review system with anti-slop scoring.
cross-review brings the discipline (cross-model gate is mandatory for substantive PRs) into bstack as primitive P20, while keeping our composition pattern (3 strata so the primitive works whether or not Codex is installed).
| Strata | Mechanism | Strength |
|---|---|---|
| A | codex exec -m gpt-5.4 reads diff and scores | Strongest: genuinely different model |
| B | Fresh Agent subagent under devil's-advocate brief | Strong: fresh context breaks within-session echo |
| C | Composed existing adversarial-review skills (superpowers:constructive-dissent, devils-advocate, pr-review-toolkit:*, critique, premortem) | Always: toolkit P20 makes mandatory |
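
The selection between strata can be sketched in a few lines of shell. This is an illustrative assumption about how such detection might work, not the contents of scripts/cross-review.sh: the function name `select_strata` and its output format are hypothetical, and the only real facility used is the shell's standard `command -v` lookup.

```shell
# Hypothetical sketch of strata selection (not the real entry point).
# $1 = name of the CLI to probe; defaults to "codex".
select_strata() {
  local cli="${1:-codex}"
  if command -v "$cli" >/dev/null 2>&1; then
    echo "A C"   # CLI found: cross-vendor review plus composed skills
  else
    echo "B C"   # CLI missing: fresh-subagent review plus composed skills
  fi
}

# Usage: prints the strata that would run on this machine.
select_strata
```

Strata C appears in both branches because it runs regardless of whether Codex is installed.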
Anti-slop scoring across 5 dimensions (2 pts each):
- No over-engineered abstractions
- No template-paste patterns
- Correct contracts at boundaries
- Failure modes named explicitly
- Tests cover the change
PASS at ≥7/10. LOOP if <7 (max 3 fix rounds). ESCALATE on round 3 failure.
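
The PASS / LOOP / ESCALATE rule above can be expressed as a small function. This is a sketch under stated assumptions: the name `verdict`, the argument layout (round number first, then five per-dimension scores of 0 to 2), and the output strings are hypothetical; the authoritative scoring lives in references/rubric.md.

```shell
# Hypothetical verdict helper illustrating the threshold rule.
# $1    = current fix round (1-3)
# $2-$6 = scores for the five dimensions, 0-2 points each
verdict() {
  local round="$1"; shift
  local total=0 s
  for s in "$@"; do
    total=$((total + s))   # sum the per-dimension scores
  done
  if [ "$total" -ge 7 ]; then
    echo "PASS $total/10"
  elif [ "$round" -ge 3 ]; then
    echo "ESCALATE $total/10"   # still failing on the final round
  else
    echo "LOOP $total/10"       # fix and re-review
  fi
}

verdict 1 2 2 1 1 2   # prints "PASS 8/10"
verdict 1 1 1 1 1 2   # prints "LOOP 6/10"
verdict 3 1 1 1 1 2   # prints "ESCALATE 6/10"
```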
Full rubric: references/rubric.md.
| Primitive | Role |
|---|---|
| P4 PR Pipeline | P20 fires before P4 auto-merge |
| P7 CI Watcher | After P20 passes, P7 watches CI |
| P11 Empirical Feedback | Different dimension: P11 asks "does it run", P20 asks "is it well-built" |
| P17 Lens-Routed Articulation | Lenses become evaluator stances |
| P18 Format-Follows-Audience | Verdict logged as PR comment (markdown for both audiences) |
| P19 Mechanism Selection | P20-gated PRs naturally run as a /goal arc |
- SKILL.md: full skill contract (3 strata, rubric, reflexive trigger, anti-rationalizations)
- references/rubric.md: the anti-slop scoring rubric + adversarial brief templates
- scripts/cross-review.sh: the entry point (auto-detects strata, structures the gate)
- tests/: verification battery (pressure scenarios + integration tests)
MIT — see LICENSE.