feat: /codex skill — multi-AI second opinion + proactive suggestions by garrytan · Pull Request #197 · garrytan/gstack

garrytan · 2026-03-19T04:12:17Z

Summary

/codex skill — three modes: review (diff review with pass/fail gate), challenge (adversarial — tries to break your code), and consult (ask anything with session continuity)
Cross-model analysis — when both /review and /codex review run, shows which findings overlap and which are unique to each AI
Integrated into /review, /ship, /plan-eng-review — Codex second opinion offered after Claude's own review, optional gate in ship, plan critique before eng review
Proactive skill suggestions — gstack notices your development stage and suggests the right skill; opt out with "stop suggesting"
Trigger phrase validation tests — ensures all skills have "Use when" and "Proactively suggest" phrases for reliable NLP routing
Bug fixes from Codex adversarial challenge: scoped plan lookup (cross-project leak), mktemp for stderr (race condition), quoted path variables, .context/ gitignored (session ID leak), ARG_MAX-safe plan review
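
The cross-model analysis described above amounts to a set comparison between two lists of findings. The following is a minimal sketch of that idea, not the skill's actual logic: the `Finding` shape, the `keyOf` matching rule (same file and line), and the `compareFindings` name are all assumptions made for illustration.

```typescript
// Hypothetical shape for one review finding; not gstack's actual format.
interface Finding {
  file: string;
  line: number;
  summary: string;
}

interface Comparison {
  overlap: Finding[];    // reported by both reviewers
  onlyClaude: Finding[]; // unique to the /review pass
  onlyCodex: Finding[];  // unique to the /codex review pass
}

// Assumption: two findings match when they point at the same file and line.
function keyOf(f: Finding): string {
  return f.file + ":" + f.line;
}

function compareFindings(claude: Finding[], codex: Finding[]): Comparison {
  const inCodex = (f: Finding) => codex.some((c) => keyOf(c) === keyOf(f));
  const inClaude = (f: Finding) => claude.some((c) => keyOf(c) === keyOf(f));
  return {
    overlap: claude.filter(inCodex),
    onlyClaude: claude.filter((f) => !inCodex(f)),
    onlyCodex: codex.filter((f) => !inClaude(f)),
  };
}
```

Matching on file and line is deliberately crude. Two models rarely phrase a finding the same way, so a real comparison would likely need fuzzier matching.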

Pre-Landing Review

No issues found. All changes are SKILL.md templates, test files, gen-skill-docs.ts, .gitignore, and generated SKILL.md files.

Test plan

All unit/validation tests pass (bun test — 0 failures)
Codex review PASS (3 P2 findings, all fixed)
Codex adversarial challenge run (4 critical/high, 6 medium — all addressed)
Merge conflicts with main resolved (careful/freeze/guard/unfreeze skills)
gen:skill-docs regenerates all 22 SKILL.md files successfully

🤖 Generated with Claude Code

…ult) Three modes: code review with pass/fail gate, adversarial challenge mode, and conversational consult with session continuity. First multi-AI skill in gstack, wrapping OpenAI's Codex CLI.

/review offers Codex second opinion after completing its own review. /ship offers Codex review as optional gate before pushing. /plan-eng-review offers Codex plan critique after scope challenge. Review Readiness Dashboard shows Codex Review as optional row.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Stub tests (free tier): verify template content — three modes, gate verdict, session continuity, cost tracking, cross-model comparison, binary discovery, error handling, mktemp usage, and integrations into /review, /ship, /plan-eng-review. E2E test (paid tier): runs /codex review on vulnerable fixture repo via session-runner, verifies output contains findings and GATE verdict.

Codex authenticates via ChatGPT OAuth (codex login), not an env var.

gpt-5.2-codex is the only model available with ChatGPT login. All commands now use model_reasoning_effort="high" for maximum depth — the whole point is a thorough second opinion.

…e) + web search Review and consult use high reasoning — thorough but not slow. Challenge (adversarial) uses xhigh — maximum depth for breaking code. All modes enable web_search_cached so Codex can look up docs/APIs.
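
The per-mode settings in this commit message can be summarized as a small lookup table. This is a sketch only: the `CodexMode` type, the `settingsFor` helper, and the `webSearchCached` field name are hypothetical, and how the values reach the Codex CLI is not shown here.

```typescript
type CodexMode = "review" | "challenge" | "consult";
type ReasoningEffort = "high" | "xhigh";

// Hypothetical settings object; field names are illustrative.
interface ModeSettings {
  reasoningEffort: ReasoningEffort;
  webSearchCached: boolean;
}

// Effort levels as stated in the commit message:
// review and consult use high, challenge uses xhigh.
const EFFORT_BY_MODE: Record<CodexMode, ReasoningEffort> = {
  review: "high",
  consult: "high",
  challenge: "xhigh",
};

function settingsFor(mode: CodexMode): ModeSettings {
  return {
    reasoningEffort: EFFORT_BY_MODE[mode],
    webSearchCached: true, // all modes enable cached web search
  };
}
```

Keeping the mapping in one table makes a later change, such as raising every mode to `xhigh`, a one-line edit.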

Use --json flag to parse codex's JSONL events, extracting reasoning traces ([codex thinking]), tool calls ([codex ran]), and token counts. This gives richer output than the -o flag alone — you can see what codex thought through before its answer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
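
As a rough illustration of what parsing the JSONL stream involves, the sketch below reads one JSON object per line and collects reasoning traces, tool calls, and token counts. The event shapes (`reasoning`, `command`, `usage`) and their field names are placeholders chosen for this example; the real Codex CLI event schema may differ. Only the `[codex thinking]` and `[codex ran]` labels come from the commit message.

```typescript
// Placeholder event shapes: the real Codex CLI JSONL schema may differ.
type CodexEvent =
  | { type: "reasoning"; text: string }
  | { type: "command"; command: string }
  | { type: "usage"; input_tokens: number; output_tokens: number };

interface ParsedRun {
  lines: string[];     // human-readable trace lines
  totalTokens: number; // sum of input and output tokens seen
}

function parseJsonl(jsonl: string): ParsedRun {
  const result: ParsedRun = { lines: [], totalTokens: 0 };
  for (const raw of jsonl.split("\n")) {
    const trimmed = raw.trim();
    if (trimmed === "") continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (typeof parsed !== "object" || parsed === null) continue;

    const event = parsed as CodexEvent;
    switch (event.type) {
      case "reasoning":
        result.lines.push("[codex thinking] " + event.text);
        break;
      case "command":
        result.lines.push("[codex ran] " + event.command);
        break;
      case "usage":
        result.totalTokens += event.input_tokens + event.output_tokens;
        break;
    }
  }
  return result;
}
```

Skipping malformed lines rather than failing keeps the trace usable when the stream is interleaved with non-JSON output.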

Don't write a codex-review entry to reviews.jsonl when only the adversarial challenge (option B) was selected — there's no gate verdict to record, and a false entry misleads the Review Readiness Dashboard into thinking a code review happened. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
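
The rule in this commit can be expressed as a guard in front of the log write. The sketch below assumes a hypothetical entry format (`skill`, `status`, `timestamp`) and a hypothetical `reviewLogLine` helper; the actual fields written to reviews.jsonl are not shown in this PR description.

```typescript
type Mode = "review" | "challenge";
type Verdict = "PASS" | "FAIL";

// Returns the JSONL line to append to reviews.jsonl, or null when nothing
// should be logged. Entry field names are illustrative assumptions.
function reviewLogLine(
  modes: Mode[],
  verdict: Verdict | null,
  timestamp: string
): string | null {
  const ranReview = modes.some((m) => m === "review");
  // A challenge-only run has no gate verdict, so it must not create an entry.
  if (!ranReview || verdict === null) return null;
  return JSON.stringify({ skill: "codex-review", status: verdict, timestamp });
}
```

Returning `null` instead of writing an entry with an empty verdict means a dashboard reading the log never sees a review that did not happen.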

After scope challenge (Step 0), offer to have Codex independently review the plan with a brutally honest tech reviewer persona. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ng, stderr

- plan-eng-review: Codex now reads the plan file itself instead of inlining content as a CLI arg (avoids ARG_MAX for large plans)
- review: add missing echo to persist codex-review results to reviews.jsonl
- codex: consult mode uses $TMPERR (mktemp) instead of hardcoded stderr path
- codex + review: quote $SLUG/$BRANCH_SLUG in review log paths
- codex: scope plan lookup to current project, warn on cross-project fallback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Codex consult mode stores session IDs in .context/codex-session-id. Without this ignore rule, session IDs could leak into commits. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Preamble reads proactive config via gstack-config
- Root SKILL.md.tmpl has lifecycle map (stage → skill suggestion)
- Users can opt out ("stop suggesting") / opt in ("be proactive again")
- Restored trigger phrase validation tests (16 skills × "Use when" check)
- Added missing "Use when" trigger phrases to /debug and /office-hours

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
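
The trigger phrase validation mentioned in this commit can be pictured as a check that each skill document contains the required phrases. This is a minimal sketch under assumptions: the helper names are hypothetical, and the repository's real tests (run with bun test) may check a narrower part of each SKILL.md, such as its description field.

```typescript
// Phrases named in the PR summary as required for routing.
const REQUIRED_PHRASES = ["Use when", "Proactively suggest"];

// Returns the required phrases that do not appear in the skill document.
function missingTriggerPhrases(skillDoc: string): string[] {
  return REQUIRED_PHRASES.filter((phrase) => skillDoc.indexOf(phrase) === -1);
}

// Maps each failing skill name to its missing phrases; passing skills are omitted.
function validateSkills(skills: Record<string, string>): Record<string, string[]> {
  const failures: Record<string, string[]> = {};
  for (const name of Object.keys(skills)) {
    const missing = missingTriggerPhrases(skills[name]);
    if (missing.length > 0) failures[name] = missing;
  }
  return failures;
}
```

Reporting the missing phrases per skill, rather than a single pass/fail flag, makes it clear which template needs editing.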

…lls) Merged main which added /careful, /freeze, /guard, /unfreeze skills, analytics tracking, proactive suggest phrases, and dirty-tree handling. Resolved conflicts by keeping both sides: codex + new safety skills in template list, deduplicated proactive config in preamble, merged trigger phrase tests with proactive phrase tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

garrytan and others added 18 commits March 18, 2026 21:11

feat: /codex skill — multi-AI second opinion (review, challenge, cons…

311d842

chore: bump version and changelog (v0.8.0)

d5e6dd3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: codex auth error message — use codex login, not OPENAI_API_KEY

0b009d2

feat: codex uses high reasoning effort by default

4e7e5de

feat: crank codex reasoning to xhigh (maximum)

4c60be7

refactor: don't hardcode model — use codex default (always latest)

5ec2dd0

feat: add codex plan review option to /plan-eng-review

22b75ff

test: update e2e test for codex skill

609daf8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: add .context/ to .gitignore to prevent session ID leaks

34e4047

chore: update changelog for v0.8.0 — add proactive suggestions note

c4c0a58

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

garrytan changed the title ~~feat: /codex skill — multi-AI second opinion platform (v0.8.0)~~ feat: /codex skill — multi-AI second opinion + proactive suggestions Mar 19, 2026

garrytan merged commit d852330 into main Mar 19, 2026

garrytan merged 18 commits into main from garrytan/codex-review-skill
