feat: /codex skill — multi-AI second opinion + proactive suggestions#197
Merged
…ult)
Three modes: code review with pass/fail gate, adversarial challenge mode, and conversational consult with session continuity. First multi-AI skill in gstack, wrapping OpenAI's Codex CLI.
/review offers Codex second opinion after completing its own review. /ship offers Codex review as optional gate before pushing. /plan-eng-review offers Codex plan critique after scope challenge. Review Readiness Dashboard shows Codex Review as optional row.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Stub tests (free tier): verify template content — three modes, gate verdict, session continuity, cost tracking, cross-model comparison, binary discovery, error handling, mktemp usage, and integrations into /review, /ship, /plan-eng-review. E2E test (paid tier): runs /codex review on vulnerable fixture repo via session-runner, verifies output contains findings and GATE verdict.
Codex authenticates via ChatGPT OAuth (codex login), not an env var.
gpt-5.2-codex is the only model available with ChatGPT login. All commands now use model_reasoning_effort="high" for maximum depth — the whole point is a thorough second opinion.
…e) + web search
Review and consult use high reasoning: thorough but not slow. Challenge (adversarial) uses xhigh: maximum depth for breaking code. All modes enable web_search_cached so Codex can look up docs/APIs.
Use the --json flag to parse Codex's JSONL events, extracting reasoning traces ([codex thinking]), tool calls ([codex ran]), and token counts. This gives richer output than the -o flag alone: you can see what Codex thought through before its answer.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
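A minimal sketch of the JSONL parsing described in this commit, written in Python for illustration. The event shape used here (a `type` field with the values `reasoning`, `command`, and `usage`, plus `text`, `command`, and `tokens` fields) is hypothetical; the actual Codex CLI event schema is not shown in this PR and should be checked against real --json output.

```python
import json


def summarize_events(jsonl_text):
    """Turn JSONL event lines into display lines plus a token total.

    Assumed (hypothetical) event shapes:
      {"type": "reasoning", "text": "..."}
      {"type": "command", "command": "..."}
      {"type": "usage", "tokens": 123}
    """
    lines = []
    total_tokens = 0
    for raw in jsonl_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip non-JSON noise instead of failing the whole run
        if not isinstance(event, dict):
            continue
        kind = event.get("type")
        if kind == "reasoning":
            lines.append(f"[codex thinking] {event.get('text', '')}")
        elif kind == "command":
            lines.append(f"[codex ran] {event.get('command', '')}")
        elif kind == "usage":
            total_tokens += int(event.get("tokens", 0))
    return lines, total_tokens
```

Skipping unparseable lines keeps the summary usable even if the stream mixes JSON events with plain log output.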
Don't write a codex-review entry to reviews.jsonl when only the adversarial challenge (option B) was selected. There is no gate verdict to record, and a false entry misleads the Review Readiness Dashboard into thinking a code review happened.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
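The guard described in this commit can be sketched as follows. This is an illustration in Python, not the skill's actual template logic; the function name, the `modes_run` set, and the entry fields (`skill`, `gate`) are hypothetical.

```python
import json


def maybe_log_review(log_path, modes_run, verdict):
    """Append a codex-review entry only if review mode actually ran.

    modes_run: set of mode names, e.g. {"review", "challenge"} (hypothetical).
    verdict: gate verdict string, or None when no gate was evaluated.
    Returns True if an entry was written.
    """
    if "review" not in modes_run or verdict is None:
        return False  # challenge-only run: no gate verdict to record
    entry = {"skill": "codex-review", "gate": verdict}  # hypothetical fields
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return True
```

Returning a boolean lets the caller report whether the dashboard log was touched, which makes the challenge-only case easy to test.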
After scope challenge (Step 0), offer to have Codex independently review the plan with a brutally honest tech reviewer persona.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ng, stderr
- plan-eng-review: Codex now reads the plan file itself instead of inlining content as a CLI arg (avoids ARG_MAX for large plans)
- review: add missing echo to persist codex-review results to reviews.jsonl
- codex: consult mode uses $TMPERR (mktemp) instead of hardcoded stderr path
- codex + review: quote $SLUG/$BRANCH_SLUG in review log paths
- codex: scope plan lookup to current project, warn on cross-project fallback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codex consult mode stores session IDs in .context/codex-session-id. Without this ignore rule, session IDs could leak into commits.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
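For context, session continuity of this kind can be sketched as a small read/write helper around the file named above. This is a Python illustration only; the helper names and the `root` parameter are hypothetical, and the skill itself may handle the file differently.

```python
from pathlib import Path

# Relative path taken from the commit message above.
SESSION_FILE = Path(".context") / "codex-session-id"


def load_session_id(root="."):
    """Return the stored session ID, or None if there is none yet."""
    path = Path(root) / SESSION_FILE
    if not path.is_file():
        return None
    value = path.read_text(encoding="utf-8").strip()
    return value or None


def save_session_id(session_id, root="."):
    """Persist the session ID so a later consult can resume it."""
    path = Path(root) / SESSION_FILE
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(session_id + "\n", encoding="utf-8")
```

Because the file lives inside the working tree, it is exactly the kind of local state that needs an ignore rule to stay out of commits.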
- Preamble reads proactive config via gstack-config
- Root SKILL.md.tmpl has lifecycle map (stage → skill suggestion)
- Users can opt out ("stop suggesting") / opt in ("be proactive again")
- Restored trigger phrase validation tests (16 skills × "Use when" check)
- Added missing "Use when" trigger phrases to /debug and /office-hours
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
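The trigger phrase validation mentioned in this commit amounts to checking every skill description for the phrase "Use when". A minimal Python sketch, assuming descriptions are available as a name-to-text mapping (the function name and the sample data are hypothetical; the repo's real tests are not shown here):

```python
def missing_trigger_phrase(descriptions, phrase="Use when"):
    """Return sorted skill names whose description lacks the phrase."""
    return sorted(
        name for name, text in descriptions.items() if phrase not in text
    )


# Hypothetical sample data for illustration only.
sample = {
    "/codex": "Second opinion from Codex. Use when you want another review.",
    "/debug": "Systematic debugging.",
}
print(missing_trigger_phrase(sample))
```

A test suite can then assert that the returned list is empty, which fails loudly and names the offending skills when a description drifts.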
…lls)
Merged main which added /careful, /freeze, /guard, /unfreeze skills, analytics tracking, proactive suggest phrases, and dirty-tree handling. Resolved conflicts by keeping both sides: codex + new safety skills in template list, deduplicated proactive config in preamble, merged trigger phrase tests with proactive phrase tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- /codex skill with three modes: review (diff review with pass/fail gate), challenge (adversarial, tries to break your code), and consult (ask anything with session continuity)
- When both /review and /codex review run, shows which findings overlap and which are unique to each AI
- /review, /ship, /plan-eng-review: Codex second opinion offered after Claude's own review, optional gate in ship, plan critique before eng review
Pre-Landing Review
No issues found. All changes are SKILL.md templates, test files, gen-skill-docs.ts, .gitignore, and generated SKILL.md files.
Test plan
🤖 Generated with Claude Code