Agent skills for operating SuperPlane — the open source DevOps control plane for event-driven workflows.
npx skills add superplanehq/skills

Or install a specific skill:

npx skills add superplanehq/skills --skill superplane-cli

| Skill | Description |
|---|---|
| superplane-cli | Operate SuperPlane via CLI — auth, canvases, secrets, runs |
| superplane-canvas-builder | Design workflow canvases from requirements |
| superplane-monitor | Debug and inspect workflow executions |
Regression tests for the skills. Each eval spawns a real Claude Code session with a skill loaded, gives it a task, and asserts on the bash commands, canvas YAML, and response text that Claude produces.
export ANTHROPIC_API_KEY=sk-ant-...
make evals # all 15 cases
make evals CASES=push_to_slack # one case
make evals SKILL=superplane-cli # all cases for one skill
make evals.list # list case names without running

That's it. make evals builds the eval image, boots a fresh superplane-demo container alongside it on an internal Docker network, runs the cases, then tears the whole stack down.
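
Because the evals need ANTHROPIC_API_KEY to be exported, a small pre-flight check can fail fast before the image is built. The sketch below is only an illustration: check_api_key is a hypothetical helper, not something this repository ships.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (an assumption, not part of the repo's Makefile).
# Returns non-zero with a message on stderr when ANTHROPIC_API_KEY is empty or unset.
check_api_key() {
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set; export it before running make evals" >&2
    return 1
  fi
}

# Example usage (left commented out so sourcing this file has no side effects):
# check_api_key && make evals CASES=push_to_slack
```

Sourcing the file and chaining the check with && keeps the make targets themselves unchanged.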
EVAL_MODEL=claude-sonnet-4-5 make evals # different model (default: claude-haiku-4-5)
EVAL_REPORT_FORMAT=markdown make evals # markdown table only on stdout
EVAL_REPORT_FORMAT=text make evals # superplane-style text report only
make evals.shell # bash inside the eval container for debugging
make evals.down # nuke leftover stack

After every run:
- evals/reports/<run_id>/report.md: per-case detail + summary table (open in any markdown viewer)
- evals/reports/<run_id>/summary.json: machine-readable totals + per-case stats
- evals/reports/<run_id>/<case_name>.json: full per-case detail (bash commands, files written, response text)
- tmp/evals/<run_id>-NN-<case_name>.log: timestamped event log per case
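
To jump straight to the output of the most recent run, the report layout above can be wrapped in a couple of shell functions. This is a minimal sketch: latest_run_dir and report_paths are hypothetical names, and it assumes the newest run can be identified by directory modification time, which the source does not state.

```shell
#!/usr/bin/env bash
# Hypothetical helpers built on the documented evals/reports/<run_id>/ layout.

# Print the most recently modified run directory under a reports root.
# Assumption: "latest run" == newest mtime; run_id format is not relied on.
latest_run_dir() {
  local reports_root="${1:-evals/reports}"
  # ls -td sorts by modification time, newest first; prints nothing if no runs exist.
  ls -td "$reports_root"/*/ 2>/dev/null | head -n 1
}

# Print the report.md and summary.json paths for a given run directory.
report_paths() {
  local run_dir="${1%/}"   # strip one trailing slash, if present
  echo "$run_dir/report.md"
  echo "$run_dir/summary.json"
}

# Example usage (commented out so sourcing this file has no side effects):
# report_paths "$(latest_run_dir)"
```

The functions only compute paths; they do not parse summary.json, since its schema is not described here.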