explore(agent-wiki): example wikis (companion to #268) #269
Draft
vinodmut wants to merge 5 commits into
Adds explorations/agent-wiki/ — the agent-wiki skill family, builder, design
+ schema docs, the wiki-helps experiment reports, and benchmark-derived
example wikis, all under one tree suitable for a public PR.
Contents:
- skills/: 7 agent-wiki skills + build_agent_wiki.py (reference copy, not plugin-wired)
- docs/: design.md + schema.md
- experiments/: RESULTS-SUMMARY + twobatch comparison reports + pruned-index-hypothesis; metrics/ rollups (no raw transcripts); harness/ runner + compare scripts
- wikis/: wiki-terminalbench-bob + the twobatch arms (base / skills / both / pruned-corrected)
Public-safety scrub:
- Excluded all raw per-trial sandbox transcripts (kept only metric
rollups + narrative reports).
- Excluded wikis built from internal corpora (procedural-design,
consult-meta, iterative, retroactive, simple-claude, test-paired,
claude) and the build-pattern comparison that ran on them; §3-4 of
RESULTS-SUMMARY reduced to a portable-finding note.
- Rewrote all source-path frontmatter to the generic
trajectories/<session-id>.json form; genericized internal example
names and the benchmark-data dir convention in skills/docs.
- Leak gate (benchmark-data / internal corpus + wiki names / org paths)
passes with zero hits across the tree.
Branched off main; diff touches only explorations/agent-wiki/. Builder
catalog + comparison scripts verified runnable from the new location.
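
For reviewers who want to picture the leak gate mentioned in the scrub list, here is a minimal sketch of such a check. It is not the gate used for this PR: the function name `find_leaks` and the two patterns are hypothetical placeholders, since the actual term list is not part of this description.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration only; the real gate's term list
# (benchmark-data dir, internal corpus + wiki names, org paths) is not shown here.
LEAK_PATTERNS = [
    re.compile(r"benchmark-data", re.IGNORECASE),
    re.compile(r"/home/[\w.-]+/"),
]


def find_leaks(root, patterns=LEAK_PATTERNS):
    """Return (path, line number, pattern) for every matching line under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            # Skip binary or unreadable files rather than failing the scan.
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            for pattern in patterns:
                if pattern.search(line):
                    hits.append((str(path), lineno, pattern.pattern))
    return hits
```

Under these assumptions, the gate "passes with zero hits" when `find_leaks("explorations/agent-wiki")` returns an empty list.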
Removes the terminal-bench example wiki from the exploration. Repoints the
README reading-order + layout to wiki-twobatch-skills, fixes the docs that
attributed worked examples to it (schema.md now points at the wiki-twobatch
arms; example index rows retagged), and corrects stale relative links the
docs carried from the original tree (../plugin-source → ../skills,
../WIKIS.md removed, ../experiments/wiki-build-comparison.md → RESULTS-SUMMARY
§3–4, design.md/schema.md cross-links to renamed filenames). Skill example
paths (consult, ingest) repointed off the removed wiki.
Remaining wikis: wiki-twobatch {base, skills, both, pruned}. All intra-doc
relative links resolve; leak gate clean.
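
The claim that all intra-doc relative links resolve can be checked mechanically. The sketch below shows one way to do it, assuming the docs use standard Markdown inline links; `broken_relative_links` is a hypothetical helper, not a script shipped in this PR.

```python
import re
from pathlib import Path

# Matches Markdown inline links and captures the target: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Targets with a URL scheme (https:, mailto:, ...) are not relative links.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(root):
    """Return (markdown file, target) for relative links whose file is missing."""
    broken = []
    for md in sorted(Path(root).rglob("*.md")):
        for target in LINK_RE.findall(md.read_text(encoding="utf-8")):
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue  # external URL or same-page anchor
            rel = target.split("#", 1)[0]  # drop any trailing #fragment
            if not (md.parent / rel).exists():
                broken.append((str(md), target))
    return broken
```

With this helper, a stale link such as `../WIKIS.md` would be reported until the link is removed or the file exists.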
CI (ruff, mypy, detect-secrets) was scanning explorations/agent-wiki/ as project source: it is the first content under explorations/ to carry .py files and high-entropy identifiers. Fixes, scoped so generated example artifacts are treated like the already-excluded plugin-source/ and examples/ trees:
- ruff: lint + format fixes in the harness scripts + builder; exclude the generated wiki scripts (explorations/agent-wiki/wikis/) via extend-exclude.
- mypy: add explorations/agent-wiki/wikis/ to exclude; add file-local `# mypy: ignore-errors` to the exploration harness + the builder (a verbatim copy of the mypy-excluded plugin-source/ original).
- detect-secrets: exclude explorations/agent-wiki/ in the pre-commit hook and .secrets.baseline; the 53 findings are 12-hex guideline content hashes and session-id UUIDs, not secrets.
No example-wiki content changed (scripts keep their original names).
Fixes failing CI checks: check-formatting, check-linting, check-typing, tekton/pr-code-checks/code-detect-secrets.
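
As a rough illustration of why the detect-secrets findings can be treated as non-secrets, the sketch below buckets flagged strings by shape. It assumes the hashes are lowercase 12-character hex and the session ids are canonical lowercase UUIDs; `classify_finding` and `summarize` are hypothetical names, and this is not how detect-secrets itself works.

```python
import re
from collections import Counter

# Assumed shapes: 12 lowercase hex chars, or a canonical 8-4-4-4-12 UUID.
HEX12_RE = re.compile(r"[0-9a-f]{12}")
UUID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")


def classify_finding(value):
    """Label a flagged string by shape; anything unrecognized needs a human look."""
    if UUID_RE.fullmatch(value):
        return "session-id-uuid"
    if HEX12_RE.fullmatch(value):
        return "content-hash"
    return "needs-review"


def summarize(values):
    """Count flagged strings per label."""
    return dict(Counter(classify_finding(v) for v in values))
```

A triage pass along these lines would justify a path-level exclude only if nothing lands in the `needs-review` bucket.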
Drops explorations/agent-wiki/wikis/ (253 generated files, ~10k lines) from this PR so the diff is the reviewable surface: skills, builder, docs, and the experiment reports/harness (~34 files). The example wikis are machine-generated output; bundling them buried the code and appears to have made CodeRabbit skip deep review (summary only, zero inline findings). The wikis land in a stacked follow-up PR. README/docs still reference wikis/wiki-twobatch-* by path; those links resolve once the follow-up merges. Root-config excludes (ruff/mypy/detect-secrets) are kept: the detect-secrets exclude still covers example content hashes in docs/schema.md, and the wiki excludes become live again when the follow-up lands.
The four benchmark-derived example wikis built by the agent-wiki skills:
wiki-twobatch {base, skills, both, pruned}. Generated artifacts — each page
is machine-emitted by build_agent_wiki.py from the trajectories, with
provenance back-links shown in the generic trajectories/<session-id>.json
form. Stacked on the code PR (AgentToolkit#268); resolves the wikis/wiki-twobatch-*
references in that PR's README/docs.
Companion to #268 (the agent-wiki code PR) — merge AFTER #268
Adds the four benchmark-derived example wikis built by the agent-wiki skills, split out from #268 so that PR's diff stays focused on reviewable code (builder, skills, docs, experiment harness) instead of ~10k lines of generated output.
These are generated artifacts: every page is emitted by `build_agent_wiki.py` from trajectories, not hand-authored. Provenance back-links appear in the generic `trajectories/<session-id>.json` form. Merging this resolves the `wikis/wiki-twobatch-*` references in #268's README and docs.