explore(agent-wiki): trajectory-derived wiki — skills, builder, experiments #268
vinodmut wants to merge 4 commits into
Adds explorations/agent-wiki/ — the agent-wiki skill family, builder, design
+ schema docs, the wiki-helps experiment reports, and benchmark-derived
example wikis, all under one tree suitable for a public PR.
Contents:
- skills/: 7 agent-wiki skills + build_agent_wiki.py (reference copy, not plugin-wired)
- docs/: design.md + schema.md
- experiments/: RESULTS-SUMMARY + twobatch comparison reports + pruned-index-hypothesis; metrics/ rollups (no raw transcripts); harness/ runner + compare scripts
- wikis/: wiki-terminalbench-bob + the twobatch arms (base / skills / both / pruned-corrected)
Public-safety scrub:
- Excluded all raw per-trial sandbox transcripts (kept only metric
rollups + narrative reports).
- Excluded wikis built from internal corpora (procedural-design,
consult-meta, iterative, retroactive, simple-claude, test-paired,
claude) and the build-pattern comparison that ran on them; §3-4 of
RESULTS-SUMMARY reduced to a portable-finding note.
- Rewrote all source-path frontmatter to the generic
trajectories/<session-id>.json form; genericized internal example
names and the benchmark-data dir convention in skills/docs.
- Leak gate (benchmark-data / internal corpus + wiki names / org paths)
passes with zero hits across the tree.
Branched off main; diff touches only explorations/agent-wiki/. Builder
catalog + comparison scripts verified runnable from the new location.
Removes the terminal-bench example wiki from the exploration. Repoints the
README reading-order + layout to wiki-twobatch-skills, fixes the docs that
attributed worked examples to it (schema.md now points at the wiki-twobatch
arms; example index rows retagged), and corrects stale relative links the
docs carried from the original tree (../plugin-source → ../skills,
../WIKIS.md removed, ../experiments/wiki-build-comparison.md → RESULTS-SUMMARY
§3–4, design.md/schema.md cross-links to renamed filenames). Skill example
paths (consult, ingest) repointed off the removed wiki.
Remaining wikis: wiki-twobatch {base, skills, both, pruned}. All intra-doc
relative links resolve; leak gate clean.
📝 Walkthrough
This PR introduces the agent-wiki exploration framework: a system for distilling agent trajectories into a reusable, evidence-grounded knowledge base comprising episodic summaries, atomic guidelines, consolidated clusters, executable skills, and cross-task comparisons. It includes comprehensive design/schema documentation, multi-arm experimental validation across 17 tasks and 8 experiments, and operational skills guiding end-to-end ingestion and querying workflows, plus configuration updates excluding the exploration from lint/scan tooling.
Changes
- Agent-Wiki Framework Design & Schema
- Experimental Validation & Result Analysis
- Operational Skills & Configuration
- Repository Tooling & Configuration
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
CI (ruff, mypy, detect-secrets) was scanning explorations/agent-wiki/ as project source — the first content under explorations/ to carry .py files and high-entropy identifiers. Fixes, scoped so generated example artifacts are treated like the already-excluded plugin-source/ and examples/ trees:
- ruff: lint + format fixes in the harness scripts + builder; exclude the generated wiki scripts (explorations/agent-wiki/wikis/) via extend-exclude.
- mypy: add explorations/agent-wiki/wikis/ to exclude; add file-local `# mypy: ignore-errors` to the exploration harness + the builder (a verbatim copy of the mypy-excluded plugin-source/ original).
- detect-secrets: exclude explorations/agent-wiki/ in the pre-commit hook and .secrets.baseline — the 53 findings are 12-hex guideline content hashes and session-id UUIDs, not secrets.
No example-wiki content changed (scripts keep their original names).
Fixes failing CI checks: check-formatting, check-linting, check-typing, tekton/pr-code-checks/code-detect-secrets.
Drops explorations/agent-wiki/wikis/ (253 generated files, ~10k lines) from this PR so the diff is the reviewable surface — skills, builder, docs, and the experiment reports/harness (~34 files). The example wikis are machine-generated output; bundling them buried the code and appears to have made CodeRabbit skip deep review (summary only, zero inline findings). The wikis land in a stacked follow-up PR.
README/docs still reference wikis/wiki-twobatch-* by path; those links resolve once the follow-up merges. Root-config excludes (ruff/mypy/detect-secrets) are kept — the detect-secrets exclude still covers example content hashes in docs/schema.md, and the wiki excludes become live again when the follow-up lands.
Split the generated example wikis into a companion PR #269 (merge after this one) so this diff stays focused on the reviewable code — builder, skills, docs, and experiment harness (34 files vs the original 287). This should let CodeRabbit review the code properly.
@coderabbitai review
✅ Action performed
Review finished.
Actionable comments posted: 7
🧹 Nitpick comments (13)
explorations/agent-wiki/skills/agent-wiki-synthesize-skill/SKILL.md (1)
205-213: ⚡ Quick win
Add language specifier to fenced code block.
The directory structure example should use `text` or a similar language identifier for consistency.
📝 Suggested fix
-```
+```text
 <wiki>/skills/
 ├── _id_index.json skill slug → relpath
 ├── index.md alphabetical listing (auto-generated)
Source: Linters/SAST tools

explorations/agent-wiki/skills/agent-wiki-consult/SKILL.md (2)
53-55: ⚡ Quick win
Add language specifier to fenced code block.
The code block should specify `bash` as the language for proper syntax highlighting and consistency with the rest of the documentation.
📝 Suggested fix
-```
+```bash
 Read <wiki-root>/AGENTS.md
Source: Linters/SAST tools
72-74: ⚡ Quick win
Add language specifier to fenced code block.
The code block should specify `bash` as the language for proper syntax highlighting and consistency.
📝 Suggested fix
-```
+```bash
 Read <wiki-root>/_index.jsonl
Source: Linters/SAST tools

explorations/agent-wiki/skills/agent-wiki-tasks/SKILL.md (1)
43-45: ⚡ Quick win
Add language specifier to fenced code block.
The code block should specify `bash` as the language for consistency.
📝 Suggested fix
-```
+```bash
 Read /tmp/summaries.json
Source: Linters/SAST tools

explorations/agent-wiki/skills/agent-wiki-consolidate-guidelines/SKILL.md (1)
43-45: ⚡ Quick win
Add language specifier to fenced code block.
The code block should specify `bash` as the language for consistency.
📝 Suggested fix
-```
+```bash
 Read /tmp/guidelines.json
Source: Linters/SAST tools

explorations/agent-wiki/skills/agent-wiki-ingest/SKILL.md (1)
26-35: ⚡ Quick win
Add language specifier to fenced code block.
The pipeline diagram should use `text` or a similar language identifier for consistency.
📝 Suggested fix
-```
+```text
 0. Convert raw bob / claude traces → normalized analysis JSON (skip if already normalized)
 1. Bootstrap create wiki scaffold + seed catalog (skip if wiki exists)
Source: Linters/SAST tools

explorations/agent-wiki/skills/scripts/_default_agents.md (3)
35-55: 💤 Low value
Add language specifier to fenced code block.
The directory structure code block should specify a language (e.g., `text`) to satisfy markdown linting best practices.
📝 Proposed fix
-```
+```text
 <wiki-root>/
 ├── AGENTS.md ← this file
Source: Linters/SAST tools
123-131: 💤 Low value
Add language specifier to fenced code block.
The provenance chain code block should specify a language (e.g., `text`) to satisfy markdown linting best practices.
📝 Proposed fix
-```
+```text
 guideline.md
 ↓ frontmatter `related_summary:`
Source: Linters/SAST tools
151-152: ⚡ Quick win
Clarify the guideline reference format in the example.
Line 152 shows `474bb2ba1076` as a guideline reference, but according to the structure at line 46, atomic guideline files follow the pattern `<slug>__<gid>.md`. The example should either show the full filename or clarify that this is just the ID portion from the `_index.jsonl`.
📝 Suggested clarification
-  - `474bb2ba1076` "Promote a feature mode to a top-level flag, not an
-    option" (atomic; tags include `cli`, `ux`, `workspace`).
+  - `474bb2ba1076` (id) "Promote a feature mode to a top-level flag, not an
+    option" (atomic; tags include `cli`, `ux`, `workspace`;
+    file: `guidelines/promote-mode-to-flag__474bb2ba1076.md`).
Or simply add a note that IDs shown are from `_index.jsonl` and map to full filenames via the pattern.

explorations/agent-wiki/experiments/harness/twobatch_compare.py (2)
232-249: 💤 Low value
Move `TASK_IDS_ORDER` constant to module top for conventional readability.
The constant is defined at line 232 but used in `main()` at line 192. This works because Python looks up module-level names when `main()` runs, not when it is defined, but conventional practice places module-level constants immediately after imports. This improves readability and makes it clear which values are configurable.
Suggested placement: move lines 232-249 to appear immediately after line 21 (after the `REPO` definition and before `FAMILY`).
23-79: ⚡ Quick win
Consider extracting shared constants and utilities to reduce duplication.
Both `twobatch_compare.py` and `threeway_compare.py` duplicate the `FAMILY` dict, the `TASK_IDS_ORDER` list, and helper functions (`median`/`mean`, `fmt`, `delta`/`delta_str`, `acc`). For exploration code, self-contained scripts may be intentional, but if these tools will be maintained or extended, consolidating ~80 lines of shared logic into a `comparison_utils.py` module would reduce drift and simplify updates.

explorations/agent-wiki/experiments/harness/fourway_compare.py (1)
20-20: ⚡ Quick win
Remove unused `REPO` variable in both comparison scripts.
Both `fourway_compare.py` and `fiveway_compare.py` define `REPO = Path(__file__).resolve().parents[1]` but never reference it. This appears to be copy-paste boilerplate that can be removed.

explorations/agent-wiki/experiments/RESULTS-SUMMARY.md (1)
408-432: 💤 Low value
Add language identifier to fenced code block.
The file-map code fence should specify `text` or leave it blank explicitly. As per static analysis, fenced code blocks should have a language specified.
🔧 Proposed fix
-```
+```text
 explorations/agent-wiki/experiments/
 ├── RESULTS-SUMMARY.md this file
Source: Linters/SAST tools
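Most of the nitpicks above are the same finding: an opening code fence with no language tag. As a rough sketch of how such fences could be located before pushing (this is not the linter CodeRabbit ran, and the function name `unlabeled_fences` is hypothetical), a small scanner might look like this:

```python
import re

# Matches a fence line: optional indent, 3+ backticks or tildes, then the info string.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})(.*)$")


def unlabeled_fences(markdown):
    """Return 1-based line numbers of opening fences that carry no language tag."""
    missing = []
    open_fence = None  # marker that opened the block we are currently inside
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_fence is None:
            # Opening fence: record it if the info string is empty.
            open_fence = marker
            if not info:
                missing.append(lineno)
        elif (
            marker[0] == open_fence[0]
            and len(marker) >= len(open_fence)
            and not info
        ):
            # Closing fence: same character, at least as long, no info string.
            open_fence = None
    return missing
```

Running it over each SKILL.md and printing the returned line numbers would give a quick local check; it deliberately ignores indented code blocks and fences nested inside block quotes.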
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py`:
- Line 47: REPO_ROOT is set using Path(__file__).resolve().parents[2] which
points to explorations/agent-wiki/ (too shallow); change the parent index to
parents[4] so REPO_ROOT points to the repository root; update the expression
Path(__file__).resolve().parents[2] to Path(__file__).resolve().parents[4]
(symbol: REPO_ROOT) so all derived paths that use REPO_ROOT (used later on lines
referencing the same variable) resolve correctly.
- Around line 425-426: The median calculation using median = durs[n // 2] is
incorrect for even-length lists; update the logic after computing durs and n so
that you compute mid = n // 2 and set median = durs[mid] when n is odd,
otherwise set median = (durs[mid - 1] + durs[mid]) / 2.0 to return the average
of the two middle values (ensure the result is a float); reference the existing
variables durs, n, median and rows to locate and replace the incorrect line.
- Line 310: The tasks_file path is pointing to the wrong location; update the
tasks_file assignment (the variable named tasks_file which currently uses
REPO_ROOT / "tests" / "e2e" / "wiki_consult_tasks.yaml") to the actual file
location by constructing the path as REPO_ROOT / "explorations" / "agent-wiki" /
"experiments" / "harness" / "wiki_consult_tasks.yaml" so the script can load the
correct YAML; ensure you use the same Path joining style used elsewhere with
REPO_ROOT.
- Around line 114-120: The helper _seed_format_group currently does a local
import "from _format_samples import seed_into" but there is no _format_samples
module in the harness directory, causing runtime import errors; fix by providing
a real implementation or correct import: either add a new _format_samples.py
next to experiment_wiki_consult.py that implements seed_into(ws: Path, group:
str) -> list[str], or change the import in _seed_format_group to point to the
existing module that provides seed_into (or inline the seed_into logic into
_seed_format_group) so that the function _seed_format_group calls a valid
seed_into symbol at runtime.
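The median finding above can be illustrated in isolation. This is a minimal sketch, assuming the harness holds durations in a list; the names `durs`, `n`, and `mid` follow the review comment, while the function wrapper and the sorting step are assumptions, not the harness's actual code:

```python
def median(values):
    """Median of a non-empty list, averaging the two middle values when n is even."""
    durs = sorted(values)  # assumption: the input is not guaranteed to be sorted
    n = len(durs)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return float(durs[mid])
    # Even length: durs[n // 2] alone would return the upper middle value.
    return (durs[mid - 1] + durs[mid]) / 2.0
```

The standard library's `statistics.median` implements the same rule, so swapping the hand-rolled indexing for that call is another option.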
In `@explorations/agent-wiki/experiments/RESULTS-SUMMARY.md`:
- Line 368: Replace the typo "byes" with "bytes" in the sentence fragment that
currently reads "it reads MORE byes (cache-creation..." in RESULTS-SUMMARY.md so
the sentence reads "it reads MORE bytes (cache-creation..."; search for the
exact phrase "it reads MORE byes" to locate the spot and update only that word.
In `@explorations/agent-wiki/skills/scripts/_default_agents.md`:
- Around line 167-171: Update the bootstrap command path in the markdown so it
points to the correct script location: replace occurrences of
"plugin-source/skills/agent-wiki/scripts/build_agent_wiki.py" with
"explorations/agent-wiki/skills/scripts/build_agent_wiki.py" in the AGENTS.md
bootstrap instructions (the sentence containing the command `uv run python ...
build_agent_wiki.py --wiki-root <wiki-root> catalog`) so the documented command
matches the actual repository layout.
- Around line 28-29: The Structure section in _default_agents.md is missing the
generated guidelines/index.md entry; update the Structure listing to include
guidelines/index.md alongside the existing guidelines/<slug>__<gid>.md and
guidelines/<slug>__cluster.md entries so the docs match what
explorations/agent-wiki/skills/scripts/build_agent_wiki.py generates; edit the
"Structure" block in _default_agents.md to add a line for guidelines/index.md
and ensure formatting/ordering is consistent with the other guideline entries.
---
Nitpick comments:
In `@explorations/agent-wiki/experiments/harness/fourway_compare.py`:
- Line 20: Remove the unused REPO variable declaration (REPO =
Path(__file__).resolve().parents[1]) from the comparison scripts; locate the
top-level REPO assignment in fourway_compare.py and fiveway_compare.py and
delete that line to eliminate dead/copy-pasted boilerplate, ensuring no other
code references REPO afterward.
In `@explorations/agent-wiki/experiments/harness/twobatch_compare.py`:
- Around line 232-249: TASK_IDS_ORDER is defined far down the file but used in
main(); move the TASK_IDS_ORDER constant up to the top-level constants area
(immediately after the existing REPO constant and before FAMILY) so it’s
declared with other module-level configuration. Update its placement only — keep
the exact name TASK_IDS_ORDER and do not change its contents or usages (e.g.,
references inside main()) to restore conventional readability.
- Around line 23-79: Extract the duplicated constants and helper functions into
a new module (e.g., comparison_utils.py): move FAMILY and TASK_IDS_ORDER plus
median_or_none, mean_or_none, fmt, delta_str (and any other shared helpers like
acc/delta) into that module, update twobatch_compare.py and threeway_compare.py
to import those symbols instead of redefining them, and remove the duplicate
definitions from both files so they reference the shared implementations.
In `@explorations/agent-wiki/experiments/RESULTS-SUMMARY.md`:
- Around line 408-432: The fenced directory tree at the top of
RESULTS-SUMMARY.md is missing a language identifier; update the opening code
fence for the file-map block (the triple-backticks that surround the
explorations/agent-wiki/experiments/ tree) to include a language identifier
(e.g., change ``` to ```text) so static analysis recognizes the block language.
In `@explorations/agent-wiki/skills/agent-wiki-consolidate-guidelines/SKILL.md`:
- Around line 43-45: The fenced code block containing the line "Read
/tmp/guidelines.json" in SKILL.md should include a language specifier; update
that code fence to use "bash" (i.e., change the opening ``` to ```bash) so the
block reads as a bash snippet for consistency and proper syntax highlighting.
In `@explorations/agent-wiki/skills/agent-wiki-consult/SKILL.md`:
- Around line 53-55: Update the fenced code block containing "Read
<wiki-root>/AGENTS.md" in SKILL.md to include a language specifier;
specifically, change the backticks that start the block to ```bash so the
snippet is ```bash Read <wiki-root>/AGENTS.md ``` which enables Bash syntax
highlighting and keeps formatting consistent with other docs.
- Around line 72-74: Add the language specifier "bash" to the fenced code block
that contains the line "Read <wiki-root>/_index.jsonl" in SKILL.md so the block
reads as a bash snippet; locate the triple-backtick fence surrounding that line
and change it from ``` to ```bash to enable proper syntax highlighting and
consistency.
In `@explorations/agent-wiki/skills/agent-wiki-ingest/SKILL.md`:
- Around line 26-35: The fenced code block that lists the pipeline steps
(starting with "0. Convert raw bob / claude traces → normalized analysis
JSON") in SKILL.md is missing a language specifier; update the opening fence
from ``` to ```text (or another plain language like `text`) so the block reads
```text and preserves formatting/consistency across renderers.
In `@explorations/agent-wiki/skills/agent-wiki-synthesize-skill/SKILL.md`:
- Around line 205-213: Update the fenced code block in SKILL.md that shows the
directory tree for "<wiki>/skills/" to include a language specifier (e.g.,
change the opening ``` to ```text) so the block is marked as plain text; locate
the block in the SKILL.md content that begins with the three backticks followed
by the tree and replace the opening fence accordingly to ensure consistent
formatting.
In `@explorations/agent-wiki/skills/agent-wiki-tasks/SKILL.md`:
- Around line 43-45: The fenced code block containing the line "Read
/tmp/summaries.json" is missing a language specifier; update the markdown in
SKILL.md by changing the opening fence from ``` to ```bash so the block reads as
a bash code block (i.e., use ```bash before the line and keep the closing ```),
ensuring consistency with other fenced blocks.
In `@explorations/agent-wiki/skills/scripts/_default_agents.md`:
- Around line 35-55: The fenced directory-structure block at the top of
AGENTS.md is missing a language tag; update the opening fence from ``` to
```text so the block is marked as plain text (modify the code block that begins
with the tree under <wiki-root>/ in AGENTS.md).
- Around line 123-131: The fenced provenance chain block in _default_agents.md
lacks a language specifier which fails markdown linting; update the opening
triple-backtick for that block to include a language (for example change ``` to
```text) so the block becomes a labeled text code fence, ensuring the block
around the lines starting with "guideline.md" and the following arrows is
annotated (look for the triple-backtick that opens that specific provenance
chain block).
- Around line 151-152: Update the example in _default_agents.md so the guideline
reference is unambiguous: replace or augment the bare ID `474bb2ba1076` with the
full atomic filename pattern `<slug>__474bb2ba1076.md` (or show the actual
example full filename), or add a short note stating that IDs shown (like
`474bb2ba1076`) are the gid field from `_index.jsonl` and map to atomic files
via the `<slug>__<gid>.md` convention; ensure this change is applied where the
example appears so readers can map the ID to the full guideline filename.
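To make the ID-to-filename mapping concrete, here is a hedged sketch of resolving a gid to its atomic guideline file through the `<slug>__<gid>.md` convention. The helper name `guideline_path` is hypothetical, and it assumes atomic guidelines sit directly under `guidelines/` in the wiki root, as the review's structure listing suggests:

```python
from pathlib import Path


def guideline_path(wiki_root, gid):
    """Return the atomic guideline file for gid, or None if absent or ambiguous."""
    # Cluster files end in __cluster.md, so they only match if gid is "cluster".
    matches = sorted(Path(wiki_root, "guidelines").glob(f"*__{gid}.md"))
    return matches[0] if len(matches) == 1 else None
```

Returning `None` on multiple matches is a design choice for the sketch; the builder may well treat a duplicated gid as an error instead.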
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 03d507b1-88b4-4d95-926c-e74bcaf7cd25
📒 Files selected for processing (34)
- .pre-commit-config.yaml
- .secrets.baseline
- explorations/agent-wiki/README.md
- explorations/agent-wiki/docs/design.md
- explorations/agent-wiki/docs/schema.md
- explorations/agent-wiki/experiments/RESULTS-SUMMARY.md
- explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py
- explorations/agent-wiki/experiments/harness/extract_trial_metrics.py
- explorations/agent-wiki/experiments/harness/fiveway_compare.py
- explorations/agent-wiki/experiments/harness/fourway_compare.py
- explorations/agent-wiki/experiments/harness/normalize_stream_json_transcripts.py
- explorations/agent-wiki/experiments/harness/threeway_compare.py
- explorations/agent-wiki/experiments/harness/twobatch_compare.py
- explorations/agent-wiki/experiments/harness/wiki_consult_tasks.yaml
- explorations/agent-wiki/experiments/metrics/pruned-fixed-9atomic.metrics.jsonl
- explorations/agent-wiki/experiments/metrics/twobatch-both.metrics.jsonl
- explorations/agent-wiki/experiments/metrics/twobatch-skills.metrics.jsonl
- explorations/agent-wiki/experiments/metrics/twobatch.metrics.jsonl
- explorations/agent-wiki/experiments/pruned-index-hypothesis.md
- explorations/agent-wiki/experiments/twobatch-comparison.md
- explorations/agent-wiki/experiments/twobatch-fiveway-comparison.md
- explorations/agent-wiki/experiments/twobatch-fourway-comparison.md
- explorations/agent-wiki/experiments/twobatch-skills-comparison.md
- explorations/agent-wiki/skills/agent-wiki-consolidate-guidelines/SKILL.md
- explorations/agent-wiki/skills/agent-wiki-consult/SKILL.md
- explorations/agent-wiki/skills/agent-wiki-extract-guidelines/SKILL.md
- explorations/agent-wiki/skills/agent-wiki-ingest/SKILL.md
- explorations/agent-wiki/skills/agent-wiki-summarize/SKILL.md
- explorations/agent-wiki/skills/agent-wiki-synthesize-skill/SKILL.md
- explorations/agent-wiki/skills/agent-wiki-tasks/SKILL.md
- explorations/agent-wiki/skills/scripts/_default_agent_wiki_config.yaml
- explorations/agent-wiki/skills/scripts/_default_agents.md
- explorations/agent-wiki/skills/scripts/build_agent_wiki.py
- pyproject.toml
| import yaml | ||
| REPO_ROOT = Path(__file__).resolve().parents[2] |
REPO_ROOT calculation resolves to wrong directory.
The script is located at explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py, which is 4 levels deep from the repository root. Path(__file__).resolve().parents[2] resolves to explorations/agent-wiki/, not the repository root. All derived paths (lines 61, 64, 67) will be incorrect.
🔧 Proposed fix
-REPO_ROOT = Path(__file__).resolve().parents[2]
+REPO_ROOT = Path(__file__).resolve().parents[4]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| REPO_ROOT = Path(__file__).resolve().parents[2] | |
| REPO_ROOT = Path(__file__).resolve().parents[4] |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py` at
line 47, REPO_ROOT is set using Path(__file__).resolve().parents[2] which points
to explorations/agent-wiki/ (too shallow); change the parent index to parents[4]
so REPO_ROOT points to the repository root; update the expression
Path(__file__).resolve().parents[2] to Path(__file__).resolve().parents[4]
(symbol: REPO_ROOT) so all derived paths that use REPO_ROOT (used later on lines
referencing the same variable) resolve correctly.
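To make the index arithmetic concrete, the sketch below walks `parents` for the script path named in the comment. The `/repo` prefix is a made-up stand-in for the checkout root, and `find_repo_root` is a hypothetical helper (not part of the PR) showing a marker-based alternative that does not depend on a hard-coded depth.

```python
from pathlib import Path, PurePosixPath

# Hypothetical absolute location of the script; "/repo" stands in for the checkout root.
SCRIPT = PurePosixPath(
    "/repo/explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py"
)


def ancestor(path: PurePosixPath, depth: int) -> str:
    """Return path.parents[depth] as a string."""
    return str(path.parents[depth])


def find_repo_root(start: Path, marker: str = "pyproject.toml") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found.

    Assumption: the repository root is the nearest ancestor holding the marker file.
    """
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"no {marker} found above {start}")


if __name__ == "__main__":
    # Print each ancestor so the depth of the repository root is visible.
    for depth in range(5):
        print(depth, ancestor(SCRIPT, depth))
```

Index 2 lands on `explorations/agent-wiki` and index 4 on the root, which matches the reviewer's count. The marker search trades that brittleness for a different assumption: a nested project with its own `pyproject.toml` would stop the walk early.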
| def _seed_format_group(ws: Path, group: str) -> list[str]: | ||
| """Seed image/archive/text format samples via the stdlib generators in | ||
| `_format_samples.py`. Group is one of `image-formats`, `archive-formats`, | ||
| `text-formats`.""" | ||
| from _format_samples import seed_into # local import — script lives next door | ||
| return seed_into(ws, group) |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if _format_samples.py exists in the harness directory
fd -t f "_format_samples.py" explorations/agent-wiki/experiments/harness/
Repository: AgentToolkit/altk-evolve
Length of output: 50
🏁 Script executed:
#!/bin/bash
set -euo pipefail
dir="explorations/agent-wiki/experiments/harness"
echo "== harness directory listing =="
ls -la "$dir"
echo
echo "== files matching _format_samples* in harness =="
fd -t f "_format_samples*" "$dir" || true
echo
echo "== any reference to _format_samples in repo (top 50 matches) =="
rg -n --hidden --no-ignore-vcs "_format_samples" . | head -n 50
echo
echo "== any definition of seed_into in repo (top 50 matches) =="
rg -n --hidden --no-ignore-vcs "def seed_into" . | head -n 50
Repository: AgentToolkit/altk-evolve
Length of output: 1295
🌐 Web query:
Search for _format_samples.py and seed_into in the AgentToolkit/altk-evolve repository if available in public docs (or confirm expected module location for harness format seeding).
💡 Result:
The altk-evolve framework, managed by the AgentToolkit, does not publicly expose a module or function named _format_samples.py or seed_into in its main documentation or repository structure. Based on official documentation for altk-evolve: 1. Entity Management: Evolve manages knowledge (tips/guidelines) either via an MCP server with Milvus vector storage or as Markdown files with YAML frontmatter in .evolve/entities/ (specifically in the evolve-lite integration) [1][2]. 2. Trajectory Seeding and Guidelines: The system uses tools such as save_trajectory to extract insights and generate guidelines from execution traces [1][3]. The generation process is handled automatically through evolve sync (for observability traces) or via explicit tool calls (like /evolve-lite:learn) [2][4]. 3. Provenance: Every guideline created is assigned metadata, including creation_mode (e.g., auto-phoenix, auto-mcp, or manual) and source_task_id, rather than being injected via a seed_into function [1][4]. It is possible that _format_samples.py or seed_into are internal implementation details or components of a private or older version of the toolkit, but they are not part of the documented public API for harness format seeding or knowledge base initialization in the current stable version [3][5]. If you are working with a specific development branch, you may want to check the src/ or internal/ directories directly in the local repository clone, as these are not exposed in the high-level API docs [1].
Citations:
- 1: https://github.com/AgentToolkit/altk-evolve
- 2: https://github.com/AgentToolkit/altk-evolve/blob/main/docs/integrations/claude/evolve-lite.md
- 3: https://pypi.org/project/altk-evolve/
- 4: https://github.com/AgentToolkit/altk-evolve/blob/main/docs/guides/low-code-tracing.md
- 5: https://agenttoolkit.github.io/altk-evolve/installation/
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== fd any _format_samples.py anywhere in repo =="
fd -t f "_format_samples.py" . || true
echo
echo "== fd any _format_samples.* anywhere in repo (loose) =="
fd -t f "_format_samples.*" . || true
echo
echo "== rg references to 'seed_into(' across repo (top 50) =="
rg -n --hidden --no-ignore-vcs "seed_into\s*\(" . | head -n 50 || true
echo
echo "== rg references to '_format_samples' across repo (top 50) =="
rg -n --hidden --no-ignore-vcs "_format_samples" . | head -n 50 || true
Repository: AgentToolkit/altk-evolve
Length of output: 719
🏁 Script executed:
#!/bin/bash
set -euo pipefail
f="explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py"
y="explorations/agent-wiki/experiments/harness/wiki_consult_tasks.yaml"
echo "== Python: find make_workspace and seed handling =="
rg -n "def make_workspace|make_workspace|seed" "$f" | head -n 80
echo
echo "== Python: show _seed_format_group area (lines ~90-140) =="
python - <<'PY'
import itertools
path="explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py"
start=90
end=150
with open(path,'r',encoding='utf-8') as fp:
    for i,line in enumerate(fp, start=1):
        if start<=i<=end:
            print(f"{i:4d}: {line.rstrip()}")
        if i>end:
            break
PY
echo
echo "== YAML: show tasks that mention 'image-formats'/'archive-formats'/'text-formats' =="
rg -n "image-formats|archive-formats|text-formats" "$y" || true
echo
echo "== YAML: show full file (<=200 lines) or relevant blocks =="
wc -l "$y"
if [ "$(wc -l < "$y")" -le 200 ]; then
  nl -ba "$y"
else
  # print first and last plus around matches
  nl -ba "$y" | head -n 60
  rg -n "image-formats|archive-formats|text-formats" "$y" -n || true
  nl -ba "$y" | tail -n 60
fi
Repository: AgentToolkit/altk-evolve
Length of output: 4383
🏁 Script executed:
#!/bin/bash
set -euo pipefail
y="explorations/agent-wiki/experiments/harness/wiki_consult_tasks.yaml"
echo "== seed entries =="
rg -n "^\s*seed:\s*(image-formats|archive-formats|text-formats)\s*$" "$y" || true
echo
echo "== total lines containing seed: =="
rg -n "^\s*seed:\s*" "$y" | wc -l
echo
echo "== total task items (lines starting with '-' ) =="
rg -n "^\s*-\s" "$y" | wc -l
echo
echo "== show YAML (for context, whole file <=200 lines) =="
sed -n '1,200p' "$y"
Repository: AgentToolkit/altk-evolve
Length of output: 5179
Fix missing _format_samples import for format seeding
experiment_wiki_consult.py imports seed_into via from _format_samples import seed_into, assuming _format_samples.py sits next to the script. However, _format_samples.py is not present in explorations/agent-wiki/experiments/harness/, and no seed_into implementation exists anywhere else in this repo, so the format-seeding tasks will fail at runtime unless _format_samples is generated or copied in from elsewhere. This blocks 12 of 17 trials: t6-t9, t10-t13, t14-t17.
🤖 Prompt for AI Agents
In `@explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py`
around lines 114 - 120, The helper _seed_format_group currently does a local
import "from _format_samples import seed_into" but there is no _format_samples
module in the harness directory, causing runtime import errors; fix by providing
a real implementation or correct import: either add a new _format_samples.py
next to experiment_wiki_consult.py that implements seed_into(ws: Path, group:
str) -> list[str], or change the import in _seed_format_group to point to the
existing module that provides seed_into (or inline the seed_into logic into
_seed_format_group) so that the function _seed_format_group calls a valid
seed_into symbol at runtime.
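One way to satisfy the import is a small sibling module. The sketch below is hypothetical: only the name `seed_into`, its `(ws, group) -> list[str]` shape, and the three group names come from the review comment. The file names, sample contents, and the choice to return bare file names are assumptions made for illustration. It covers `text-formats` and `archive-formats` with stdlib writers and deliberately leaves `image-formats` unimplemented.

```python
import csv
import io
import json
import tarfile
import zipfile
from pathlib import Path


def _seed_text(ws: Path) -> list[str]:
    """Write a tiny JSON file and a tiny CSV file (illustrative contents)."""
    (ws / "sample.json").write_text(json.dumps({"id": 1, "name": "demo"}), encoding="utf-8")
    with (ws / "sample.csv").open("w", newline="", encoding="utf-8") as fp:
        csv.writer(fp).writerows([["id", "name"], [1, "demo"]])
    return ["sample.json", "sample.csv"]


def _seed_archives(ws: Path) -> list[str]:
    """Write a ZIP and a TAR archive, each holding one small text member."""
    payload = b"hello\n"
    with zipfile.ZipFile(ws / "sample.zip", "w") as zf:
        zf.writestr("hello.txt", payload)
    info = tarfile.TarInfo(name="hello.txt")
    info.size = len(payload)  # tar headers need the member size up front
    with tarfile.open(ws / "sample.tar", "w") as tf:
        tf.addfile(info, io.BytesIO(payload))
    return ["sample.zip", "sample.tar"]


# Hypothetical registry; "image-formats" is intentionally left out of this sketch.
_GENERATORS = {"text-formats": _seed_text, "archive-formats": _seed_archives}


def seed_into(ws: Path, group: str) -> list[str]:
    """Create sample files for `group` inside `ws` and return their names."""
    if group not in _GENERATORS:
        raise ValueError(f"unsupported format group: {group!r}")
    ws.mkdir(parents=True, exist_ok=True)
    return _GENERATORS[group](ws)
```

Raising on an unknown group, rather than returning an empty list, keeps a misspelled `seed:` value in the task YAML from silently producing an empty workspace.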
| return 2 | ||
| # Load tasks (--task may be comma-separated) | ||
| tasks_file = REPO_ROOT / "tests" / "e2e" / "wiki_consult_tasks.yaml" |
tasks_file path points to wrong location.
The code looks for wiki_consult_tasks.yaml at tests/e2e/wiki_consult_tasks.yaml, but the actual file is located at explorations/agent-wiki/experiments/harness/wiki_consult_tasks.yaml. The script will fail when attempting to load tasks.
🔧 Proposed fix
- tasks_file = REPO_ROOT / "tests" / "e2e" / "wiki_consult_tasks.yaml"
+ tasks_file = Path(__file__).parent / "wiki_consult_tasks.yaml"
🤖 Prompt for AI Agents
In `@explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py` at
line 310, The tasks_file path is pointing to the wrong location; update the
tasks_file assignment (the variable named tasks_file which currently uses
REPO_ROOT / "tests" / "e2e" / "wiki_consult_tasks.yaml") to the actual file
location by constructing the path as REPO_ROOT / "explorations" / "agent-wiki" /
"experiments" / "harness" / "wiki_consult_tasks.yaml" so the script can load the
correct YAML; ensure you use the same Path joining style used elsewhere with
REPO_ROOT.
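Resolving the YAML relative to the script itself avoids any dependence on `REPO_ROOT`. A minimal sketch, assuming the tasks file stays next to the harness script; `resolve_tasks_file` is a hypothetical helper name, and the explicit existence check is an addition meant to give a clearer failure than a later YAML load error.

```python
from pathlib import Path

TASKS_FILENAME = "wiki_consult_tasks.yaml"


def resolve_tasks_file(script_path: Path) -> Path:
    """Return the tasks YAML that sits beside `script_path`, or raise if it is absent."""
    tasks_file = Path(script_path).resolve().parent / TASKS_FILENAME
    if not tasks_file.is_file():
        raise FileNotFoundError(f"tasks file not found: {tasks_file}")
    return tasks_file
```

Inside the harness this would be called as `resolve_tasks_file(Path(__file__))`.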
| durs = sorted(r["duration_s"] for r in rows) | ||
| median = durs[n // 2] |
Median calculation is incorrect for even-length trial lists.
The code uses median = durs[n // 2], which picks the upper of the two middle values for even-length lists instead of averaging them. For example, with durations [1, 2, 3, 4], it returns 3 instead of 2.5.
This doesn't affect the default --trials 3 case (odd length), but will produce incorrect medians for even trial counts.
📊 Proposed fix for correct median
durs = sorted(r["duration_s"] for r in rows)
- median = durs[n // 2]
+ median = durs[n // 2] if n % 2 == 1 else (durs[n // 2 - 1] + durs[n // 2]) / 2
md_lines.append(f"| {condition:<10} | {rd}/{n} | {ct}/{n} | {om}/{n} | {median:.0f} |")
📝 Committable suggestion
| durs = sorted(r["duration_s"] for r in rows) | |
| median = durs[n // 2] | |
| durs = sorted(r["duration_s"] for r in rows) | |
| median = durs[n // 2] if n % 2 == 1 else (durs[n // 2 - 1] + durs[n // 2]) / 2 |
🤖 Prompt for AI Agents
In `@explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py`
around lines 425 - 426, The median calculation using median = durs[n // 2] is
incorrect for even-length lists; update the logic after computing durs and n so
that you compute mid = n // 2 and set median = durs[mid] when n is odd,
otherwise set median = (durs[mid - 1] + durs[mid]) / 2.0 to return the average
of the two middle values (ensure the result is a float); reference the existing
variables durs, n, median and rows to locate and replace the incorrect line.
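The standard library already implements this rule, so the even/odd branch need not be written by hand. A minimal sketch, assuming each row is a dict with a numeric `duration_s` key as in the quoted snippet; `median_duration` is a hypothetical helper name.

```python
from statistics import median


def median_duration(rows: list[dict]) -> float:
    """Median of the `duration_s` values.

    statistics.median averages the two middle values when the count is even.
    """
    if not rows:
        raise ValueError("no trials to summarize")
    return float(median(r["duration_s"] for r in rows))
```

Note that the report line formats the value with `{median:.0f}`, so a half-second difference may still round away in the rendered table even when the underlying median is correct.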
| 5. **Cost reduction comes mainly from output tokens and tool-call | ||
| reduction**, not from input-token compression. The agent doesn't read | ||
| *fewer* bytes when it has the wiki — it reads MORE byes (cache-creation |
Fix typo: "byes" should be "bytes".
The sentence reads "it reads MORE byes" but should be "it reads MORE bytes".
📝 Proposed fix
- *fewer* bytes when it has the wiki — it reads MORE byes (cache-creation
+ *fewer* bytes when it has the wiki — it reads MORE bytes (cache-creation
📝 Committable suggestion
| *fewer* bytes when it has the wiki — it reads MORE byes (cache-creation | |
| *fewer* bytes when it has the wiki — it reads MORE bytes (cache-creation |
🧰 Tools
🪛 LanguageTool
[grammar] ~368-~368: Ensure spelling is correct
Context: ...es when it has the wiki — it reads MORE byes (cache-creation on guideline pages g...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🤖 Prompt for AI Agents
In `@explorations/agent-wiki/experiments/RESULTS-SUMMARY.md` at line 368, Replace
the typo "byes" with "bytes" in the sentence fragment that currently reads "it
reads MORE byes (cache-creation..." in RESULTS-SUMMARY.md so the sentence reads
"it reads MORE bytes (cache-creation..."; search for the exact phrase "it reads
MORE byes" to locate the spot and update only that word.
Source: Linters/SAST tools
| clearly outside the wiki's scope (the corpus is finite — see | ||
| `guidelines/index.md` for the topical surface). |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check for guidelines/index.md references and existence
# Search for any mention of guidelines/index.md in the exploration
rg -n "guidelines/index\.md" explorations/agent-wiki/
# Check the actual directory structure in any example wikis
fd -t f "index.md" explorations/agent-wiki/ --exec echo "Found: {}"
Repository: AgentToolkit/altk-evolve
Length of output: 760
Include guidelines/index.md in the “Structure” tree shown in _default_agents.md
explorations/agent-wiki/skills/scripts/build_agent_wiki.py refreshes/generates guidelines/index.md, so the line-29 link isn’t stale. Update the “Structure” listing (around lines 41-54) to add guidelines/index.md alongside the existing guidelines/<slug>__<gid>.md, guidelines/<slug>__cluster.md, etc., so the documented structure matches the generated wiki.
🤖 Prompt for AI Agents
In `@explorations/agent-wiki/skills/scripts/_default_agents.md` around lines 28 -
29, The Structure section in _default_agents.md is missing the generated
guidelines/index.md entry; update the Structure listing to include
guidelines/index.md alongside the existing guidelines/<slug>__<gid>.md and
guidelines/<slug>__cluster.md entries so the docs match what
explorations/agent-wiki/skills/scripts/build_agent_wiki.py generates; edit the
"Structure" block in _default_agents.md to add a line for guidelines/index.md
and ensure formatting/ordering is consistent with the other guideline entries.
| If `AGENTS.md` does not exist in a wiki, run | ||
| `uv run python plugin-source/skills/agent-wiki/scripts/build_agent_wiki.py | ||
| --wiki-root <wiki-root> catalog` — the bootstrap pass copies the template | ||
| in. After bootstrap, this file is yours to edit; subsequent catalog runs | ||
| do not overwrite an existing `AGENTS.md`. |
Fix the incorrect build script path.
Line 168 references plugin-source/skills/agent-wiki/scripts/build_agent_wiki.py, but based on the PR structure, the correct path should be explorations/agent-wiki/skills/scripts/build_agent_wiki.py. This breaks the bootstrapping instructions.
🔧 Proposed fix
If `AGENTS.md` does not exist in a wiki, run
-`uv run python plugin-source/skills/agent-wiki/scripts/build_agent_wiki.py
+`uv run python explorations/agent-wiki/skills/scripts/build_agent_wiki.py
--wiki-root <wiki-root> catalog` — the bootstrap pass copies the template
📝 Committable suggestion
| If `AGENTS.md` does not exist in a wiki, run | |
| `uv run python plugin-source/skills/agent-wiki/scripts/build_agent_wiki.py | |
| --wiki-root <wiki-root> catalog` — the bootstrap pass copies the template | |
| in. After bootstrap, this file is yours to edit; subsequent catalog runs | |
| do not overwrite an existing `AGENTS.md`. | |
| If `AGENTS.md` does not exist in a wiki, run | |
| `uv run python explorations/agent-wiki/skills/scripts/build_agent_wiki.py | |
| --wiki-root <wiki-root> catalog` — the bootstrap pass copies the template | |
| in. After bootstrap, this file is yours to edit; subsequent catalog runs | |
| do not overwrite an existing `AGENTS.md`. |
🤖 Prompt for AI Agents
In `@explorations/agent-wiki/skills/scripts/_default_agents.md` around lines 167 -
171, Update the bootstrap command path in the markdown so it points to the
correct script location: replace occurrences of
"plugin-source/skills/agent-wiki/scripts/build_agent_wiki.py" with
"explorations/agent-wiki/skills/scripts/build_agent_wiki.py" in the AGENTS.md
bootstrap instructions (the sentence containing the command `uv run python ...
build_agent_wiki.py --wiki-root <wiki-root> catalog`) so the documented command
matches the actual repository layout.
Related to #256 — this is a prototype of offline trajectory-mining + consolidation ("dreaming"): reviewing saved trajectories to extract, consolidate, deduplicate, and curate memory outside the main task loop, with an auditable record of what changed.
What this is
An exploration in turning agent trajectories into a reusable, evidence-grounded wiki that future agents consult before acting, plus the experiments measuring whether it helps. Everything lives self-contained under explorations/agent-wiki/.
The core idea: after an agent finishes a task, distill its trajectory into wiki pages (episodic summaries, atomic guidelines, themed cluster pages, and executable skills), each linked back to the trajectory that produced it. A future agent, pointed at the wiki's AGENTS.md, retrieves the pages relevant to its task and applies them instead of re-deriving the recipe.
How this maps to #256 ("dreaming")
- agent-wiki-summarize / -extract-guidelines / -synthesize-skill (retroactive + batch ingest)
- agent-wiki-consolidate-guidelines → cluster pages
- --archive-covered), recall roll-up, priority tiers
- _audit.log + provenance back-links on every page
Layout
Headline findings (experiments/RESULTS-SUMMARY.md)
CLAUDE.md pointer is read 3/3; a soft one 1/3.
Scope / data note
These are benchmark-derived example wikis (a synthetic 16-task file-format corpus). Raw per-trial sandbox transcripts and any wikis built from internal trajectory corpora are intentionally excluded: only metric rollups, narrative reports, and the benchmark-derived wikis are included. Source links in wiki frontmatter are shown in the generic trajectories/<session-id>.json form. The skills are a standalone reference copy, not wired into a plugin loader.
Summary by CodeRabbit
New Features
Documentation
Chores