feat(kb): hierarchical organization Phase 1 — project MOCs + de-hubbed index — 0.23.0#11
Merged
Brainstormed + web-researched (5-facet parallel sweep: PARA/MOC, graph-hub hygiene, GraphRAG hierarchical communities, taxonomy-vs-tags, auto-maintenance). All facets converged: keep files FLAT (type-folders), make hierarchy a SOFT overlay projected from the edge log — a project: facet + page->page part_of + deterministically-generated project/theme MOC pages + a de-hubbed two-tier index. Structure becomes a pure projection regenerated idempotently each reindex, so the editing skills (extractor/maintainer/dream/reindex/lint) preserve it by keeping edges + facets correct — fix the source, never re-file by hand. Root cause confirmed from the live KB: 461 relates : 9 part_of (hierarchy drowned), flat index.md hubs all 108 pages, no project facet so siblings scatter across type-folders. Decision: one full spec (all phases), project assignment = registry + LLM-fallback (staged). Honors the heterogeneous-main-groups invariant. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
8 bite-sized TDD tasks for the deterministic read-side wins: parseDoc project facet, pure project-MOC builder (>=3 gate), projects/ category, reindex MOC projection, de-hubbed two-tier index (graph:exclude), idempotency guard, one-shot part_of-ancestry backfill, and the build+gate+version task. Phases 2-3 (on-write preservation, lint/drift enforcement) follow as their own plans. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Retargeted from the planned validate test (which would pass vacuously — validate doesn't flag unknown sub-categories) to the real RED-able capability: a generated project-MOC must be PROTECT:category like themes, else FORGET would archive it. Also adds projects to KNOWN_CATEGORIES (created-date inference / type correctness). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dempotent) Tasks 4-6: reindex now collects the project: facet per page, projects one wiki/projects/<slug>.md MOC per project with >= SB_MOC_MIN_MEMBERS (default 3) members (grouped by type, deterministic, FORGET-protected, graph:exclude), and rebuilds index.md as a thin two-tier Home (## Maps of Content -> project/theme MOC links + ## Categories -> per-type counts with plain-text slug rows) marked graph:exclude so a viewer never hubs it. Pure projection: a second reindex is byte-identical modulo the generated timestamp. SB_KB_MOC=off disables. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
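The gated, type-grouped projection described in this commit can be sketched as a pure function. This is an illustration only: the `Member` shape, the `buildProjectMoc` name, the frontmatter layout, and the row format are assumptions, not the contents of `project-moc.ts`.

```typescript
// Hypothetical member shape; the real builder may carry different fields.
interface Member {
  slug: string;
  type: string; // e.g. "concepts", "decisions"
  description: string;
}

// Mirrors the documented default of SB_MOC_MIN_MEMBERS.
const MOC_MIN_MEMBERS = 3;

// Plain code-unit comparison, so ordering does not depend on the host locale.
const bySlug = (a: Member, b: Member): number =>
  a.slug < b.slug ? -1 : a.slug > b.slug ? 1 : 0;

// Pure function: the same members always yield the same markdown, or null
// when the project has fewer members than the gate allows.
function buildProjectMoc(
  project: string,
  members: Member[],
  min: number = MOC_MIN_MEMBERS
): string | null {
  if (members.length < min) return null;

  // Group by type first; sorting happens at render time so that input order
  // never influences the output.
  const byType = new Map<string, Member[]>();
  for (const m of members) {
    const group = byType.get(m.type) || [];
    group.push(m);
    byType.set(m.type, group);
  }

  // Assumed frontmatter: the PR only states "type: projects" and "graph: exclude".
  const lines = ["---", "type: projects", "graph: exclude", "---", `# ${project}`, ""];
  for (const type of Array.from(byType.keys()).sort()) {
    lines.push(`## ${type}`);
    for (const m of (byType.get(type) || []).slice().sort(bySlug)) {
      lines.push(`- [[${m.slug}]]: ${m.description}`);
    }
    lines.push("");
  }
  return lines.join("\n");
}
```

Because nothing in the function reads the clock or the filesystem, calling it twice with the same members in any order returns identical text, which is the property the idempotency guard relies on.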
…versible) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
….4.0 Rebuild bundles (project-moc inlined into knowledge-reindex.bundle), bump plugin + marketplace 0.22.5->0.23.0 lockstep + MCP server 2.3.1->2.4.0, add the 0.23.0 migration row (project: facet + project-MOC projection + de-hubbed two-tier index + projects/ FORGET-protection + one-shot kb-project-backfill seed). Full suite green (61 shell + 281 vitest); validate + migration-row gate pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The deep-review gate caught real bugs in Phase 1; all CRITICAL/HIGH/MEDIUM fixed: - #1/#2 de-hub + idempotency: graph-project.ts and graph-cluster-cli.ts now SKIP the generated MOC dirs (projects/, themes/) — so MOC pages (pure [[slug]] hubs) never re-enter our own clustering, and projectGraphToPages never injects related:/## Dependencies into a MOC (which broke "pure projection" when a project key equals an edge endpoint). A 2-reindex byte-identical test now exercises this. - firstSentence: strip the generated <!-- graph:begin..end --> block before extracting the member description, so a member without a description: frontmatter no longer leaks the projected ## Dependencies block into its MOC row. - #3 collision: knowledge-validate excludes projects/+themes/ from the duplicate_slug check — a project named after an existing page (e.g. architecture-v1) is no longer a recurring error. - #4 stale prune: reindex deletes projects/*.md whose project no longer qualifies, so output depends only on CURRENT input. - #5 clamp: SB_MOC_MIN_MEMBERS NaN/0/negative now falls back to 3 (was: NaN gate made every single-member project a MOC). - history#1 lint: orphan candidate set ($ALL) excludes projects/+themes/ so generated MOCs (path-style [[projects/x]] links, no authored inbound links) aren't false orphans. - bash#1 backfill: find | sort | head -1 (deterministic file pick on duplicate basenames). New regression tests: stale-prune, clamp, MOC-not-mangled (2-reindex), duplicate_slug collision. Full suite green (61 shell + vitest). Advisory (writeProjectMocs helper extraction) + a low backfill edge-case test deferred as non-blocking follow-ups. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
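The clamp fix (#5) is easy to get wrong because a comparison like `members.length < NaN` is always false in JavaScript, so an unparseable value silently disables the gate. A minimal sketch of the fallback, assuming a helper named `mocMinMembers` (hypothetical, not taken from the PR):

```typescript
const DEFAULT_MOC_MIN_MEMBERS = 3;

// Turn a raw environment value into a usable gate. Anything that does not
// parse to a positive integer (undefined, "", "abc", "0", "-2") falls back
// to the default. Note that parseInt accepts a numeric prefix, so "5x" is 5.
function mocMinMembers(raw: string | undefined): number {
  const n = parseInt(raw || "", 10);
  // NaN > 0 is false, so the NaN case takes the fallback branch as well.
  return n > 0 ? n : DEFAULT_MOC_MIN_MEMBERS;
}
```

A caller would pass the environment value in, for example `mocMinMembers(process.env.SB_MOC_MIN_MEMBERS)`, which keeps the parsing logic testable without touching the process environment.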
Pull request overview
This PR introduces Phase 1 of hierarchical knowledge-base organization by adding a project: (and area:) frontmatter facet and using it to deterministically project project MOC pages plus a de-hubbed two-tier index.md, without moving any existing wiki files. It also updates validation/forget/lint behaviors to treat generated MOC directories as protected, non-source “view” outputs, and bumps the MCP server + plugin versions for release.
Changes:
- Extend parseDoc to expose project/area facets and add tests.
- Add deterministic project MOC projection (wiki/projects/<slug>.md) and de-hubbed index.md generation to knowledge_reindex, with idempotency + pruning behavior and tests.
- Add an optional one-shot kb-project-backfill.sh (with tests) and update validation/forget scoring/lint to recognize and exclude generated MOC dirs; bump versions/docs.
Reviewed changes
Copilot reviewed 21 out of 57 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test-wiki-forget-projects.sh | Ensures generated type: projects MOCs are category-protected from FORGET. |
| tests/test-kb-project-backfill.sh | Tests deterministic, bounded, idempotent project: backfill via part_of ancestry. |
| skills/upgrade/SKILL.md | Adds the 0.23.0 migration row documenting Phase 1 behavior and optional seed steps. |
| skills/lint/SKILL.md | Excludes generated MOC dirs from orphan detection to avoid false positives. |
| scripts/wiki-forget-score.sh | Treats projects category as protected for FORGET candidate generation. |
| scripts/kb-project-backfill.sh | New one-shot script to backfill project: from registry anchors + part_of graph. |
| mcp/src/tools/project-moc.ts | New pure function for deterministic project MOC markdown generation. |
| mcp/src/tools/project-moc.test.ts | Unit tests for project MOC generation determinism and gating. |
| mcp/src/tools/knowledge-validate.ts | Recognizes projects category and excludes projects/themes from duplicate slug checks. |
| mcp/src/tools/knowledge-validate.test.ts | Tests duplicate slug exemption for generated project MOCs. |
| mcp/src/tools/knowledge-search.ts | Adds project/area fields to ParsedDoc and extracts them in parseDoc. |
| mcp/src/tools/knowledge-search.test.ts | Tests parseDoc project facet extraction/defaulting. |
| mcp/src/tools/knowledge-reindex.ts | Projects project MOCs, prunes stale MOCs, and generates de-hubbed two-tier index. |
| mcp/src/tools/knowledge-reindex.test.ts | Integration tests for MOC writing, index format, idempotency, pruning, env clamp, and edge-endpoint mangle guard. |
| mcp/src/tools/graph-project.ts | Skips projects/ and themes/ to avoid mangling generated MOCs during graph projection. |
| mcp/src/tools/graph-cluster-cli.ts | Excludes generated MOC dirs from clustering input to prevent hub artifacts. |
| mcp/src/server.ts | Bumps MCP server version to 2.4.0. |
| mcp/dist/tools/project-moc.test.js.map | Built artifact for project-moc tests. |
| mcp/dist/tools/project-moc.test.js | Built artifact for project-moc tests. |
| mcp/dist/tools/project-moc.test.d.ts.map | Built artifact for project-moc test typings map. |
| mcp/dist/tools/project-moc.test.d.ts | Built artifact for project-moc test typings. |
| mcp/dist/tools/project-moc.js.map | Built artifact sourcemap for project-moc. |
| mcp/dist/tools/project-moc.js | Built artifact for project-moc. |
| mcp/dist/tools/project-moc.d.ts.map | Built artifact typings map for project-moc. |
| mcp/dist/tools/project-moc.d.ts | Built artifact typings for project-moc. |
| mcp/dist/tools/knowledge-validate.test.js.map | Built artifact sourcemap for updated validate tests. |
| mcp/dist/tools/knowledge-validate.test.js | Built artifact for updated validate tests. |
| mcp/dist/tools/knowledge-validate.js.map | Built artifact sourcemap for updated validate tool. |
| mcp/dist/tools/knowledge-validate.js | Built artifact for updated validate tool. |
| mcp/dist/tools/knowledge-validate.d.ts.map | Built artifact typings map for validate tool. |
| mcp/dist/tools/knowledge-validate.bundle.js | Built bundled artifact including validate changes. |
| mcp/dist/tools/knowledge-search.test.js.map | Built artifact sourcemap for updated search tests. |
| mcp/dist/tools/knowledge-search.test.js | Built artifact for updated search tests. |
| mcp/dist/tools/knowledge-search.js | Built artifact for updated search tool. |
| mcp/dist/tools/knowledge-search.d.ts.map | Built artifact typings map for updated search tool. |
| mcp/dist/tools/knowledge-search.d.ts | Built artifact typings for updated search tool. |
| mcp/dist/tools/knowledge-search-cli.bundle.js | Built bundled artifact including search facet changes. |
| mcp/dist/tools/knowledge-reindex.test.js.map | Built artifact sourcemap for updated reindex tests. |
| mcp/dist/tools/knowledge-reindex.test.js | Built artifact for updated reindex tests. |
| mcp/dist/tools/knowledge-reindex.js.map | Built artifact sourcemap for updated reindex tool. |
| mcp/dist/tools/knowledge-reindex.js | Built artifact for updated reindex tool. |
| mcp/dist/tools/knowledge-reindex.d.ts.map | Built artifact typings map for updated reindex tool. |
| mcp/dist/tools/knowledge-reindex.bundle.js | Built bundled artifact including reindex + MOC projection changes. |
| mcp/dist/tools/graph-project.js.map | Built artifact sourcemap for updated graph projection tool. |
| mcp/dist/tools/graph-project.js | Built artifact for updated graph projection tool. |
| mcp/dist/tools/graph-project.d.ts.map | Built artifact typings map for updated graph projection tool. |
| mcp/dist/tools/graph-cluster-cli.js.map | Built artifact sourcemap for updated graph cluster CLI. |
| mcp/dist/tools/graph-cluster-cli.js | Built artifact for updated graph cluster CLI. |
| mcp/dist/tools/graph-cluster-cli.bundle.js | Built bundled artifact including MOC-dir exclusion. |
| mcp/dist/server.js | Built artifact for MCP server version bump. |
| mcp/dist/server.bundle.js | Built bundled artifact for MCP server version bump + tool changes. |
| mcp/dist/cli/sb-entry.bundle.js | Built CLI bundle incorporating facet parsing changes. |
| docs/specs/2026-06-02-knowledge-base-hierarchical-organization-design.md | New design spec documenting the model, goals, and projection approach. |
| docs/plans/2026-06-02-kb-hierarchical-organization-phase1.md | New step-by-step implementation plan for Phase 1. |
| .claude-plugin/plugin.json | Plugin version bump to 0.23.0. |
| .claude-plugin/marketplace.json | Marketplace version bump to 0.23.0. |
Comment on lines 42 to 44

    for (const filePath of files.sort()) {
      const slug = filePath.split('/').pop()!.replace(/\.md$/, '');
      try {
Comment on lines +71 to +73

    for (const existing of await mocSlugs(projDir)) {
      if (!mocs.has(existing)) { try { await fs.unlink(join(projDir, `${existing}.md`)); } catch { /* gone */ } }
    }
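The prune loop above mixes the decision (which MOCs are stale) with the side effect (deleting files). One way to make the decision testable on its own is a small set difference; `staleMocSlugs` is a hypothetical helper name used only for this sketch.

```typescript
// Given the MOC slugs already on disk and the slugs that qualify on this run,
// return the slugs whose files should be deleted. The result is sorted so the
// deletion order is stable across runs.
function staleMocSlugs(existing: string[], current: Set<string>): string[] {
  return existing.filter((slug) => !current.has(slug)).sort();
}
```

The caller would then unlink each returned slug's file, so the output directory depends only on the current input, as finding #4 requires.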
Comment on lines +45 to 49

    // Skip index.md and the generated MOC dirs (projects/, themes/) — they are pure
    // projections; injecting related:/## Dependencies into a MOC would mangle it and break
    // reindex idempotency.
    if (file.endsWith('index.md') || /\/(projects|themes)\//.test(file)) continue;
    const slug = slugFromPath(file);
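The skip condition above can be isolated into a predicate, which makes its edge cases easier to reason about. The sketch below is a stricter variant, not the PR's code: it normalises backslash separators, accepts relative paths that start with `projects/` or `themes/`, and matches `index.md` only as a whole file name. All three are assumptions added for illustration, and `isGeneratedProjection` is a hypothetical name.

```typescript
// Returns true for files that are generated projections and should be
// skipped by graph projection and clustering.
function isGeneratedProjection(file: string): boolean {
  // Normalise Windows-style separators so one pattern covers both forms.
  const p = file.replace(/\\/g, "/");
  // Match index.md as a full basename, so "reindex.md" is not skipped.
  const isIndex = /(^|\/)index\.md$/.test(p);
  // Match projects/ or themes/ as a whole path segment, anchored or nested.
  const inMocDir = /(^|\/)(projects|themes)\//.test(p);
  return isIndex || inMocDir;
}
```

Whether the stricter matching is wanted depends on how file paths are produced upstream; if they are always absolute POSIX paths, the original one-line condition behaves the same for the MOC directories.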
Comment on lines +40 to +43

    # All wiki page slugs (filename without .md). Exclude the generated MOC dirs
    # (projects/, themes/) — like index.md they are auto-generated projections with no
    # AUTHORED inbound links by design, so they must never be flagged as orphans.
    ALL=$(find "$KD/wiki" -name '*.md' -type f -not -path '*/projects/*' -not -path '*/themes/*' | while read f; do basename "$f" .md; done | sort -u)
Comment on lines +31 to +33

    [ -n "$f" ] || return 0
    grep -qE '^project:' "$f" && return 0 # idempotent: never overwrite an existing facet
    awk -v p="$proj" '
Why
The knowledge base read as a flat hairball. Root cause (one): hierarchy lives in the edge log but is drowned and unsurfaced. Active edges are ~461 relates : 9 part_of, a flat index.md link-hubs all ~108 pages, and there is no project: facet, so a project's notes scatter across type-folders (kiri-* loosely tied; cainish-bridge-* zero grouping).
Design: docs/specs/2026-06-02-knowledge-base-hierarchical-organization-design.md (brainstormed + 5-facet web research; all facets converged).
Plan: docs/plans/2026-06-02-kb-hierarchical-organization-phase1.md.
What (Phase 1: deterministic read-side wins, no file ever moved)
- project: facet: parseDoc exposes project/area; membership is a frontmatter facet (a query), not a synthetic edge.
- reindex projects one wiki/projects/<slug>.md per project with ≥ SB_MOC_MIN_MEMBERS (default 3) members, grouped by type; FORGET-protected like themes/; graph: exclude.
- index.md: Home → ## Maps of Content (project/theme MOC links) + ## Categories (per-type counts, plain-text slug rows, no [[wikilinks]]), graph: exclude, so a graph viewer shows clusters, not a 108-edge star. SB_KB_MOC=off disables.
- kb-project-backfill.sh walks part_of ancestry from registry anchors and sets project: on members (idempotent, reversible). Optional; facets accrue from new edits otherwise.
- projects is a known category (validate + FORGET PROTECT). MCP server → 2.4.0. Additive + back-compat: no facets ⇒ prior flat index.
Phases 2–3 (on-write facet preservation + relates→part_of promotion + maintainer plurality-vote; lint/drift enforcement) are separate future plans.
Release-gate review: 8 findings, 0 false positives, all CRITICAL/HIGH/MEDIUM fixed
A deep review (unit + architectural + history lenses) found real bugs; the last commit fixes them with regression tests:
- graph-project.ts and graph-cluster-cli.ts skip the generated MOC dirs, so pure [[slug]] hubs never re-enter our clustering and a MOC whose key is an edge endpoint isn't mangled by projection (2-reindex byte-identical test).
- firstSentence strips the projected ## Dependencies block before extracting a member's MOC description.
- SB_MOC_MIN_MEMBERS garbage/0 falls back to 3 (was: gated nothing).
- The lint orphan candidate set excludes projects/ + themes/ (no false orphans).
Verification
Full suite green (61 shell + 281→ vitest, 0 fail; 1 intentional skip); validate + version lockstep (0.23.0) + migration-row gates pass; pre-push hook green.
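For readers who want a concrete picture of the two-tier index.md described under What, here is a minimal sketch. The `IndexInput` shape, the `buildIndex` name, the heading levels under Categories, and the frontmatter layout are all assumptions; only the two section names, the per-type counts, the plain-text slug rows, and the path-style MOC links come from the PR text.

```typescript
// Hypothetical input shape for this sketch.
interface IndexInput {
  mocs: string[]; // MOC link targets, e.g. "projects/kiri"
  pagesByType: Record<string, string[]>; // type -> page slugs
}

function buildIndex(input: IndexInput): string {
  const lines = ["---", "graph: exclude", "---", "# Home", "", "## Maps of Content"];
  // Tier 1: wikilinks go only to MOC pages, never to individual pages.
  for (const moc of input.mocs.slice().sort()) {
    lines.push(`- [[${moc}]]`);
  }
  lines.push("", "## Categories");
  // Tier 2: per-type counts with plain-text slugs, so no graph edge is created.
  for (const type of Object.keys(input.pagesByType).sort()) {
    const slugs = input.pagesByType[type].slice().sort();
    lines.push(`### ${type} (${slugs.length})`);
    for (const slug of slugs) {
      lines.push(`- ${slug}`);
    }
  }
  return lines.join("\n") + "\n";
}
```

Since page slugs appear as plain text, a graph viewer that follows `[[wikilinks]]` sees edges from the index to the MOCs only, which is the de-hubbing effect the description aims for.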
🤖 Generated with Claude Code