From 45a3ee2207a17a1ec9a3f90d9d11719fc130d814 Mon Sep 17 00:00:00 2001
From: tikankika <tikankika@users.noreply.github.com>
Date: Wed, 24 Jun 2026 19:26:26 +0200
Subject: [PATCH] chore(acdm): track repo-policy rules in the repo (selective
 .claude gitignore)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

ACDM rule-distribution principle (PR #18/#20): repo-policy rules describe the repo /
are read by repo-tooling and must travel with a clone / CI / contributor
(ADR-015 "protection travels with the repo"). They were gitignored under a wholesale
.claude/ ignore → not in the repo. Selectively un-ignore the three so they are tracked:
- data-protection.md (PII / data-protection policy)
- publish-readiness.md (read by /publish-check)
- internal-docs-boundary.md (what belongs in the repo)

Everything else under .claude/ stays gitignored — process rules, commands, hooks,
acdm.json, .mcp.json, CLAUDE.md (local / path-sensitive config, verified via
git check-ignore). No config or paths are exposed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
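Why `.claude/` becomes `.claude/*` in the .gitignore hunk: git does not re-include a file whose parent directory is itself excluded, so the wholesale directory pattern has to give way to a contents pattern before the `!` lines can take effect. Below is a minimal sketch for checking this in a throwaway repo. It is an illustration, not the verification that was actually run: `list_unignored` is a helper name invented here, and `process-example.md` and `commands/example.md` are hypothetical stand-ins for local files.

```shell
#!/bin/sh
# Sketch only: rebuilds the .claude ignore patterns from this patch in a
# throwaway repo and lists the untracked files git would still pick up.

list_unignored() (
    # Subshell body, so the cd below does not leak into the caller.
    tmp=$(mktemp -d) || exit 1
    cd "$tmp" || exit 1
    git init -q . || exit 1

    # The .claude patterns added by the .gitignore hunk.
    cat > .gitignore <<'EOF'
.claude/*
!.claude/rules/
.claude/rules/*
!.claude/rules/data-protection.md
!.claude/rules/internal-docs-boundary.md
!.claude/rules/publish-readiness.md
EOF

    mkdir -p .claude/rules .claude/commands
    : > .claude/rules/data-protection.md   # repo-policy rule: should be listed
    : > .claude/rules/process-example.md   # hypothetical process rule: ignored
    : > .claude/commands/example.md        # hypothetical command file: ignored

    # Untracked files that are not ignored, one per line.
    git ls-files --others --exclude-standard
    status=$?

    cd / && rm -rf "$tmp"
    exit "$status"
)

list_unignored
```

With these patterns the listing should contain only `.claude/rules/data-protection.md` and `.gitignore`. Swapping the first pattern back to `.claude/` is a quick way to see the negations stop working.
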
 .claude/rules/data-protection.md        |  58 +++++++++++++
 .claude/rules/internal-docs-boundary.md |  37 +++++++++
 .claude/rules/publish-readiness.md      | 106 ++++++++++++++++++++++++
 .gitignore                              |  12 ++-
 4 files changed, 211 insertions(+), 2 deletions(-)
 create mode 100644 .claude/rules/data-protection.md
 create mode 100644 .claude/rules/internal-docs-boundary.md
 create mode 100644 .claude/rules/publish-readiness.md

diff --git a/.claude/rules/data-protection.md b/.claude/rules/data-protection.md
new file mode 100644
index 0000000..47808c4
--- /dev/null
+++ b/.claude/rules/data-protection.md
@@ -0,0 +1,58 @@
+---
+paths:
+  - "**/*"
+---
+
+# Data Protection — Treat As If Public
+
+Whether or not this repo is private today, treat everything in it as if it were
+already public. A private repo can be made public, forked, cloned, or leaked, and
+anything committed is permanent. The only safe assumption is that every file and
+every past commit is visible to the world.
+
+## Hard rule (non-negotiable)
+
+Real person or student data must **NEVER** exist in this repo — or any repo — in
+any form, anywhere in the working tree **or its git history**. Prevention is the
+only safe path: once committed it lives in the history forever (see
+"Already committed?" below).
+
+## NEVER write in files or commit messages:
+- Personal names (colleagues, research participants, teachers, students)
+- School names or abbreviations that identify specific schools
+- University or institution names
+- Research programme names (funded projects, grants)
+- Place names (streets, buildings, venues) that identify locations
+- Hardcoded file paths containing usernames (`/Users/...`, `/home/...`)
+- Research questions specific enough to identify a study
+- Chat history or session transcripts
+- Secrets: API keys, tokens, passwords, `.env` contents, credentials
+
+## ALWAYS use instead:
+- `School A`, `School B`, `Colleague_A` for anonymised references
+- `/path/to/project` for file path examples
+- `SPEAKER_01`, `L1` for participant references
+- Generic descriptions for research programmes
+- Synthetic/fabricated data in examples
+
+## Check before committing:
+- Think before writing — does this text contain any personal names, paths, or identifiers?
+- Would a reader identify a specific person, school, or study from this text —
+  **directly**, OR by combining quasi-identifiers (e.g. class + date + subject can
+  identify a student without naming them)?
+- This is your judgement. The `pii_scan` commit gate is the deterministic backstop —
+  it catches what you miss, but it is not a substitute for the check above.
+
+## Already committed? Deletion is NOT enough.
+
+If real data is found already in the repo, removing the file in a new commit does
+**not** remove it from git history — it remains in every past commit, clone, and
+fork. To actually remove it you must scrub the history (fresh-repo rebuild or a
+history filter) **and** rotate any exposed secret. Stop and escalate before
+publishing or flipping such a repo.
+
+## This applies to ALL content:
+- Source code, comments, error messages
+- Documentation, RFCs, changelogs, roadmaps
+- Commit messages
+- Test data and examples
diff --git a/.claude/rules/internal-docs-boundary.md b/.claude/rules/internal-docs-boundary.md
new file mode 100644
index 0000000..193f36d
--- /dev/null
+++ b/.claude/rules/internal-docs-boundary.md
@@ -0,0 +1,37 @@
+---
+paths:
+  - "**/*"
+---
+
+# Internal Documentation Boundary
+
+## These belong in the repo (public):
+- Source code, tests, build config
+- **Decision records** — `ADR-NNN` in `docs/decisions/` (the repo's design-record)
+- Methodology documents (`methodology/`)
+- Example files (`examples/`) — with fabricated data only
+- User-facing docs: README, GETTING_STARTED, API, CONTRIBUTING, CHANGELOG
+- Public design specs (`specs/`) — if intentionally public
+- Templates (`templates/`)
+
+## These do NOT belong in the repo (→ project's Nextcloud internal-documentation):
+- Handoff documents (`type: handoff`, HANDOFF_*, *_HANDOFF_*)
+- **Ideas** (`type: idea`, `docs/ideas/`) — quick internal captures
+- **RFCs** (`type: rfc`) — design proposals; internal until ratified, then they become an ADR in `docs/decisions/`
+- **Explorations and shapes** (`type: exploration` / `type: shape`) — strategic deliberation
+- Internal planning docs / plans (CODE_HANDOFF_*)
+- Development notes (`notes/`, `_internal/`)
+- Chat history or session exports
+- Process memos from actual research projects
+- Files from Nextcloud, Dropbox, OneDrive or other external sync services
+
+## If you are about to create or edit a file:
+- Is this something a user who clones the repo needs? → Repo
+- Is this internal planning, handoff, or development thinking? → NOT repo (use `save_document(doc_type=…)` → Nextcloud)
+
+## The document model (ratified 2026-06-21)
+Repo = **ADR** (the ratified decision-record) + code / methodology / examples / user-docs. ALL
+deliberation (idea / rfc / exploration / shape / handoff) is **internal** → the project's Nextcloud
+`<Project>_internal_documentation/` (routed via `save_document(doc_type=…)`). Enforced deterministically
+by `internal_docs_guard` (gate on a doc's frontmatter `type:`): unambiguously-internal types are blocked
+from the repo (git pre-commit), optional-public (rfc/plan/todo/spec) only warn.
diff --git a/.claude/rules/publish-readiness.md b/.claude/rules/publish-readiness.md
new file mode 100644
index 0000000..d333b31
--- /dev/null
+++ b/.claude/rules/publish-readiness.md
@@ -0,0 +1,106 @@
+---
+paths:
+  - "**/*"
+---
+
+# Publish Readiness — Pre-publish checklist
+
+This rule defines what makes a repository ready to flip from private to public. It is used by the `/publish-check` slash-command.
+
+## Severity rubric
+
+### Blocker — MUST fix before public
+
+A finding that, if left in, exposes personal data, breaks user trust, or makes the repository misleading. Public-flip is unsafe until resolved.
+
+Examples:
+- Personal names, hardcoded user-home paths, e-mail addresses
+- README claims that contradict source code (false advertising)
+
+### Warning — SHOULD fix before public
+
+A finding that signals carelessness or inconsistency to public readers. Public-flip is possible but degrades reception.
+
+Examples:
+- Drift from British English (American spellings) in user-facing prose
+- Missing community files referenced from README
+- Outdated supported-versions in `SECURITY.md`
+
+### Nice-to-have — MAY add or improve
+
+A missing item that, if added, increases professionalism but is not expected by readers.
+
+Examples:
+- `CODE_OF_CONDUCT.md`
+- `.github/dependabot.yml`
+- Issue / pull-request templates
+
+## Scan axes
+
+The `/publish-check` command runs five scans:
+
+1. **data-protection** — sources truth from `data-protection.md` rule (user-home paths, personal names, e-mail addresses)
+2. **language** — sources truth from `language-british-english.md` rule (American-English drift in prose)
+3. **docs-freshness** — `README`, `ROADMAP`, `SECURITY` versions and counts vs the source (`package.json`, `src/`)
+4. **release-hygiene** — community files exist and are current
+5. **readme-sections** — README carries the golden-standard mandatory sections (see "README golden standard" below); sources truth from `readme_check.py`
+
+## README golden standard
+
+Every project README must clear one bar: **a newcomer understands what the project is
+within the first ~15 lines** — the situation in plain language, before any architecture,
+philosophy or jargon, defining terms the first time they are used. Complete structure is
+not enough; comprehension is the test.
+
+The canonical template lives in ACDM at `templates/README.template.md`. It is a *reference*,
+**not** seeded into projects (per ADR-016 + the doc-model: `templates/` is a repo-side
+artifact ACDM owns; `init_project` distributes enforcement, not content scaffold). Copy its
+structure when writing or revising a README.
+
+**Mandatory sections** (enforced by Scan 5 / `readme_check.py`):
+
+- **What is `<Project>`?** — the plain-language on-ramp.
+- **Development status** (or **Status and maturity**) — honest maturity; early publication
+  is fine, overclaiming is not.
+- **Data & privacy** — mandatory *only* when the tool touches personal data (human
+  judgement; deliberately not auto-checked).
+
+Recommended (not auto-enforced): ecosystem block (if part of a family), "who is this for?"
+doors, how it works, Documentation, Requirements, Licence, Support, Acknowledgements. See
+the template for the full shape and per-section guidance.
+
+## Out of scope (v1)
+
+- Auto-fix (report-only)
+- Continuous-integration enforcement
+- Pre-commit hook integration
+- Security review (`/security-review` — separate skill)
+- Code-quality review (`/simplify` — separate skill)
+- README *quality* / textual review — does the prose actually communicate? (manual pass, or the `doc-reviewer` agent; Scan 5 checks section *presence*, not quality)
+- INSTALL / LICENSE textual review (manual pass required)
+- Version-bump decisions (project-internal)
+
+These are documented as v2 promotions or out-of-tool concerns.
+
+## Consuming the report safely
+
+When `/publish-check` produces findings and you start fixing them:
+
+1. **Verify the working tree is clean first** — `git status` shows no untracked or modified files you didn't expect. After `init_project(update=True)` the disk holds new files that git does not yet track.
+2. **Stage explicitly, per finding** — `git add <file>`, not `git add -A`. The `-A` form picks up unrelated upstream drift.
+3. **Verify the diff per file before commit** — `git diff --staged <file>`.
+4. **Be extra careful immediately after `init_project --update`** — distributed templates may overwrite earlier per-project fixes; the report you ran against may not reflect the disk state.
+
+Context: Teacher_MCP PR #61 (2026-05-05) used `git add -A` against undetected upstream drift and introduced 2 new BE-drift findings while fixing 9. Explicit staging would have prevented this.
+
+## Building the public artifact (fresh-repo flips)
+
+When the flip strategy is a fresh repository (no carried-over git history), the published repo is *built* from the working tree through an include/exclude step. Three principles keep that build trustworthy:
+
+1. **Verify the built artifact, not the working tree.** A scan against the source tree never tests the include/exclude list itself — a file the list fails to exclude still sits in the tree the scan passed. Build the fresh repo into a staging location, then run the publish scans against *that*, before publishing. The artifact is what readers get; the artifact is what you verify. (This is distinct from "verify the working tree is clean" above: that guards the fixing step; this guards the published output.)
+2. **Allowlist what ships; do not denylist what doesn't.** Start the fresh repo from empty and copy in only named paths. A denylist (copy everything, minus exclusions) fails open — anything you forget to list is published. An allowlist fails closed.
+3. **Run the gate as a reproducible script against a committed checkpoint.** A single ad-hoc grep pass is not a gate — globs and mounts misfire silently. Commit the prep work to the still-private branch first, so there is an auditable diff and a stable state to build from, then run the scan as a script the human can re-run.
+
+A token grep (names, course codes) finds known strings; it cannot find sensitive content that lacks them (personal reflections, opinions about colleagues, self-flagged private documents). Where shipping files carry a privacy field in front-matter (e.g. `privacy: private`), treat that field — not a name grep — as the primary ship / no-ship filter.
+
+Context: the Teacher_MCP private→public flip (2026-05-27) scanned the working tree before the fresh-repo build, leaving the include/exclude list unverified; an independent grep pass misfired (wrong glob, slow mount) before being corrected; and a self-flagged `privacy: private` document was caught only by chance through a name grep.
diff --git a/.gitignore b/.gitignore
index 49d9b65..f071327 100644
--- a/.gitignore
+++ b/.gitignore
@@ -66,7 +66,15 @@ coverage/
 .env.local
 *.local
 
-# ACDM-specific (personal tooling, not project content)
-.claude/
+# ACDM config — gitignored (local / path-sensitive). EXCEPT the repo-policy rules
+# below, which describe the repo / are read by repo-tooling and must travel with a
+# clone (ADR-015 "protection travels with the repo"). Process rules + acdm.json +
+# .mcp.json + CLAUDE.md stay ignored.
+.claude/*
+!.claude/rules/
+.claude/rules/*
+!.claude/rules/data-protection.md
+!.claude/rules/internal-docs-boundary.md
+!.claude/rules/publish-readiness.md
 .mcp.json
 CLAUDE.md