CreatmanCEO · Apr 30, 2026
diff --git a/.github/workflows/validate.yml b/.github/workflows/validate.yml
@@ -0,0 +1,111 @@
+name: Validate
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  validate:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: LICENSE exists
+        run: test -s LICENSE || (echo "::error::LICENSE missing or empty" && exit 1)
+
+      - name: CHANGELOG.md exists
+        run: test -s CHANGELOG.md || (echo "::error::CHANGELOG.md missing or empty" && exit 1)
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Python compile check
+        run: |
+          set -e
+          fail=0
+          for f in $(git ls-files '*.py'); do
+            if ! python -m py_compile "$f"; then
+              echo "::error file=$f::python compile error"
+              fail=1
+            fi
+          done
+          exit $fail
+
+      - name: Bash syntax check on .sh files
+        run: |
+          set -e
+          fail=0
+          for f in $(git ls-files '*.sh'); do
+            if ! bash -n "$f"; then
+              echo "::error file=$f::bash syntax error"
+              fail=1
+            fi
+          done
+          exit $fail
+
+      - name: ShellCheck (error severity)
+        # Severity 'error' only; lower-severity findings (e.g. SC1090, SC2034,
+        # SC2155, all rated 'warning') are not gated here; track them separately if desired.
+        uses: ludeeus/action-shellcheck@master
+        with:
+          severity: error
+          scandir: '.'
+
+      - name: Every command file starts with a /command heading
+        run: |
+          set -e
+          fail=0
+          for f in $(git ls-files 'commands/*.md'); do
+            if ! head -5 "$f" | grep -qE '^# /'; then
+              echo "::error file=$f::missing '# /command' heading in the first 5 lines"
+              fail=1
+            fi
+          done
+          exit $fail
+
+      - name: SVG files are well-formed XML
+        run: |
+          set -e
+          fail=0
+          for f in $(git ls-files 'docs/*.svg' '*.svg'); do
+            if ! python -c "import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1])" "$f" 2>/dev/null; then
+              echo "::error file=$f::malformed SVG XML"
+              fail=1
+            fi
+          done
+          exit $fail
+
+      - name: All docs/* assets referenced from README exist
+        run: |
+          set -e
+          fail=0
+          for ref in $(grep -hoE 'docs/[a-zA-Z0-9_/-]+\.(svg|png|jpg|jpeg|gif)' README.md README.ru.md | sort -u); do
+            if [ ! -f "$ref" ]; then
+              echo "::error file=README.md::missing referenced asset $ref"
+              fail=1
+            fi
+          done
+          exit $fail
+
+      - name: Internal Markdown links resolve
+        run: |
+          set -e
+          fail=0
+          for src in README.md README.ru.md CHANGELOG.md CONTRIBUTING.md CLAUDE.md; do
+            [ -f "$src" ] || continue
+            base="$(dirname "$src")"
+            for tgt in $(grep -hoE '\]\([^)]+\)' "$src" | sed 's/](\(.*\))/\1/' | sed 's/#.*$//'); do
+              case "$tgt" in
+                http*|mailto:*|"") continue ;;
+              esac
+              [ "$base" = "." ] && resolved="$tgt" || resolved="$base/$tgt"
+              if [ ! -e "$resolved" ] && [ ! -e "$tgt" ]; then
+                echo "::error file=$src::broken internal link → $tgt"
+                fail=1
+              fi
+            done
+          done
+          exit $fail
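
Review note: the "Internal Markdown links resolve" step can be hard to debug from CI logs alone. The Python sketch below mirrors its logic so the same check can be run locally. It is an illustration, not part of this PR: the function names `internal_targets` and `broken_links` are hypothetical, and it assumes, like the shell step, that a target is valid if it exists relative to either the source file's folder or the repository root.

```python
import re
from pathlib import Path

# Matches the "](target)" part of a Markdown link, like the grep in the workflow.
LINK_RE = re.compile(r"\]\(([^)]+)\)")


def internal_targets(markdown: str) -> list[str]:
    """Return link targets that point inside the repo, with #fragments removed."""
    targets = []
    for raw in LINK_RE.findall(markdown):
        path = raw.split("#", 1)[0].strip()
        # Skip external links and pure in-page anchors, as the shell step does.
        if not path or path.startswith(("http", "mailto:")):
            continue
        targets.append(path)
    return targets


def broken_links(source: Path, repo_root: Path) -> list[str]:
    """List targets that exist neither next to `source` nor under `repo_root`."""
    text = source.read_text(encoding="utf-8")
    return [
        target
        for target in internal_targets(text)
        if not (source.parent / target).exists()
        and not (repo_root / target).exists()
    ]
```

Calling `broken_links(Path("README.md"), Path("."))` from the repository root should report the same targets the workflow step flags, assuming no link target contains spaces.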
diff --git a/.gitignore b/.gitignore
@@ -4,3 +4,4 @@
 __pycache__/
 .DS_Store
 Thumbs.db
+*.pyc
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,52 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) · [SemVer](https://semver.org/spec/v2.0.0.html).
+
+## [0.2.0] — 2026-04-30
+
+### Added
+
+- `docs/architecture.svg` — pipeline diagram showing how a single slash command (e.g. `/research`) drives a deterministic three-phase recipe (Collect → Analyse → Artifacts) over the upstream `notebooklm-mcp-cli` MCP tools.
+- `docs/output-mockup.svg` — visual mock-up of what a typical `/research` response looks like: structured findings / patterns / contradictions with inline citations linking to NotebookLM sources.
+- `CLAUDE.md` for this repository — Level 1 file documenting the architecture, key files, CRITICAL RULES, commands, patterns. Pairs with the [ai-context-hierarchy](https://github.com/CreatmanCEO/ai-context-hierarchy) sister repo.
+- `CHANGELOG.md` (this file)
+- `CONTRIBUTING.md` with a priority list for community submissions
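+- `.github/workflows/validate.yml` — CI that checks `LICENSE` and `CHANGELOG.md` are present and non-empty, runs `python -m py_compile` on Python scripts and `bash -n` on shell scripts, runs ShellCheck (severity error), checks that every `commands/*.md` file has a `# /command` heading, verifies SVG files are well-formed XML, confirms every `docs/*` asset referenced from the READMEs exists, and validates that all internal Markdown links resolve from `README.md` / `README.ru.md` / `CHANGELOG.md` / `CONTRIBUTING.md` / `CLAUDE.md`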
+- `Limitations` section to both READMEs — cookie expiry rhythm, 500 K word source cap, `/edit-source` workaround caveat, forum-detection heuristic, opaque NotebookLM rate limits, Claude Code-only, Windows-only native notifications, upstream-MCP dependency
+- `Measured impact` section with concrete numbers (research-pipeline tool-call savings, 41 K-message Telegram forum tested in <2 min, YouTube 429 workaround, 30+ frameworks, daily auth check)
+- `When to use which research command` decision helper distinguishing `/research`, `/deep-research`, `/youtube-research`, `/telegram-to-notebook`
+- `Related` cross-links to all three sister repos: [claude-code-antiregression-setup](https://github.com/CreatmanCEO/claude-code-antiregression-setup), [ai-context-hierarchy](https://github.com/CreatmanCEO/ai-context-hierarchy), [claude-statusline](https://github.com/CreatmanCEO/claude-statusline)
+- Six new badges: License, Stars, Validate CI, Built on `notebooklm-mcp-cli`, Claude Code Opus 4.7, MCP-compatible
+
+### Changed
+
+- README hero rewritten to lead with concrete production proof (41 K-message Telegram, 30+ frameworks, 7 commands) instead of an abstract feature list
+- The flagship value-prop quote (*"MCP server = Claude has hands · workflow commands = Claude has hands + a checklist"*) elevated to a callout under the hero
+- Project structure tree now matches the actual filesystem (was missing `docs/`, `CHANGELOG.md`, `CONTRIBUTING.md`, `CLAUDE.md`, `.github/workflows/`)
+- Author signature expanded with Habr / dev.to profile links
+
+### Notes
+
+- Topics on GitHub applied separately via `gh api` after merge.
+- No companion article published yet for this repo specifically. Tracked as a P3 follow-up: a Habr / dev.to article along the lines of *"How I turned NotebookLM into a 7-command research assistant for Claude Code"* is the natural next traffic-driver, mirroring what was done for [claude-code-antiregression-setup](https://habr.com/ru/articles/1013330/) and [claude-statusline](https://habr.com/ru/articles/1013414/).
+
+## [0.1.0] — 2026-04-05
+
+### Added
+
+- Initial release with five core slash commands:
+  - `/research` — full research pipeline with auto-expand and Obsidian export
+  - `/deep-research` — multi-iteration deep dive with topic tree
+  - `/youtube-research` — YouTube video analysis via NotebookLM (workaround for HTTP 429 on transcript APIs)
+  - `/init-notebook` — auto-create a docs notebook for a tech stack, with URL hints for 30+ popular frameworks
+  - `/telegram-to-notebook` — import Telegram exports including forum supergroups with topic detection
+- Two additional commands:
+  - `/analytics-report` — analytics data → NotebookLM analysis → infographic / report
+  - `/edit-source` — workaround for editing NotebookLM sources (extract → edit → replace)
+- `scripts/telegram-chunker.py` — Python utility for splitting Telegram JSON exports into NotebookLM-compatible chunks. Handles forum-supergroup topic detection, filters stickers / GIFs / video, keeps text / code / PDFs. Tested on a 41 K-message corpus (12 topics, 586 K words → 13 NotebookLM sources, under 2 minutes).
+- `scripts/nlm-auth-check.sh` — daily auth probe with Windows toast notification (BurntToast preferred, MessageBox fallback)
+- `scripts/setup-nlm-scheduler.ps1` — one-click Windows Task Scheduler installer
+- `config/CLAUDE.md` — global instruction snippet teaching Claude to proactively use NotebookLM for unfamiliar libraries
+- Bilingual README (English + Russian)
+- MIT license
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,66 @@
+# notebooklm-claude-workflows — CLAUDE.md (Level 1)
+
+> Level 1 file for **this** repository. (`config/CLAUDE.md` is a snippet shipped to *downstream users* — different file, different audience.)
+
+## Status: ACTIVE
+Public Claude Code slash-command pack + automation for Google NotebookLM. MIT-licensed. Built on top of [`notebooklm-mcp-cli`](https://github.com/jacob-bd/notebooklm-mcp-cli) — no functionality without it.
+
+## Architecture
+
+The whole project is a workflow layer over a third-party MCP server. There is no runtime of our own:
+
+- `commands/*.md` — Claude Code slash-command recipes. Each is a deterministic 2–3-phase pipeline of MCP tool calls (`notebook_create`, `source_add`, `notebook_query`, `studio_create`, `research_start`, `research_status`, `research_import`).
+- `scripts/telegram-chunker.py` — Python utility for one specific pain point: turning Telegram forum-supergroup exports into NotebookLM-sized markdown chunks. Detects topics via `topic_message_id` + `forum_topic_created`. Output is plain markdown ready for `source_add`.
+- `scripts/nlm-auth-check.sh` — Bash daily auth probe. Calls `nlm notebook list`, parses for `"id"` keys, logs to `~/Documents/scripts/nlm-auth.log`. On Windows, fires a BurntToast notification (with MessageBox fallback) when auth has expired.
+- `scripts/setup-nlm-scheduler.ps1` — PowerShell installer that registers the auth probe as a Windows Task Scheduler job.
+- `config/CLAUDE.md` — content fragment to be appended to a downstream user's `~/.claude/CLAUDE.md`. Teaches Claude to proactively use NotebookLM for unfamiliar libraries.
+
+## Key files (when touching)
+
+- `commands/research.md` — three-phase pipeline. The order matters: Phase 1 sources, Phase 2 queries, Phase 3 artifacts. Do not collapse phases.
+- `commands/deep-research.md` — five-question-per-topic deep dive. Builds a topic tree first, then iterates. Distinct from `research.md`; both should remain.
+- `commands/telegram-to-notebook.md` — references `scripts/telegram-chunker.py`. Keep paths consistent if you move the script.
+- `scripts/telegram-chunker.py` — 13 KB, pure stdlib. No third-party deps by design (so it runs anywhere Python 3.10+ is available).
+- `scripts/nlm-auth-check.sh` — `bash`-portable, no `set -euo pipefail` because the failure path is the whole point. If you add `set -e`, the toast-notification logic stops working when `nlm notebook list` fails.
+
+## CRITICAL RULES — when editing this repo
+
+- **NEVER** introduce a hard dependency on a specific `notebooklm-mcp-cli` version unless you also pin it in README and `CHANGELOG.md`. Upstream is third-party — pinning is a real cost.
+- **NEVER** silently drop the `wait=true` semantics in any command. The whole "deterministic vs raw MCP" value prop hinges on `wait=true` being applied consistently.
+- **NEVER** rename a slash command without updating the README's command tables (English AND Russian) and adding a `CHANGELOG.md` "Breaking changes" entry. Users have shell aliases / scripts referencing the names.
+- **ALWAYS** mirror customer-facing changes between `README.md` and `README.ru.md`. They have feature parity and must stay in parity.
+- **ALWAYS** update `CHANGELOG.md` for any change visible to users (new command, removed command, output-format change, breaking workflow change).
+- **ALWAYS** keep `scripts/telegram-chunker.py` stdlib-only. The whole point is "drop-in script with zero install"; adding `pip install` requirements breaks that contract.
+- **ALWAYS** flag commands that depend on a not-yet-released MCP tool with a clear note in the command file ("requires notebooklm-mcp-cli ≥ X.Y").
+
+## Commands (for me)
+
+- `python -m py_compile scripts/telegram-chunker.py` — syntax check
+- `bash -n scripts/nlm-auth-check.sh` — syntax check
+- `python scripts/telegram-chunker.py result.json --list-topics` — quick smoke test against a real export
+- `nlm notebook list` — confirm auth still valid before running any command file by hand
+
+## Key patterns
+
+1. **Three-phase pipeline.** Every research command follows Collect → Analyse → Artifacts. Distinct phases let Claude show progress and let users abort cleanly between phases.
+2. **`wait=true` discipline.** Every `source_add` call uses `wait=true` so subsequent `notebook_query` calls actually have content to query. This is invisible discipline that distinguishes "raw MCP" from "workflow command".
+3. **Sequential queries with `conversation_id`.** Multi-question deep-research uses the same `conversation_id` across queries so NotebookLM keeps context. Drop this and the analyses become decontextualised.
+4. **Filter before chunking.** `telegram-chunker.py` filters stickers / GIFs / video at message-extract time, not at chunk-write time, so chunk word counts reflect actual signal.
+
+## External dependencies
+
+- [`notebooklm-mcp-cli`](https://github.com/jacob-bd/notebooklm-mcp-cli) — required. If upstream breaks, every command in this repo breaks.
+- Claude Code (Opus 4.7 / 1M context recommended for `/deep-research`).
+- Python 3.10+ for `telegram-chunker.py`.
+- Bash 4+ and `nlm` CLI on PATH for `nlm-auth-check.sh`.
+- Windows + PowerShell for `setup-nlm-scheduler.ps1` and BurntToast notifications.
+
+## Sister repos (same author)
+
+- [Claude Code Anti-Regression Setup](https://github.com/CreatmanCEO/claude-code-antiregression-setup) — pairs with this: anti-regression keeps Claude from breaking code while running these workflows.
+- [ai-context-hierarchy](https://github.com/CreatmanCEO/ai-context-hierarchy) — `config/CLAUDE.md` here is a Level 0 fragment that fits naturally into that hierarchy.
+- [claude-statusline](https://github.com/CreatmanCEO/claude-statusline) — Claude Code statusline; complementary tool from the same ecosystem.
+
+## Recent changes
+
+See [CHANGELOG.md](CHANGELOG.md).
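
Review note: the fourth key pattern in `CLAUDE.md` (drop media at message-extract time so that chunk word counts only reflect kept text) may be easier to follow with a small example. The sketch below is not the code of `scripts/telegram-chunker.py`; the message shape (a dict with a plain-string `text` and an optional `media_type`), the `SKIPPED_MEDIA` values, and both function names are assumptions made for illustration.

```python
from typing import Iterable, Iterator

# Assumed media markers to drop; the real script's list may differ.
SKIPPED_MEDIA = {"sticker", "animation", "video_file"}


def extract_text(messages: Iterable[dict]) -> Iterator[str]:
    """Yield message text, dropping filtered media before any word counting."""
    for msg in messages:
        if msg.get("media_type") in SKIPPED_MEDIA:
            continue
        text = msg.get("text", "")
        # Assumption: text is a plain string; other shapes are skipped here.
        if isinstance(text, str) and text.strip():
            yield text.strip()


def chunk_by_words(texts: Iterable[str], max_words: int) -> list[str]:
    """Group texts into chunks of at most max_words words.

    A single text longer than max_words becomes its own oversized chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for text in texts:
        words = len(text.split())
        # Close the current chunk if this text would push it over the cap.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(text)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Because filtering happens in `extract_text`, the counter in `chunk_by_words` never sees words attached to dropped media, which is the property the pattern describes.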
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,44 @@
+# Contributing
+
+Thanks for considering a contribution. The bar: real use case, deterministic workflow, MCP-tool calls visible in the command file, no hidden dependencies.
+
+## Priorities (highest impact first)
+
+1. **Native notifications for Linux and macOS** — `scripts/nlm-auth-check.sh` currently only fires Windows toast notifications. Add `osascript -e 'display notification ...'` for macOS and `notify-send` (with `dunstify` fallback) for Linux. Detect the platform via `uname -s` and branch.
+2. **Expanded URL dictionary in `/init-notebook`** — current list covers 30+ frameworks. PRs welcome to extend with: Solid.js, SvelteKit, Astro, Hono, NestJS, Bun, Deno, tRPC, TanStack Query, Zod, Pydantic, SQLAlchemy, Polars, DuckDB, ClickHouse, Temporal, Hatchet, Convex.
+3. **New slash commands** that map cleanly onto the three-phase pattern:
+   - `/pdf-research <topic>` — PDF papers → NotebookLM → analysis
+   - `/podcast-to-notebook <feed-url>` — podcast feed → episode transcripts (via NotebookLM) → topic-aware analysis
+   - `/csv-research <file>` — CSV / TSV / Parquet → analytics queries against the data
+4. **Translation of command files** — currently command bodies are bilingual where it matters but a few are Russian-only. Extract user-visible strings into a small translation table or duplicate each command with `.en.md` / `.ru.md` suffix.
+5. **Telegram chunker improvements** — current chunker is forum-aware, but could also handle:
+   - WhatsApp exports (different format)
+   - Discord channel exports (also has thread structure)
+   - Slack channel exports (per-channel JSON)
+
+## What we will not merge
+
+- Changes that introduce a non-stdlib dependency in `scripts/telegram-chunker.py`. The whole point is "drop-in script, zero install." If you need a non-stdlib lib, write a sister script.
+- Changes that bypass `wait=true` semantics in command files. The deterministic workflow contract depends on it.
+- Slash commands that wrap a single MCP tool call with no added value. If it is one-call, it is not a workflow — the user should call the MCP tool directly.
+- Pull requests that rename existing commands without a CHANGELOG entry under "Breaking changes" and a deprecation alias kept for at least one minor version.
+
+## Pull request checklist
+
+- [ ] If you added a slash command: it has a clear `## Phase 1` / `## Phase 2` / `## Phase 3` structure (or a documented reason for fewer)
+- [ ] If you touched `scripts/telegram-chunker.py`: `python -m py_compile` clean, stdlib-only, `--list-topics` smoke test passed locally
+- [ ] If you touched `scripts/nlm-auth-check.sh`: `bash -n` clean, ShellCheck severity `error` clean
+- [ ] `README.md` updated AND `README.ru.md` mirrored
+- [ ] `CHANGELOG.md` entry added under Unreleased or a new minor version
+- [ ] `validate.yml` checks pass (run the same commands locally, or confirm the workflow is green on the PR)
+
+## Style
+
+- Command files: imperative voice (*"Create a notebook"*, *"Add each source"*). Phase headings are bold and numbered.
+- Bash: `bash -n` and ShellCheck-error clean. `[[ ]]` over `[ ]`, `$()` over backticks, quoted variable expansions.
+- Python: stdlib only, type hints on public functions, no print debugging in committed code.
+- One feature per PR. Stack PRs if you have multiple.
+
+## Author / maintainer
+
+[@CreatmanCEO](https://github.com/CreatmanCEO) — Nick Podolyak. Open an issue first for anything larger than one command file or one chunker improvement.
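
Review note: priority 1 in `CONTRIBUTING.md` describes a platform branch (macOS via `osascript`, Linux via `notify-send` with a `dunstify` fallback). The real change would live in the Bash script and branch on `uname -s`; the Python sketch below only illustrates the decision logic. `notification_command` is a hypothetical helper, and `platform.system()` is used here because it returns the same `Darwin` / `Linux` names as `uname -s`.

```python
import platform
import shutil
from typing import Callable, Optional


def notification_command(
    title: str,
    message: str,
    system: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[list[str]]:
    """Return the argv for a native notification, or None if unsupported."""
    system = system or platform.system()
    if system == "Darwin":
        # Note: double quotes inside title/message would need escaping.
        script = f'display notification "{message}" with title "{title}"'
        return ["osascript", "-e", script]
    if system == "Linux":
        # Prefer notify-send, fall back to dunstify when only that is installed.
        for tool in ("notify-send", "dunstify"):
            if which(tool):
                return [tool, title, message]
    return None
```

Returning `None` for other systems leaves the existing Windows toast path untouched, which seems the least invasive way to add the new branches.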