Merged
111 changes: 111 additions & 0 deletions .github/workflows/validate.yml
@@ -0,0 +1,111 @@
name: Validate

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: LICENSE exists
        run: test -s LICENSE || (echo "::error::LICENSE missing or empty" && exit 1)

      - name: CHANGELOG.md exists
        run: test -s CHANGELOG.md || (echo "::error::CHANGELOG.md missing or empty" && exit 1)

      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Python compile check
        run: |
          set -e
          fail=0
          for f in $(git ls-files '*.py'); do
            if ! python -m py_compile "$f"; then
              echo "::error file=$f::python compile error"
              fail=1
            fi
          done
          exit $fail

      - name: Bash syntax check on .sh files
        run: |
          set -e
          fail=0
          for f in $(git ls-files '*.sh'); do
            if ! bash -n "$f"; then
              echo "::error file=$f::bash syntax error"
              fail=1
            fi
          done
          exit $fail

      - name: ShellCheck (error severity)
        # Gate on severity 'error' only; lower-severity findings (SC1090, SC2034, SC2155, etc.)
        # are not gated here; track them separately if desired.
        uses: ludeeus/action-shellcheck@master
        with:
          severity: error
          scandir: '.'

      - name: Every command file has a slash-command heading
        run: |
          set -e
          fail=0
          for f in $(git ls-files 'commands/*.md'); do
            if ! head -5 "$f" | grep -qE '^# /'; then
              echo "::error file=$f::missing '# /command' heading in the first 5 lines"
              fail=1
            fi
          done
          exit $fail

      - name: SVG files are well-formed XML
        run: |
          set -e
          fail=0
          for f in $(git ls-files 'docs/*.svg' '*.svg'); do
            if ! python -c "import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1])" "$f" 2>/dev/null; then
              echo "::error file=$f::malformed SVG XML"
              fail=1
            fi
          done
          exit $fail

      - name: All docs/* assets referenced from README exist
        run: |
          set -e
          fail=0
          for ref in $(grep -hoE 'docs/[a-zA-Z0-9_/-]+\.(svg|png|jpg|jpeg|gif)' README.md README.ru.md | sort -u); do
            if [ ! -f "$ref" ]; then
              echo "::error file=README.md::missing referenced asset $ref"
              fail=1
            fi
          done
          exit $fail

      - name: Internal Markdown links resolve
        run: |
          set -e
          fail=0
          for src in README.md README.ru.md CHANGELOG.md CONTRIBUTING.md CLAUDE.md; do
            [ -f "$src" ] || continue
            base="$(dirname "$src")"
            for tgt in $(grep -hoE '\]\([^)]+\)' "$src" | sed 's/](\(.*\))/\1/' | sed 's/#.*$//'); do
              case "$tgt" in
                http*|mailto:*|"") continue ;;
              esac
              [ "$base" = "." ] && resolved="$tgt" || resolved="$base/$tgt"
              if [ ! -e "$resolved" ] && [ ! -e "$tgt" ]; then
                echo "::error file=$src::broken internal link → $tgt"
                fail=1
              fi
            done
          done
          exit $fail
1 change: 1 addition & 0 deletions .gitignore
@@ -4,3 +4,4 @@
__pycache__/
.DS_Store
Thumbs.db
*.pyc
52 changes: 52 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,52 @@
# Changelog

All notable changes to this project will be documented in this file.
Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) · [SemVer](https://semver.org/spec/v2.0.0.html).

## [0.2.0] — 2026-04-30

### Added

- `docs/architecture.svg` — pipeline diagram showing how a single slash command (e.g. `/research`) drives a deterministic three-phase recipe (Collect → Analyse → Artifacts) over the upstream `notebooklm-mcp-cli` MCP tools.
- `docs/output-mockup.svg` — visual mock-up of what a typical `/research` response looks like: structured findings / patterns / contradictions with inline citations linking to NotebookLM sources.
- `CLAUDE.md` for this repository — Level 1 file documenting the architecture, key files, CRITICAL RULES, commands, patterns. Pairs with the [ai-context-hierarchy](https://github.com/CreatmanCEO/ai-context-hierarchy) sister repo.
- `CHANGELOG.md` (this file)
- `CONTRIBUTING.md` with a priority list for community submissions
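- `.github/workflows/validate.yml` — CI that checks `LICENSE` and `CHANGELOG.md` exist and are non-empty, runs `bash -n` on shell scripts, `python -m py_compile` on Python scripts, and ShellCheck (severity error), verifies that every `commands/*.md` file has a `# /command` heading and that SVG files are well-formed XML, confirms every `docs/*` asset referenced from the READMEs exists, and validates that all internal Markdown links resolve from `README.md` / `README.ru.md` / `CHANGELOG.md` / `CONTRIBUTING.md` / `CLAUDE.md`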
- `Limitations` section to both READMEs — cookie expiry rhythm, 500 K word source cap, `/edit-source` workaround caveat, forum-detection heuristic, opaque NotebookLM rate limits, Claude Code-only, Windows-only native notifications, upstream-MCP dependency
- `Measured impact` section with concrete numbers (research-pipeline tool-call savings, 41 K-message Telegram forum tested in <2 min, YouTube 429 workaround, 30+ frameworks, daily auth check)
- `When to use which research command` decision helper distinguishing `/research`, `/deep-research`, `/youtube-research`, `/telegram-to-notebook`
- `Related` cross-links to all three sister repos: [claude-code-antiregression-setup](https://github.com/CreatmanCEO/claude-code-antiregression-setup), [ai-context-hierarchy](https://github.com/CreatmanCEO/ai-context-hierarchy), [claude-statusline](https://github.com/CreatmanCEO/claude-statusline)
- Six new badges: License, Stars, Validate CI, Built on `notebooklm-mcp-cli`, Claude Code Opus 4.7, MCP-compatible

### Changed

- README hero rewritten to lead with concrete production proof (41 K-message Telegram, 30+ frameworks, 7 commands) instead of an abstract feature list
- The flagship value-prop quote (*"MCP server = Claude has hands · workflow commands = Claude has hands + a checklist"*) elevated to a callout under the hero
- Project structure tree now matches the actual filesystem (was missing `docs/`, `CHANGELOG.md`, `CONTRIBUTING.md`, `CLAUDE.md`, `.github/workflows/`)
- Author signature expanded with Habr / dev.to profile links

### Notes

- Topics on GitHub applied separately via `gh api` after merge.
- No companion article published yet for this repo specifically. Tracked as a P3 follow-up: a Habr / dev.to article along the lines of *"How I turned NotebookLM into a 7-command research assistant for Claude Code"* is the natural next traffic-driver, mirroring what was done for [claude-code-antiregression-setup](https://habr.com/ru/articles/1013330/) and [claude-statusline](https://habr.com/ru/articles/1013414/).

## [0.1.0] — 2026-04-05

### Added

- Initial release with five core slash commands:
  - `/research` — full research pipeline with auto-expand and Obsidian export
  - `/deep-research` — multi-iteration deep dive with topic tree
  - `/youtube-research` — YouTube video analysis via NotebookLM (workaround for HTTP 429 on transcript APIs)
  - `/init-notebook` — auto-create a docs notebook for a tech stack, with URL hints for 30+ popular frameworks
  - `/telegram-to-notebook` — import Telegram exports including forum supergroups with topic detection
- Two additional commands:
  - `/analytics-report` — analytics data → NotebookLM analysis → infographic / report
  - `/edit-source` — workaround for editing NotebookLM sources (extract → edit → replace)
- `scripts/telegram-chunker.py` — Python utility for splitting Telegram JSON exports into NotebookLM-compatible chunks. Handles forum-supergroup topic detection, filters stickers / GIFs / video, keeps text / code / PDFs. Tested on a 41 K-message corpus (12 topics, 586 K words → 13 NotebookLM sources, under 2 minutes).
- `scripts/nlm-auth-check.sh` — daily auth probe with Windows toast notification (BurntToast preferred, MessageBox fallback)
- `scripts/setup-nlm-scheduler.ps1` — one-click Windows Task Scheduler installer
- `config/CLAUDE.md` — global instruction snippet teaching Claude to proactively use NotebookLM for unfamiliar libraries
- Bilingual README (English + Russian)
- MIT license
66 changes: 66 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,66 @@
# notebooklm-claude-workflows — CLAUDE.md (Level 1)

> Level 1 file for **this** repository. (`config/CLAUDE.md` is a snippet shipped to *downstream users* — different file, different audience.)

## Status: ACTIVE
Public Claude Code slash-command pack + automation for Google NotebookLM. MIT-licensed. Built on top of [`notebooklm-mcp-cli`](https://github.com/jacob-bd/notebooklm-mcp-cli) — no functionality without it.

## Architecture

The whole project is a workflow layer over a third-party MCP server. There is no runtime of our own:

- `commands/*.md` — Claude Code slash-command recipes. Each is a deterministic 2–3-phase pipeline of MCP tool calls (`notebook_create`, `source_add`, `notebook_query`, `studio_create`, `research_start`, `research_status`, `research_import`).
- `scripts/telegram-chunker.py` — Python utility for one specific pain point: turning Telegram forum-supergroup exports into NotebookLM-sized markdown chunks. Detects topics via `topic_message_id` + `forum_topic_created`. Output is plain markdown ready for `source_add`.
- `scripts/nlm-auth-check.sh` — Bash daily auth probe. Calls `nlm notebook list`, parses for `"id"` keys, logs to `~/Documents/scripts/nlm-auth.log`. On Windows, fires a BurntToast notification (with MessageBox fallback) when auth has expired.
- `scripts/setup-nlm-scheduler.ps1` — PowerShell installer that registers the auth probe as a Windows Task Scheduler job.
- `config/CLAUDE.md` — content fragment to be appended to a downstream user's `~/.claude/CLAUDE.md`. Teaches Claude to proactively use NotebookLM for unfamiliar libraries.
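
The topic detection mentioned for `scripts/telegram-chunker.py` can be pictured roughly as follows. This is a minimal sketch, not the script's actual code. The two names `topic_message_id` and `forum_topic_created` come from the description above, but where they sit in the export is an assumption here: the sketch treats `forum_topic_created` as the `action` of a service message that carries a `title`, and `topic_message_id` as a field on regular messages pointing at that service message's `id`. The `group_by_topic` helper and the `"General"` fallback bucket are hypothetical.

```python
from collections import defaultdict


def group_by_topic(messages):
    """Group regular messages under the title of the forum topic they belong to."""
    # Pass 1: map the id of each topic-creation service message to its title.
    # (Assumed layout: action == "forum_topic_created" plus a "title" field.)
    titles = {
        m["id"]: m.get("title", f"topic-{m['id']}")
        for m in messages
        if m.get("action") == "forum_topic_created"
    }
    # Pass 2: attach every regular message to its topic.
    # Messages without a known topic id land in a hypothetical "General" bucket.
    grouped = defaultdict(list)
    for m in messages:
        if m.get("action") == "forum_topic_created":
            continue
        grouped[titles.get(m.get("topic_message_id"), "General")].append(m)
    return dict(grouped)


sample = [
    {"id": 1, "action": "forum_topic_created", "title": "Deploy"},
    {"id": 2, "topic_message_id": 1, "text": "Use blue-green."},
    {"id": 3, "text": "Hello everyone"},
]
topics = group_by_topic(sample)
```

Two passes keep the logic simple even when a topic's messages appear before other topics are created in the export.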

## Key files (when touching)

- `commands/research.md` — three-phase pipeline. The order matters: Phase 1 sources, Phase 2 queries, Phase 3 artifacts. Do not collapse phases.
- `commands/deep-research.md` — five-question-per-topic deep dive. Builds a topic tree first, then iterates. Distinct from `research.md`; both should remain.
- `commands/telegram-to-notebook.md` — references `scripts/telegram-chunker.py`. Keep paths consistent if you move the script.
- `scripts/telegram-chunker.py` — 13 KB, pure stdlib. No third-party deps by design (so it runs anywhere Python 3.10+ is available).
- `scripts/nlm-auth-check.sh` — `bash`-portable, no `set -euo pipefail` because the failure path is the whole point. If you add `set -e`, the toast-notification logic stops working when `nlm notebook list` fails.

## CRITICAL RULES — when editing this repo

- **NEVER** introduce a hard dependency on a specific `notebooklm-mcp-cli` version unless you also pin it in README and `CHANGELOG.md`. Upstream is third-party — pinning is a real cost.
- **NEVER** silently drop the `wait=true` semantics in any command. The whole "deterministic vs raw MCP" value prop hinges on `wait=true` being applied consistently.
- **NEVER** rename a slash command without updating the README's command tables (English AND Russian) and adding a `CHANGELOG.md` "Breaking changes" entry. Users have shell aliases / scripts referencing the names.
- **ALWAYS** mirror customer-facing changes between `README.md` and `README.ru.md`. They have feature parity and must stay in parity.
- **ALWAYS** update `CHANGELOG.md` for any change visible to users (new command, removed command, output-format change, breaking workflow change).
- **ALWAYS** keep `scripts/telegram-chunker.py` stdlib-only. The whole point is "drop-in script with zero install"; adding `pip install` requirements breaks that contract.
- **ALWAYS** flag commands that depend on a not-yet-released MCP tool with a clear note in the command file ("requires notebooklm-mcp-cli ≥ X.Y").

## Commands (for me)

- `python -m py_compile scripts/telegram-chunker.py` — syntax check
- `bash -n scripts/nlm-auth-check.sh` — syntax check
- `python scripts/telegram-chunker.py result.json --list-topics` — quick smoke test against a real export
- `nlm notebook list` — confirm auth still valid before running any command file by hand

## Key patterns

1. **Three-phase pipeline.** Every research command follows Collect → Analyse → Artifacts. Distinct phases let Claude show progress and let users abort cleanly between phases.
2. **`wait=true` discipline.** Every `source_add` call uses `wait=true` so subsequent `notebook_query` calls actually have content to query. This is invisible discipline that distinguishes "raw MCP" from "workflow command".
3. **Sequential queries with `conversation_id`.** Multi-question deep-research uses the same `conversation_id` across queries so NotebookLM keeps context. Drop this and the analyses become decontextualised.
4. **Filter before chunking.** `telegram-chunker.py` filters stickers / GIFs / video at message-extract time, not at chunk-write time, so chunk word counts reflect actual signal.
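
Pattern 4 is easiest to see in code. The sketch below is illustrative only and is not taken from `telegram-chunker.py`. It assumes each message is a dict with a plain-string `text` and an optional `media_type`, and that stickers, GIFs and videos are marked `sticker`, `animation` and `video_file`; real exports may differ. The function names and the tiny word budget are hypothetical.

```python
# Assumed media_type markers for stickers, GIFs and videos.
SKIPPED_MEDIA = {"sticker", "animation", "video_file"}


def extract_text(messages):
    """Drop unwanted media at extract time, so later word counts cover only kept text."""
    return [
        m["text"]
        for m in messages
        if m.get("media_type") not in SKIPPED_MEDIA and m.get("text")
    ]


def chunk_by_words(texts, max_words):
    """Pack texts into chunks of at most max_words words.

    A single text longer than max_words still gets a chunk of its own.
    """
    chunks, current, count = [], [], 0
    for text in texts:
        n = len(text.split())
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(text)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


messages = [
    {"text": "alpha beta gamma"},
    {"media_type": "sticker", "text": ""},
    {"media_type": "video_file", "text": "caption dropped together with the video"},
    {"text": "delta epsilon"},
    {"text": "zeta"},
]
chunks = chunk_by_words(extract_text(messages), max_words=5)
```

Because the video caption never reaches `chunk_by_words`, it cannot push a chunk over its budget. Filtering at write time would size chunks on words that are later thrown away.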

## External dependencies

- [`notebooklm-mcp-cli`](https://github.com/jacob-bd/notebooklm-mcp-cli) — required. If upstream breaks, every command in this repo breaks.
- Claude Code (Opus 4.7 / 1M context recommended for `/deep-research`).
- Python 3.10+ for `telegram-chunker.py`.
- Bash 4+ and `nlm` CLI on PATH for `nlm-auth-check.sh`.
- Windows + PowerShell for `setup-nlm-scheduler.ps1` and BurntToast notifications.

## Sister repos (same author)

- [Claude Code Anti-Regression Setup](https://github.com/CreatmanCEO/claude-code-antiregression-setup) — pairs with this: anti-regression keeps Claude from breaking code while running these workflows.
- [ai-context-hierarchy](https://github.com/CreatmanCEO/ai-context-hierarchy) — `config/CLAUDE.md` here is a Level 0 fragment that fits naturally into that hierarchy.
- [claude-statusline](https://github.com/CreatmanCEO/claude-statusline) — Claude Code statusline; complementary tool from the same ecosystem.

## Recent changes

See [CHANGELOG.md](CHANGELOG.md).
44 changes: 44 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,44 @@
# Contributing

Thanks for considering a contribution. The bar: real use case, deterministic workflow, MCP-tool calls visible in the command file, no hidden dependencies.

## Priorities (highest impact first)

1. **Native notifications for Linux and macOS** — `scripts/nlm-auth-check.sh` currently only fires Windows toast notifications. Add `osascript -e 'display notification ...'` for macOS and `notify-send` (with `dunstify` fallback) for Linux. Detect the platform via `uname -s` and branch.
2. **Expanded URL dictionary in `/init-notebook`** — current list covers 30+ frameworks. PRs welcome to extend with: Solid.js, SvelteKit, Astro, Hono, NestJS, Bun, Deno, tRPC, TanStack Query, Zod, Pydantic, SQLAlchemy, Polars, DuckDB, ClickHouse, Temporal, Hatchet, Convex.
3. **New slash commands** that map cleanly onto the three-phase pattern:
   - `/pdf-research <topic>` — PDF papers → NotebookLM → analysis
   - `/podcast-to-notebook <feed-url>` — podcast feed → episode transcripts (via NotebookLM) → topic-aware analysis
   - `/csv-research <file>` — CSV / TSV / Parquet → analytics queries against the data
4. **Translation of command files** — currently command bodies are bilingual where it matters but a few are Russian-only. Extract user-visible strings into a small translation table or duplicate each command with `.en.md` / `.ru.md` suffix.
5. **Telegram chunker improvements** — current chunker is forum-aware, but could also handle:
   - WhatsApp exports (different format)
   - Discord channel exports (also has thread structure)
   - Slack channel exports (per-channel JSON)
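
For priority 1, the platform branching can be prototyped before it is ported to `scripts/nlm-auth-check.sh`. The sketch below uses Python rather than Bash, purely to make the decision logic easy to test. `platform.system()` reports the same `Darwin` / `Linux` names as `uname -s`. The `notification_argv` helper and its parameters are hypothetical, and the AppleScript quoting is deliberately minimal.

```python
import platform
import shutil


def _applescript_quote(value):
    """Escape backslashes and double quotes for an AppleScript string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def notification_argv(title, message, system=None, which=shutil.which):
    """Return the argv for a native notification, or None if no notifier is available."""
    system = system or platform.system()
    if system == "Darwin":
        script = (
            f'display notification "{_applescript_quote(message)}" '
            f'with title "{_applescript_quote(title)}"'
        )
        return ["osascript", "-e", script]
    if system == "Linux":
        # Prefer notify-send, fall back to dunstify, as the priority item suggests.
        for tool in ("notify-send", "dunstify"):
            if which(tool):
                return [tool, title, message]
    return None  # Windows stays on the existing BurntToast / MessageBox path


mac_argv = notification_argv("NLM", "Auth expired", system="Darwin")
```

Returning `None` when nothing is available lets the caller log the failure and carry on, which matches the script's rule that the failure path must keep running.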

## What we will not merge

- Changes that introduce a non-stdlib dependency in `scripts/telegram-chunker.py`. The whole point is "drop-in script, zero install." If you need a non-stdlib lib, write a sister script.
- Changes that bypass `wait=true` semantics in command files. The deterministic workflow contract depends on it.
- Slash commands that wrap a single MCP tool call with no added value. If it is one-call, it is not a workflow — the user should call the MCP tool directly.
- Pull requests that rename existing commands without a CHANGELOG entry under "Breaking changes" and a deprecation alias kept for at least one minor version.

## Pull request checklist

- [ ] If you added a slash command: it has a clear `## Phase 1` / `## Phase 2` / `## Phase 3` structure (or a documented reason for fewer)
- [ ] If you touched `scripts/telegram-chunker.py`: `python -m py_compile` clean, stdlib-only, `--list-topics` smoke test passed locally
- [ ] If you touched `scripts/nlm-auth-check.sh`: `bash -n` clean, ShellCheck severity `error` clean
- [ ] `README.md` updated AND `README.ru.md` mirrored
- [ ] `CHANGELOG.md` entry added under Unreleased or a new minor version
- [ ] The checks in `validate.yml` pass (run the equivalent commands locally, or let CI run them on the PR)
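
To check internal links before pushing, the workflow's link step can be approximated locally. The sketch below is a Python approximation, not a line-for-line port of the shell step. It resolves targets relative to the Markdown file only, whereas the workflow also accepts paths relative to the repository root. The `broken_links` helper is hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Same idea as the workflow's grep: capture the target of every "](target)".
LINK = re.compile(r"\]\(([^)]+)\)")


def broken_links(markdown_file):
    """Return relative link targets in markdown_file that do not exist on disk."""
    src = Path(markdown_file)
    missing = []
    for target in LINK.findall(src.read_text(encoding="utf-8")):
        target = target.split("#", 1)[0]  # drop the anchor part
        if not target or target.startswith(("http", "mailto:")):
            continue  # external links and pure anchors are not checked
        if not (src.parent / target).exists():
            missing.append(target)
    return missing


# Self-contained demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "CHANGELOG.md").write_text("# Changelog\n", encoding="utf-8")
    (root / "README.md").write_text(
        "[log](CHANGELOG.md) [site](https://example.com) "
        "[top](#intro) [gone](docs/missing.svg)\n",
        encoding="utf-8",
    )
    result = broken_links(root / "README.md")
```

In a real checkout you would call `broken_links` on each of `README.md`, `README.ru.md`, `CHANGELOG.md`, `CONTRIBUTING.md` and `CLAUDE.md`, and treat any non-empty result as a failure.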

## Style

- Command files: imperative voice (*"Create a notebook"*, *"Add each source"*). Phase headings are numbered (`## Phase 1`, `## Phase 2`, `## Phase 3`).
- Bash: `bash -n` and ShellCheck-error clean. `[[ ]]` over `[ ]`, `$()` over backticks, quoted variable expansions.
- Python: stdlib only, type hints on public functions, no print debugging in committed code.
- One feature per PR. Stack PRs if you have multiple.

## Author / maintainer

[@CreatmanCEO](https://github.com/CreatmanCEO) — Nick Podolyak. Open an issue first for anything larger than one command file or one chunker improvement.