From a1f7c445bfcc100e54d176c34bcee698f81e7c83 Mon Sep 17 00:00:00 2001
From: Noah Miller <noah.miller.012@gmail.com>
Date: Fri, 8 May 2026 11:16:41 -0400
Subject: [PATCH 1/2] Expand newproject research scaffold

---
 .claude/skills/newproject/README.md           |  91 ++++++++
 .claude/skills/newproject/SKILL.md            | 200 ++++++++++++------
 .claude/skills/newproject/templates/config.R  |  15 ++
 .claude/skills/newproject/templates/config.do |  26 +++
 .claude/skills/newproject/templates/config.py |  14 ++
 .../newproject/templates/project_CLAUDE.md    |  84 ++++++++
 .../newproject/templates/requirements.txt     |   2 +
 skills/newproject/README.md                   |  55 +++--
 8 files changed, 404 insertions(+), 83 deletions(-)
 create mode 100644 .claude/skills/newproject/README.md
 create mode 100644 .claude/skills/newproject/templates/config.R
 create mode 100644 .claude/skills/newproject/templates/config.do
 create mode 100644 .claude/skills/newproject/templates/config.py
 create mode 100644 .claude/skills/newproject/templates/project_CLAUDE.md
 create mode 100644 .claude/skills/newproject/templates/requirements.txt
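
The name normalization and placeholder substitution that SKILL.md describes (lowercase, spaces to hyphens; `{{PROJECT_ROOT}}` and `{{PROJECT_NAME}}` filled in at scaffold time) can be sketched roughly as below. This is an illustrative sketch only, assuming plain string replacement: the skill appears to perform these steps through Claude's Read/Write tools rather than a script, and `normalize_name`, `render_template`, and the `/tmp/research` parent directory are hypothetical names chosen for the example.

```python
from pathlib import Path


def normalize_name(name: str) -> str:
    """Lowercase the name and replace spaces with hyphens (hypothetical helper)."""
    return name.strip().lower().replace(" ", "-")


def render_template(text: str, project_root: Path, project_name: str) -> str:
    """Fill in the two placeholders used by the newproject templates (hypothetical helper)."""
    return (
        text.replace("{{PROJECT_ROOT}}", project_root.as_posix())
        .replace("{{PROJECT_NAME}}", project_name)
    )


name = normalize_name("My Project Name")
root = Path("/tmp/research") / name  # assumed parent directory, for illustration only
rendered = render_template('ROOT = Path("{{PROJECT_ROOT}}")', root, name)
print(rendered)
```

Because the rendered config files hold an absolute root, a moved project needs its `root` value updated by hand, which is why the skill reminds the user to do so.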

diff --git a/.claude/skills/newproject/README.md b/.claude/skills/newproject/README.md
new file mode 100644
index 0000000..c7c5aba
--- /dev/null
+++ b/.claude/skills/newproject/README.md
@@ -0,0 +1,91 @@
+# `/newproject` — Standard Project Scaffold
+
+Invoked at the start of every new research project to create a consistent directory structure.
+
+## Usage
+
+```
+/newproject my-project-name
+```
+
+## What It Creates
+
+```
+my-project-name/
+├── CLAUDE.md                          # Research rules & estimation philosophy (from template)
+├── README.md                          # This file — project-specific notes
+├── code/
+│   ├── config.do                      # Canonical Stata globals & paths
+│   ├── config.py                      # Canonical Python paths (pathlib)
+│   ├── config.R                       # Canonical R paths
+│   ├── requirements.txt               # Python dependencies
+│   ├── download/                      # Scripts for pulling raw data
+│   ├── data/
+│   │   └── validation/                # Data validation scripts
+│   └── analysis/
+│       ├── stata/
+│       ├── R/
+│       └── python/
+├── data/
+│   ├── raw/                           # Original source data (never modify)
+│   └── clean/                         # Cleaned/merged datasets
+├── output/
+│   ├── figures/
+│   ├── tables/
+│   └── logs/                          # Log files from all scripts
+├── documents/                         # Outside PDFs, papers (split with /split-pdf)
+├── references/
+│   └── raw/                           # Paper PDFs for reference-ingest skills
+├── decks/                             # Beamer presentations
+├── notes/                             # Personal scratch notes; ignored by git in git-enabled projects
+├── agent_memory/                      # Shared Claude/Codex reference files
+│   ├── key_decisions.md               # Methodological choices and rationale
+│   ├── dropped_analyses.md            # Paths abandoned and why
+│   ├── codebook.md                    # Variable definitions
+│   └── sample_restrictions.md         # Who's in/out of the sample
+├── correspondence/                    # Letters, emails, audit reports
+│                                      # `referee2/` and `blindspot/` subdirs are created
+│                                      # lazily by those skills on first use
+└── progress_logs/                     # Session logs for continuity across Claude conversations
+```
+
+Use `references/raw/` for PDFs of papers or other sources you want to process with `/split-pdf`, `/read-pdf`, `/bib-update`, and/or `/wiki-update`. The reference-ingest skills create their own derived files lazily: `/wiki-update` creates `references/wiki/`, and `/bib-update` creates `references/references.bib`.
+
+To link your project to Obsidian, go to `Obsidian → Manage vaults → Open folder as vault` and select your project folder as a new vault. Each project then lives in its own vault, so backlinks cannot collide across projects.
+
+## Philosophy
+
+### Two Configuration Files, Two Purposes
+
+Every project has both a `CLAUDE.md` and a `README.md`, supported by the `agent_memory/` and `notes/` directories. They serve different roles:
+
+- **`CLAUDE.md`** is copied from a permanent template at `~/.claude/skills/newproject/templates/project_CLAUDE.md`. It contains research rules that apply across all sessions: estimation philosophy ("design before results"), coding conventions, collaborator information, and an index pointing to the key methodological decisions recorded in `agent_memory/`. It is the *institutional memory* of the project — the file that keeps Claude aligned across conversations.
+
+- **`agent_memory/`** holds the shared reference files that Claude and the user should version over time: decisions, dropped analyses, codebook entries, and sample restrictions.
+
+- **`notes/`** is for private scratch notes and anything you do not want versioned.
+
+- **`README.md`** is auto-generated by this command and then edited as the project evolves. It is project-specific: what the research question is, who's involved, current status, and how files are organized. It's the file a human reads to understand the project.
+
+### Session Continuity
+
+The `progress_logs/` directory solves a real problem: Claude Code sessions don't persist. If a session crashes, times out, or you start fresh, the progress log tells the next session exactly where you left off. Logs are dated (`YYYY-MM-DD_description.md`) and maintained regularly.
+
+### Documents and Decks
+
+- `documents/` holds outside PDFs — papers you're reading, referee reports, data documentation. These are candidates for the `/split-pdf` skill, which splits large PDFs into safe 4-page chunks for reading.
+
+- `decks/` holds Beamer presentations built following the rhetoric of decks philosophy (`~/.claude/skills/beautiful_deck/rhetoric_of_decks.md`). Titles are assertions, one idea per slide, beauty is function.
+
+### Data Discipline
+
+- `data/raw/` is **read-only** by convention. Original source data goes here and is never modified.
+- `data/clean/` holds everything that's been transformed, merged, or constructed. Scripts in `code/` take raw data and produce clean data.
+
+## First-time setup
+
+Before using a generated project, fill in the placeholders in its root `CLAUDE.md`, especially `[NAME]`, collaborators, project overview, data sources, and key files. Put durable project memory in `agent_memory/`, not `notes/`.
+
+## Installation
+
+This skill lives at `~/.claude/skills/newproject/SKILL.md`. Invoke with `/newproject`.
diff --git a/.claude/skills/newproject/SKILL.md b/.claude/skills/newproject/SKILL.md
index 2349f0a..4c9d31f 100644
--- a/.claude/skills/newproject/SKILL.md
+++ b/.claude/skills/newproject/SKILL.md
@@ -1,87 +1,155 @@
 ---
 name: newproject
-description: Scaffold a new research project with standard directory structure, CLAUDE.md template, and documented README. Use this at the start of every new project to ensure consistent organization.
+description: Scaffold a new research project with standard directory structure, CLAUDE.md template, and language-agnostic config files (Stata/Python/R). Use this at the start of every new project to ensure consistent organization.
 allowed-tools: Bash(mkdir*), Bash(cp*), Bash(ls*), Write, Read
 argument-hint: [project-name]
 ---
 
 # New Project Scaffold
 
-Create a new research project folder with Scott's standard structure. This skill is invoked at the start of every project.
+Create a new research project folder with the standard structure.
+
+## Templates
+
+All templates are stored locally in the skill at `~/.claude/skills/newproject/templates/`:
+
+- `project_CLAUDE.md` — project root CLAUDE.md
+- `config.do`, `config.py`, `config.R` — canonical paths for Stata / Python / R
+- `requirements.txt` — Python dependencies stub
+
+Templates use `{{PROJECT_ROOT}}` and `{{PROJECT_NAME}}` placeholders that this skill substitutes at scaffold time.
 
 ## What Gets Created
 
 ```
 [project-name]/
-├── CLAUDE.md              # Permanent research rules (copied from template)
-├── README.md              # Project-specific overview (auto-generated)
+├── CLAUDE.md                          # Research rules (from template)
+├── README.md                          # Project-specific overview (auto-generated)
 ├── code/
-│   ├── R/
-│   ├── python/
-│   └── stata/
+│   ├── config.do                      # Canonical Stata globals & paths
+│   ├── config.py                      # Canonical Python paths (pathlib)
+│   ├── config.R                       # Canonical R paths
+│   ├── requirements.txt               # Python dependencies
+│   ├── download/                      # Scripts for pulling raw data
+│   ├── data/
+│   │   └── validation/                # Data validation scripts
+│   └── analysis/
+│       ├── stata/
+│       ├── R/
+│       └── python/
 ├── data/
-│   ├── raw/               # Original source data (never modify)
-│   └── clean/             # Cleaned/merged datasets
+│   ├── raw/                           # Original source data (never modify)
+│   └── clean/                         # Cleaned/merged datasets
 ├── output/
+│   ├── figures/
 │   ├── tables/
-│   └── figures/
-├── documents/             # Outside PDFs, papers (use /split-pdf on these)
-├── decks/                 # Beamer presentations (rhetoric of decks)
-├── notes/                 # Scratch notes, random ideas, misc
-└── progress_logs/         # Session continuity across Claude conversations
+│   └── logs/                          # Log files from all scripts
+├── documents/                         # Outside PDFs, papers
+├── references/
+│   └── raw/                           # PDFs for split-pdf/read-pdf/wiki-update/bib-update
+├── decks/                             # Beamer presentations
+├── notes/                             # Personal scratch notes; ignored by git in git-enabled projects
+├── agent_memory/                      # Shared Claude/Codex reference files for this project
+├── correspondence/                    # Letters, emails, referee reports (subdirs created lazily by /referee2 and /blindspot)
+└── progress_logs/                     # Session continuity logs
 ```
 
 ## Execution
 
-1. **Get the project name** from the argument. If none provided, ask.
-   - Convert spaces to hyphens, lowercase
-
-2. **Determine location** — default is current working directory. Confirm if unclear.
-
-3. **Create all directories:**
-   ```bash
-   mkdir -p [project-name]/{code/{R,stata,python},data/{raw,clean},output/{figures,tables},documents,decks,notes,progress_logs}
-   ```
-
-4. **Copy CLAUDE.md** from `~/mixtapetools/claude/CLAUDE.md`:
-   - Replace `[Your Name]` with `Scott`
-   - Update project name in the overview section
-
-5. **Generate README.md** with:
-   - Project title
-   - Visual directory tree in a fenced code block (monospace)
-   - Explanation of each folder's purpose
-   - Note that CLAUDE.md is copied from a permanent template and edited per-project
-   - Note that README.md is for project-specific documentation
-   - Note that progress_logs/ maintains continuity across Claude sessions
-   - Placeholder sections: Overview, Collaborators, Status, Key Files
-
-   The README must include this tree block:
-
-   ````markdown
-   ```
-   [project-name]/
-   ├── CLAUDE.md              # Research rules & estimation philosophy (permanent)
-   ├── README.md              # This file — project-specific notes
-   ├── code/
-   │   ├── R/                 # R scripts
-   │   ├── python/            # Python scripts
-   │   └── stata/             # Stata do-files
-   ├── data/
-   │   ├── raw/               # Original source data (never modify these)
-   │   └── clean/             # Cleaned and merged datasets
-   ├── output/
-   │   ├── tables/            # Generated tables (LaTeX, CSV)
-   │   └── figures/           # Generated figures (PDF, PNG)
-   ├── documents/             # Outside papers and PDFs (split with /split-pdf)
-   ├── decks/                 # Beamer presentations (rhetoric of decks philosophy)
-   ├── notes/                 # Scratch notes, ideas, miscellaneous
-   └── progress_logs/         # Session logs for continuity across Claude conversations
-   ```
-   ````
-
-6. **Create initial progress log** at `progress_logs/YYYY-MM-DD_setup.md`:
-   - Record the creation date
-   - List next steps as a checklist
-
-7. **Report success** — show structure with `ls`, remind user to update CLAUDE.md.
+### Step 1 — Get project name
+If no argument was provided, ask:
+> "What should I name this project? (will be used as the folder name and in templates)"
+
+Normalize: lowercase, spaces → hyphens.
+
+### Step 2 — Determine location
+Default to current working directory. If unclear, confirm with the user.
+
+Set `PROJECT_ROOT` = `[location]/[project-name]` as an absolute path.
+
+### Step 3 — Create all directories
+
+```bash
+mkdir -p [project-name]/{code/{download,data/validation,analysis/{stata,R,python}},data/{raw,clean},output/{figures,tables,logs},documents,references/raw,decks,notes,agent_memory,correspondence,progress_logs}
+```
+
+### Step 4 — Render config files from templates
+
+For each of `config.do`, `config.py`, `config.R`, `requirements.txt`:
+1. Read `~/.claude/skills/newproject/templates/<filename>`
+2. Substitute `{{PROJECT_ROOT}}` with the absolute project root path, and `{{PROJECT_NAME}}` with the normalized project name
+3. Write to `[project-name]/code/<filename>`
+
+### Step 5 — Create `CLAUDE.md` from template
+
+Read `~/.claude/skills/newproject/templates/project_CLAUDE.md`.
+Write it to `[project-name]/CLAUDE.md` as-is.
+Update the Project Overview section heading to reference the project name.
+
+### Step 5b — Create index stubs in `agent_memory/`
+
+CLAUDE.md points to these files rather than embedding their content. Create each as an empty stub so Claude and the user have a known location to append to.
+
+**`agent_memory/key_decisions.md`**:
+```markdown
+# Key Decisions — [project-name]
+
+Running log of methodological decisions. Append new rows; do not edit prior entries.
+
+| Date | Decision | Rationale |
+|------|----------|-----------|
+```
+
+**`agent_memory/dropped_analyses.md`**:
+```markdown
+# Dropped Analyses — [project-name]
+
+Analyses tried and abandoned — so they don't get re-suggested.
+
+- **[Analysis name]** ([YYYY-MM-DD]): [Why dropped]
+```
+
+**`agent_memory/codebook.md`**:
+```markdown
+# Codebook — [project-name]
+
+Definitions of key variables, especially constructed ones.
+
+| Variable | Definition | Source |
+|----------|------------|--------|
+```
+
+**`agent_memory/sample_restrictions.md`**:
+```markdown
+# Sample Restrictions — [project-name]
+
+Who's in the sample and why. Document exclusions with counts.
+
+- [Restriction]: [Rationale] ([N excluded])
+```
+
+### Step 6 — Generate `README.md`
+
+Include:
+- Project title and one-line description placeholder
+- Visual directory tree (fenced code block matching structure above)
+- Explanation of each folder's purpose
+- Note that `references/raw/` stores paper PDFs for `/split-pdf`, `/read-pdf`, `/bib-update`, and `/wiki-update`; `references/wiki/` and `references/references.bib` are created lazily by those skills
+- Note that `CLAUDE.md` is from a permanent template — edit per-project
+- Note that `code/config.*` files define all paths — update `root` if project moves
+- Note that `progress_logs/` maintains continuity across Claude sessions
+- Placeholder sections: Overview, Collaborators, Status, Key Files
+
+### Step 7 — Create initial progress log
+
+Write `progress_logs/[YYYY-MM-DD]_setup.md`:
+- Creation date
+- Checklist of standard next steps (add data sources, fill in CLAUDE.md, etc.)
+
+### Step 8 — Report success
+
+Show the created structure with `ls -R [project-name] | head -60`.
+Remind the user to:
+- Fill in the Project Overview in `CLAUDE.md`
+- Update `code/config.*` files if the project root ever moves
+- Add Python packages to `code/requirements.txt` as needed
diff --git a/.claude/skills/newproject/templates/config.R b/.claude/skills/newproject/templates/config.R
new file mode 100644
index 0000000..8c5209a
--- /dev/null
+++ b/.claude/skills/newproject/templates/config.R
@@ -0,0 +1,15 @@
+# config.R — project paths
+# Source at the top of every R script: source("[relative path to]/code/config.R")
+
+root <- "{{PROJECT_ROOT}}"
+
+data_raw    <- file.path(root, "data", "raw")
+data_clean  <- file.path(root, "data", "clean")
+
+output_figures <- file.path(root, "output", "figures")
+output_tables  <- file.path(root, "output", "tables")
+output_logs    <- file.path(root, "output", "logs")
+
+code_download <- file.path(root, "code", "download")
+code_data     <- file.path(root, "code", "data")
+code_analysis <- file.path(root, "code", "analysis")
diff --git a/.claude/skills/newproject/templates/config.do b/.claude/skills/newproject/templates/config.do
new file mode 100644
index 0000000..bcf3fdb
--- /dev/null
+++ b/.claude/skills/newproject/templates/config.do
@@ -0,0 +1,26 @@
+* config.do — project globals
+* Include at the top of every do-file: include "[relative path to]/code/config.do"
+
+set more off
+set linesize 120
+
+* ── Project root ────────────────────────────────────────────────────────────
+global root "{{PROJECT_ROOT}}"
+
+* ── Data ────────────────────────────────────────────────────────────────────
+global raw    "$root/data/raw"
+global clean  "$root/data/clean"
+
+* ── Code ────────────────────────────────────────────────────────────────────
+global download  "$root/code/download"
+global code_data "$root/code/data"
+global analysis  "$root/code/analysis"
+
+* ── Output ──────────────────────────────────────────────────────────────────
+global tables  "$root/output/tables"
+global figures "$root/output/figures"
+global logs    "$root/output/logs"
+
+* ── Log setup (uncomment in a do-file after defining global SCRIPTNAME) ─────
+* cap log close
+* log using "$logs/${SCRIPTNAME}.log", replace text
diff --git a/.claude/skills/newproject/templates/config.py b/.claude/skills/newproject/templates/config.py
new file mode 100644
index 0000000..a85fbb9
--- /dev/null
+++ b/.claude/skills/newproject/templates/config.py
@@ -0,0 +1,14 @@
+from pathlib import Path
+
+ROOT = Path("{{PROJECT_ROOT}}")
+
+DATA_RAW    = ROOT / "data" / "raw"
+DATA_CLEAN  = ROOT / "data" / "clean"
+
+OUTPUT_FIGURES = ROOT / "output" / "figures"
+OUTPUT_TABLES  = ROOT / "output" / "tables"
+OUTPUT_LOGS    = ROOT / "output" / "logs"
+
+CODE_DOWNLOAD = ROOT / "code" / "download"
+CODE_DATA     = ROOT / "code" / "data"
+CODE_ANALYSIS = ROOT / "code" / "analysis"
diff --git a/.claude/skills/newproject/templates/project_CLAUDE.md b/.claude/skills/newproject/templates/project_CLAUDE.md
new file mode 100644
index 0000000..986cd4f
--- /dev/null
+++ b/.claude/skills/newproject/templates/project_CLAUDE.md
@@ -0,0 +1,84 @@
+# CLAUDE.md Template for Research Projects
+
+> Copy this file to your project root and fill in the sections below.
+
+---
+
+## Communication Guidelines
+
+- Refer to the user as **[NAME]**
+- Collaborators: [List collaborators and their roles]
+
+---
+
+## Estimation Philosophy
+
+**Design before results.** During estimation and analysis:
+
+- Do NOT express concern or excitement about point estimates
+- Do NOT interpret results as "good" or "bad" until the design is intentional
+- Focus entirely on whether the specification is correct
+- Results are meaningless until we're confident the "experiment" is designed on purpose
+- Objectivity means being attached to getting the design right, not to any particular finding
+
+---
+
+## Project Overview
+
+[2-3 paragraph description of your project]
+
+### Research Question
+
+[What are you trying to answer?]
+
+### Data Sources
+
+[What data are you using? Time periods? Geographic coverage?]
+
+### Identification Strategy
+
+[How are you identifying causal effects? What's the source of variation?]
+
+---
+
+## Key Files
+
+- **Main analysis**: `path/to/script.R` or `script.py`
+- **Data cleaning**: `path/to/cleaning.R`
+- **Paper draft**: `path/to/paper.tex`
+- **Presentation**: `path/to/slides.tex`
+- **Reference PDFs**: `references/raw/` stores papers and other PDFs for `/split-pdf`, `/read-pdf`, `/bib-update`, and `/wiki-update`.
+- **Reference wiki and BibTeX**: `/wiki-update` lazy-creates `references/wiki/`; `/bib-update` lazy-creates `references/references.bib`. Do not create or maintain those files here unless the reference skills have initialized them.
+
+### Conventions
+
+- **`data/raw/` is immutable** — never edit or delete source files. All cleaning and transformations happen in `code/` with outputs to `data/clean/`.
+- Include random seeds for any stochastic analyses.
+
+### Analysis output conventions
+
+Unless the user explicitly specifies otherwise:
+
+- **Tables** (from any analysis script) → `output/tables/` as standalone `.tex` fragments. No preamble, no `\documentclass` — just the `\begin{tabular}…\end{tabular}` or `\begin{table}…\end{table}` block, suitable for `\input{}`.
+- **Figures** (from any analysis script) → `output/figures/` as `.pdf` (prefer vector over raster). Use a descriptive base name; specification variants as suffixes.
+- **Compiled LaTeX documents** (a standalone `.tex` that `\input`s multiple tables and `\includegraphics`es multiple figures — e.g., `summary_stats.tex`, `conceptual_framework.tex`) → `documents/<topic>/<topic>.tex`, where `<topic>` is a short subject-derived folder name. Build artifacts (`.aux`, `.log`, `.synctex.gz`, compiled `.pdf`) live in the same subfolder. Reference tables/figures with relative paths like `\input{../../output/tables/tab_foo.tex}` and set `\graphicspath{{../../output/figures/}}`.
+- Never create a new top-level folder for LaTeX output (no `code/analysis/latex/`, no ad-hoc `figs/`). If `output/{tables,figures}` and `documents/<topic>/` don't fit a use case, pause and ask.
+
+---
+
+## Indexes (detail lives in linked files)
+
+Look here first when you need project history, codebook entries, or prior decisions. Do not duplicate this content into CLAUDE.md — update the linked file instead.
+
+- **Methodological decisions**: `agent_memory/key_decisions.md`
+- **Dropped analyses**: `agent_memory/dropped_analyses.md`
+- **Codebook (variable definitions)**: `agent_memory/codebook.md`
+- **Sample restrictions**: `agent_memory/sample_restrictions.md`
+- **Current status / next steps**: latest entry in `progress_logs/`
+- **Referee 2 correspondence**: `correspondence/referee2/` (see `/referee2` skill)
+
+---
+
+## Notes for Claude
+
+[Any specific instructions, quirks, or reminders for this project]
diff --git a/.claude/skills/newproject/templates/requirements.txt b/.claude/skills/newproject/templates/requirements.txt
new file mode 100644
index 0000000..09e918e
--- /dev/null
+++ b/.claude/skills/newproject/templates/requirements.txt
@@ -0,0 +1,2 @@
+# Python dependencies for {{PROJECT_NAME}}
+# Install with: pip install -r code/requirements.txt
diff --git a/skills/newproject/README.md b/skills/newproject/README.md
index e53bcf8..3edce0c 100644
--- a/skills/newproject/README.md
+++ b/skills/newproject/README.md
@@ -12,22 +12,39 @@ Invoked at the start of every new research project to create a consistent direct
 
 ```
 my-project-name/
-├── CLAUDE.md              # Research rules & estimation philosophy (permanent)
-├── README.md              # This file — project-specific notes
+├── CLAUDE.md                          # Research rules & estimation philosophy (from template)
+├── README.md                          # This file — project-specific notes
 ├── code/
-│   ├── R/                 # R scripts
-│   ├── python/            # Python scripts
-│   └── stata/             # Stata do-files
+│   ├── config.do                      # Canonical Stata globals & paths
+│   ├── config.py                      # Canonical Python paths (pathlib)
+│   ├── config.R                       # Canonical R paths
+│   ├── requirements.txt               # Python dependencies
+│   ├── download/                      # Scripts for pulling raw data
+│   ├── data/
+│   │   └── validation/                # Data validation scripts
+│   └── analysis/
+│       ├── stata/
+│       ├── R/
+│       └── python/
 ├── data/
-│   ├── raw/               # Original source data (never modify these)
-│   └── clean/             # Cleaned and merged datasets
+│   ├── raw/                           # Original source data (never modify)
+│   └── clean/                         # Cleaned/merged datasets
 ├── output/
-│   ├── tables/            # Generated tables (LaTeX, CSV)
-│   └── figures/           # Generated figures (PDF, PNG)
-├── documents/             # Outside papers and PDFs (split with /split-pdf)
-├── decks/                 # Beamer presentations (rhetoric of decks philosophy)
-├── notes/                 # Scratch notes, ideas, miscellaneous
-└── progress_logs/         # Session logs for continuity across Claude conversations
+│   ├── figures/
+│   ├── tables/
+│   └── logs/                          # Log files from all scripts
+├── documents/                         # Outside PDFs, papers
+├── references/
+│   └── raw/                           # Paper PDFs for reference-ingest skills
+├── decks/                             # Beamer presentations
+├── notes/                             # Personal scratch notes
+├── agent_memory/                      # Shared Claude/Codex reference files
+│   ├── key_decisions.md               # Methodological choices and rationale
+│   ├── dropped_analyses.md            # Paths abandoned and why
+│   ├── codebook.md                    # Variable definitions
+│   └── sample_restrictions.md         # Who's in/out of the sample
+├── correspondence/                    # Letters, emails, audit reports
+└── progress_logs/                     # Session logs for continuity across Claude conversations
 ```
 
 ## Philosophy
@@ -36,7 +53,11 @@ my-project-name/
 
 Every project has both a `CLAUDE.md` and a `README.md`. They serve different roles:
 
-- **`CLAUDE.md`** is copied from a permanent template at `~/mixtapetools/claude/CLAUDE.md`. It contains research rules that apply across all sessions: estimation philosophy ("design before results"), coding conventions, collaborator information, and key methodological decisions. It is the *institutional memory* of the project — the file that keeps Claude aligned across conversations.
+- **`CLAUDE.md`** is copied from a permanent template at `~/.claude/skills/newproject/templates/project_CLAUDE.md`. It contains research rules that apply across all sessions: estimation philosophy ("design before results"), coding conventions, collaborator information, and an index pointing to the key methodological decisions recorded in `agent_memory/`. It is the *institutional memory* of the project — the file that keeps Claude aligned across conversations.
+
+- **`agent_memory/`** holds the shared reference files that Claude and the user should version over time: decisions, dropped analyses, codebook entries, and sample restrictions.
+
+- **`notes/`** is for private scratch notes and anything you do not want versioned.
 
 - **`README.md`** is auto-generated by this command and then edited as the project evolves. It is project-specific: what the research question is, who's involved, current status, and how files are organized. It's the file a human reads to understand the project.
 
@@ -48,6 +69,8 @@ The `progress_logs/` directory solves a real problem: Claude Code sessions don't
 
 - `documents/` holds outside PDFs — papers you're reading, referee reports, data documentation. These are candidates for the `/split-pdf` skill, which splits large PDFs into safe 4-page chunks for reading.
 
+- `references/raw/` holds PDFs of papers and other sources you want to process with `/split-pdf`, `/read-pdf`, `/bib-update`, and/or `/wiki-update`. `/wiki-update` creates `references/wiki/` when needed; `/bib-update` creates `references/references.bib` when needed.
+
 - `decks/` holds Beamer presentations built following the rhetoric of decks philosophy (`~/mixtapetools/presentations/rhetoric_of_decks.md`). Titles are assertions, one idea per slide, beauty is function.
 
 ### Data Discipline
@@ -57,6 +80,4 @@ The `progress_logs/` directory solves a real problem: Claude Code sessions don't
 
 ## Installation
 
-This command lives at `~/mixtapetools/.claude/commands/newproject.md` and is also available as a skill at `~/mixtapetools/.claude/skills/newproject/SKILL.md`. Both are identical in behavior.
-
-To make it available in any project, ensure `~/mixtapetools` is in your Claude Code skill search path, or symlink the `.claude` directory.
+This skill lives at `~/.claude/skills/newproject/SKILL.md`. Invoke with `/newproject`.

From 3a5ba807b428f6aaa59b86d14adaa7e09fd26f55 Mon Sep 17 00:00:00 2001
From: Noah Miller <noah.miller.012@gmail.com>
Date: Fri, 15 May 2026 16:44:39 -0400
Subject: [PATCH 2/2] Extract newproject scaffold script

---
 .claude/skills/newproject/SKILL.md            |  88 +--------
 .claude/skills/newproject/scripts/scaffold.sh | 168 ++++++++++++++++++
 2 files changed, 175 insertions(+), 81 deletions(-)
 create mode 100755 .claude/skills/newproject/scripts/scaffold.sh

diff --git a/.claude/skills/newproject/SKILL.md b/.claude/skills/newproject/SKILL.md
index 4c9d31f..8d4c805 100644
--- a/.claude/skills/newproject/SKILL.md
+++ b/.claude/skills/newproject/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: newproject
 description: Scaffold a new research project with standard directory structure, CLAUDE.md template, and language-agnostic config files (Stata/Python/R). Use this at the start of every new project to ensure consistent organization.
-allowed-tools: Bash(mkdir*), Bash(cp*), Bash(ls*), Write, Read
+allowed-tools: Bash(ls*), Bash(~/.claude/skills/newproject/scripts/scaffold.sh:*), Read
 argument-hint: [project-name]
 ---
 
@@ -56,97 +56,23 @@ Templates use `{{PROJECT_ROOT}}` and `{{PROJECT_NAME}}` placeholders that this s
 
 ## Execution
 
-### Step 1 — Get project name
+### Step 1 — Get project name and location
 If no argument was provided, ask:
 > "What should I name this project? (will be used as the folder name and in templates)"
 
-Normalize: lowercase, spaces → hyphens.
-
-### Step 2 — Determine location
 Default to current working directory. If unclear, confirm with the user.
 
-Set `PROJECT_ROOT` = `[location]/[project-name]` as an absolute path.
+### Step 2 — Run scaffold script
 
-### Step 3 — Create all directories
+Run:
 
 ```bash
-mkdir -p [project-name]/{code/{download,data/validation,analysis/{stata,R,python}},data/{raw,clean},output/{figures,tables,logs},documents,references/raw,decks,notes,agent_memory,correspondence,progress_logs}
-```
-
-### Step 4 — Render config files from templates
-
-For each of `config.do`, `config.py`, `config.R`, `requirements.txt`:
-1. Read `~/.claude/skills/newproject/templates/<filename>`
-2. Substitute `{{PROJECT_ROOT}}` with the absolute project root path, and `{{PROJECT_NAME}}` with the normalized project name
-3. Write to `[project-name]/code/<filename>`
-
-### Step 5 — Create `CLAUDE.md` from template
-
-Read `~/.claude/skills/newproject/templates/project_CLAUDE.md`.
-Write it to `[project-name]/CLAUDE.md` as-is.
-Update the Project Overview section heading to reference the project name.
-
-### Step 5b — Create index stubs in `agent_memory/`
-
-CLAUDE.md points to these files rather than embedding their content. Create each as an empty stub so Claude and the user have a known location to append to.
-
-**`agent_memory/key_decisions.md`**:
-```markdown
-# Key Decisions — [project-name]
-
-Running log of methodological decisions. Append new rows; do not edit prior entries.
-
-| Date | Decision | Rationale |
-|------|----------|-----------|
-```
-
-**`agent_memory/dropped_analyses.md`**:
-```markdown
-# Dropped Analyses — [project-name]
-
-Analyses tried and abandoned — so they don't get re-suggested.
-
-- **[Analysis name]** ([YYYY-MM-DD]): [Why dropped]
+~/.claude/skills/newproject/scripts/scaffold.sh "[project-name]" "[parent-directory]"
 ```
 
-**`agent_memory/codebook.md`**:
-```markdown
-# Codebook — [project-name]
-
-Definitions of key variables, especially constructed ones.
-
-| Variable | Definition | Source |
-|----------|------------|--------|
-```
-
-**`agent_memory/sample_restrictions.md`**:
-```markdown
-# Sample Restrictions — [project-name]
-
-Who's in the sample and why. Document exclusions with counts.
-
-- [Restriction]: [Rationale] ([N excluded])
-```
-
-### Step 6 — Generate `README.md`
-
-Include:
-- Project title and one-line description placeholder
-- Visual directory tree (fenced code block matching structure above)
-- Explanation of each folder's purpose
-- Note that `references/raw/` stores paper PDFs for `/split-pdf`, `/read-pdf`, `/bib-update`, and `/wiki-update`; `references/wiki/` and `references/references.bib` are created lazily by those skills
-- Note that `CLAUDE.md` is from a permanent template — edit per-project
-- Note that `code/config.*` files define all paths — update `root` if project moves
-- Note that `progress_logs/` maintains continuity across Claude sessions
-- Placeholder sections: Overview, Collaborators, Status, Key Files
-
-### Step 7 — Create initial progress log
-
-Write `progress_logs/[YYYY-MM-DD]_setup.md`:
-- Creation date
-- Checklist of standard next steps (add data sources, fill in CLAUDE.md, etc.)
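+The script normalizes the project name (lowercase; whitespace, underscores, and any other characters outside `a-z0-9.-` become hyphens), creates the directory tree, renders the templates, writes the agent-memory stubs, generates `README.md`, and creates the initial setup log. It exits non-zero if the target directory already exists and is not empty; on success it prints the absolute project root.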
 
-### Step 8 — Report success
+### Step 3 — Report success
 
 Show the created structure with `ls -R [project-name] | head -60`.
 Remind the user to:
diff --git a/.claude/skills/newproject/scripts/scaffold.sh b/.claude/skills/newproject/scripts/scaffold.sh
new file mode 100755
index 0000000..4c8d1a9
--- /dev/null
+++ b/.claude/skills/newproject/scripts/scaffold.sh
@@ -0,0 +1,168 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
+  echo "Usage: scaffold.sh <project-name> [parent-directory]" >&2
+  exit 2
+fi
+
+raw_name="$1"
+parent_dir="${2:-$PWD}"
+script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+skill_dir="$(cd "$script_dir/.." && pwd)"
+template_dir="$skill_dir/templates"
+
+project_name="$(
+  printf '%s' "$raw_name" \
+    | tr '[:upper:]' '[:lower:]' \
+    | sed -E 's/[[:space:]_]+/-/g; s/[^a-z0-9.-]+/-/g; s/-+/-/g; s/^-+//; s/-+$//'
+)"
+
+if [ -z "$project_name" ]; then
+  echo "Project name normalizes to empty: $raw_name" >&2
+  exit 1
+fi
+
+mkdir -p "$parent_dir"
+parent_abs="$(cd "$parent_dir" && pwd)"
+project_root="$parent_abs/$project_name"
+
+if [ -e "$project_root" ] && [ "$(find "$project_root" -mindepth 1 -maxdepth 1 2>/dev/null | head -n 1)" ]; then
+  echo "Project directory already exists and is not empty: $project_root" >&2
+  exit 1
+fi
+
+mkdir -p "$project_root"/{code/{download,data/validation,analysis/{stata,R,python}},data/{raw,clean},output/{figures,tables,logs},documents,references/raw,decks,notes,agent_memory,correspondence,progress_logs}
+
+render_template() {
+  local source="$1"
+  local target="$2"
+
+  sed \
+    -e "s|{{PROJECT_ROOT}}|$(printf '%s' "$project_root" | sed 's/[&|\\]/\\&/g')|g" \
+    -e "s|{{PROJECT_NAME}}|$project_name|g" \
+    "$source" > "$target"
+}
+
+for filename in config.do config.py config.R requirements.txt; do
+  render_template "$template_dir/$filename" "$project_root/code/$filename"
+done
+
+render_template "$template_dir/project_CLAUDE.md" "$project_root/CLAUDE.md"
+sed -i.bak "s/^## Project Overview$/## Project Overview — $project_name/" "$project_root/CLAUDE.md"
+rm -f "$project_root/CLAUDE.md.bak"
+
+cat > "$project_root/agent_memory/key_decisions.md" <<EOF
+# Key Decisions — $project_name
+
+Running log of methodological decisions. Append new rows; do not edit prior entries.
+
+| Date | Decision | Rationale |
+|------|----------|-----------|
+EOF
+
+cat > "$project_root/agent_memory/dropped_analyses.md" <<EOF
+# Dropped Analyses — $project_name
+
+Analyses tried and abandoned — so they don't get re-suggested.
+
+- **[Analysis name]** ([YYYY-MM-DD]): [Why dropped]
+EOF
+
+cat > "$project_root/agent_memory/codebook.md" <<EOF
+# Codebook — $project_name
+
+Definitions of key variables, especially constructed ones.
+
+| Variable | Definition | Source |
+|----------|------------|--------|
+EOF
+
+cat > "$project_root/agent_memory/sample_restrictions.md" <<EOF
+# Sample Restrictions — $project_name
+
+Who's in the sample and why. Document exclusions with counts.
+
+- [Restriction]: [Rationale] ([N excluded])
+EOF
+
+cat > "$project_root/README.md" <<EOF
+# $project_name
+
+One-line project description goes here.
+
+## Overview
+
+Fill in the project overview, research question, data sources, and identification strategy in \`CLAUDE.md\`.
+
+## Directory Structure
+
+\`\`\`
+$project_name/
+├── CLAUDE.md
+├── README.md
+├── code/
+│   ├── config.do
+│   ├── config.py
+│   ├── config.R
+│   ├── requirements.txt
+│   ├── download/
+│   ├── data/validation/
+│   └── analysis/{stata,R,python}/
+├── data/{raw,clean}/
+├── output/{figures,tables,logs}/
+├── documents/
+├── references/raw/
+├── decks/
+├── notes/
+├── agent_memory/
+├── correspondence/
+└── progress_logs/
+\`\`\`
+
+## Folder Notes
+
+- \`data/raw/\` stores immutable source data; cleaning outputs go to \`data/clean/\`.
+- \`code/config.*\` files define canonical paths. Update \`root\` if the project moves.
+- \`output/\` stores generated figures, tables, and logs.
+- \`references/raw/\` stores PDFs for \`/split-pdf\`, \`/read-pdf\`, \`/bib-update\`, and \`/wiki-update\`.
+- \`references/wiki/\` and \`references/references.bib\` are created lazily by reference skills.
+- \`agent_memory/\` stores durable decisions, codebook notes, dropped analyses, and sample restrictions.
+- \`progress_logs/\` maintains continuity across Claude/Codex sessions.
+
+## Collaborators
+
+- [Name]: [Role]
+
+## Status
+
+- [Current status]
+
+## Key Files
+
+- [Add important files here as the project develops]
+EOF
+
+today="$(date +%F)"
+cat > "$project_root/progress_logs/${today}_setup.md" <<EOF
+# Setup Log — $project_name
+
+**Date:** $today
+
+## Created
+
+- Standard research project directory structure
+- Root \`CLAUDE.md\` from template
+- Language config files in \`code/\`
+- Agent memory stubs
+- Project README
+
+## Next Steps
+
+- Fill in the Project Overview in \`CLAUDE.md\`
+- Add data sources or acquisition scripts
+- Update \`code/requirements.txt\` as Python dependencies become known
+- Initialize git separately when ready (\`git init\`, \`.gitignore\`, etc.)
+EOF
+
+printf '%s\n' "$project_root"