32 changes: 32 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,32 @@
name: CI

# Builds the site from the committed mirror/ and audits link health.
# This is the required status check for pull requests into main.
on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

permissions:
  contents: read

jobs:
  build-and-audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up uv
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true

      - name: Install dependencies
        run: make install

      - name: Build dist/
        run: make dist

      - name: Audit link health
        run: make audit
55 changes: 55 additions & 0 deletions .github/workflows/deploy-pages.yml
@@ -0,0 +1,55 @@
name: Deploy to GitHub Pages

# On every push to main (i.e. after a PR merges), build the site from the
# committed mirror/ and deploy dist/ to GitHub Pages. No archive.org crawl is
# needed — mirror/ is the committed source of truth.
on:
  push:
    branches: [main]
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

# Allow only one concurrent deployment; don't cancel an in-progress one.
concurrency:
  group: pages
  cancel-in-progress: false

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up uv
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true

      - name: Install dependencies
        run: make install

      - name: Build dist/
        run: make dist

      - name: Configure Pages
        uses: actions/configure-pages@v5

      - name: Upload dist/ artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: dist

  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
45 changes: 45 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,45 @@
name: Release

# Triggered by a vX.Y.Z tag (created by `make release`). Builds the site,
# packages dist/ as a downloadable zip, and publishes a GitHub Release whose
# notes are the annotated tag's changelog.
on:
  push:
    tags: ["v*"]

permissions:
  contents: write

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # need the annotated tag object for its changelog

      - name: Set up uv
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true

      - name: Install dependencies
        run: make install

      - name: Build dist/
        run: make dist

      - name: Package dist/ as a zip
        run: |
          ( cd dist && zip -qr "../annotated-gd-lyrics-${GITHUB_REF_NAME}.zip" . )

      - name: Publish GitHub Release
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          NOTES=$(git for-each-ref "refs/tags/${GITHUB_REF_NAME}" --format='%(contents)')
          [ -z "$NOTES" ] && NOTES="Release ${GITHUB_REF_NAME}"
          gh release create "${GITHUB_REF_NAME}" \
            --title "${GITHUB_REF_NAME}" \
            --notes "$NOTES" \
            "annotated-gd-lyrics-${GITHUB_REF_NAME}.zip"
82 changes: 82 additions & 0 deletions AGENTS.md
@@ -0,0 +1,82 @@
# AGENTS.md — conventions for this repo

Conventions for anyone (human or AI agent) working in this repository. Keep
changes consistent with what's here; this file is the canonical reference and
is summarized in the README and CONTRIBUTING.md.

## What this project is

A faithful, offline preservation of David Dodd's 1990s *Annotated Grateful Dead
Lyrics*, recovered from the Internet Archive. The flow:

```
mirror/ ──build_site.py──► dist/ ──CI──► GitHub Pages
(committed source of truth) (generated)
```

## The golden rule: never hand-edit `mirror/` or `dist/`

- **`mirror/`** is the byte-for-byte archived source of truth. Treat it as
  read-only. It is only ever (re)written by `scripts/mirror.py`.
- **`dist/`** is generated by `make dist` and is gitignored. Never edit it.
- **All content/link fixes are expressed as code** in `scripts/build_site.py`,
  so they are repeatable and reviewable. There are dedicated passes:
  - `HTML_FIXES`: exact literal repairs for specific malformed source tags.
  - `REDIRECTS`: dead internal links whose real target lives under another name.
  - `ALT_LINKS`: known-dead external links → the generated `link-gone.html`.
  - anchor repair: close-match fixing of broken `#fragments`.

  When you fix a link, add to the right pass; don't patch output by hand.
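
As an illustration of what "close-match fixing" of a broken `#fragment` could look like, here is a minimal sketch built on the standard library's `difflib`. It is not taken from `scripts/build_site.py`: the function name `repair_fragment`, the `cutoff` value, and the sample anchor names are all assumptions made for the example.

```python
import difflib


def repair_fragment(fragment, known_anchors, cutoff=0.8):
    """Return the closest existing anchor for a broken #fragment, or None.

    Hypothetical helper: `known_anchors` is assumed to be the list of anchor
    names that actually exist on the target page.
    """
    if fragment in known_anchors:
        return fragment  # already valid, nothing to repair
    # get_close_matches returns the best candidates at or above the cutoff.
    matches = difflib.get_close_matches(fragment, known_anchors, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    anchors = ["garcia", "hunter"]  # made-up anchor names
    print(repair_fragment("garica", anchors))
    print(repair_fragment("zzz", anchors))
```

Returning `None` when nothing is close enough leaves the original link untouched, which fits the project's rule of not inventing destinations.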

## Quality gate

Every change must keep the link audit green:

```bash
make dist && make audit # audit exits non-zero on real broken links
```

`make audit` is the required CI check on pull requests. It fails only on *real*
broken internal links or case-mismatches; defects already present in the 1990s
source (malformed fragments, never-created anchors) are reported but tolerated.
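
The pass/fail policy above can be sketched as a small function that maps audit findings to an exit code. This is only an illustration under assumptions: the finding kinds (`broken_internal_link`, `case_mismatch`, `malformed_fragment`, `missing_anchor`) and the `(kind, location)` tuple shape are hypothetical and are not taken from `scripts/audit_links.py`.

```python
from collections import Counter

# Hypothetical labels for the defect kinds that should fail the build.
FATAL_KINDS = {"broken_internal_link", "case_mismatch"}


def audit_exit_code(findings):
    """Report every finding kind, but return 1 only if a fatal kind is present.

    `findings` is assumed to be an iterable of (kind, location) tuples.
    """
    counts = Counter(kind for kind, _location in findings)
    for kind, n in sorted(counts.items()):
        label = "FAIL" if kind in FATAL_KINDS else "warn"
        print(f"{label}: {n} x {kind}")
    return 1 if any(kind in FATAL_KINDS for kind in counts) else 0


if __name__ == "__main__":
    sample = [("malformed_fragment", "a.html"), ("missing_anchor", "b.html")]
    raise SystemExit(audit_exit_code(sample))
```

Tolerated defects are still printed, so they stay visible in the CI log without blocking a merge.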

## Commit messages — Conventional Commits (this drives releases)

Releases are versioned automatically from commit history, so the prefix matters:

| Prefix | Example | Release effect |
|--------|---------|----------------|
| `feat:` | `feat: add alt link for gdhour` | minor bump |
| `fix:` / `perf:` | `fix: repair broken biblio anchor` | patch bump |
| `feat!:` / `BREAKING CHANGE:` | `feat!: restructure dist layout` | major bump |
| `docs:` `chore:` `refactor:` `test:` `ci:` `build:` | — | no release |

Use the imperative mood. Scope is optional, e.g. `fix(build): …`.
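
To make the table concrete, the following sketch derives the next version from a list of commit messages. It is a Python illustration of the rules in the table, not the logic of `scripts/release.sh`; the names `bump_for` and `next_version` are hypothetical, and treating any `type!:` header as breaking is an assumption based on the Conventional Commits convention.

```python
import re

# type, optional (scope), optional "!", then ": " as in a Conventional Commits header.
_HEADER = re.compile(r"^(?P<type>[a-z]+)(?:\([^)]*\))?(?P<bang>!)?: ")
_RANK = {None: 0, "patch": 1, "minor": 2, "major": 3}


def bump_for(message):
    """Classify one commit message as 'major', 'minor', 'patch', or None."""
    lines = message.splitlines()
    header = lines[0] if lines else ""
    m = _HEADER.match(header)
    if "BREAKING CHANGE:" in message or (m and m.group("bang")):
        return "major"
    if m is None:
        return None
    if m.group("type") == "feat":
        return "minor"
    if m.group("type") in ("fix", "perf"):
        return "patch"
    return None  # docs, chore, refactor, test, ci, build: no release


def next_version(current, messages):
    """Return the next 'vX.Y.Z' tag, or None when no commit warrants a release."""
    bump = max((bump_for(msg) for msg in messages), key=_RANK.__getitem__, default=None)
    if bump is None:
        return None
    major, minor, patch = (int(part) for part in current.lstrip("v").split("."))
    if bump == "major":
        return f"v{major + 1}.0.0"
    if bump == "minor":
        return f"v{major}.{minor + 1}.0"
    return f"v{major}.{minor}.{patch + 1}"
```

The highest-ranked bump among all commits since the last tag wins, so a single `feat:` among many `fix:` commits still produces a minor release.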

## Pull request flow

`main` is protected — **no direct pushes**. All changes go through a PR:

1. Branch off `main`.
2. Make the change (in `scripts/`, docs, or workflows — never in `mirror/`/`dist/`).
3. `make dist && make audit` locally.
4. Open a PR; CI (`make install`/`dist`/`audit`) must pass.
5. Merge → `deploy-pages` publishes the updated site to GitHub Pages.

## Releases

- `make release-dryrun` previews the next version + changelog.
- `make release` computes the next semver from Conventional Commits since the
  last `v*` tag, then creates and pushes an annotated `vX.Y.Z` tag (no commit to
  `main`). The tag triggers `.github/workflows/release.yml`, which builds the
  site, attaches a zip of `dist/` (`annotated-gd-lyrics-vX.Y.Z.zip`), and
  publishes a GitHub Release.

## Common commands

```bash
make install # uv sync
make mirror # (re)download the raw archive — rarely needed; ~30-45 min
make dist # build dist/ from mirror/
make audit # link-health audit of dist/
make serve-dist # serve dist/ at http://localhost:8000
make all # mirror (if missing) -> dist -> audit -> serve
```
10 changes: 10 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,10 @@
# CLAUDE.md

This project's conventions for AI agents live in **[AGENTS.md](AGENTS.md)** —
read it first. Key points:

- Never hand-edit `mirror/` (committed source of truth) or `dist/` (generated).
  Express all link/content fixes as code in `scripts/build_site.py` passes.
- Keep the audit green: `make dist && make audit`.
- Use Conventional Commits (`feat:`/`fix:`/`feat!:`) — they drive releases.
- `main` is PR-protected; all changes go through a pull request.
71 changes: 71 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,71 @@
# Contributing

Thanks for helping preserve *The Annotated Grateful Dead Lyrics*! This is a
small, focused project: a faithful offline mirror of the 1990s site, with link
fixes applied at build time. Contributions are welcome — especially repairing
links and adding good substitutes for dead external ones.

See **[AGENTS.md](AGENTS.md)** for the full conventions; this is the short,
human-friendly version.

## Setup

```bash
make install # installs deps with uv
make dist # build the browsable site into dist/
make serve-dist # view it at http://localhost:8000
make audit # check link health
```

You do **not** need to run `make mirror` (the ~30–45 min archive crawl) — the
`mirror/` source is committed.

## The one rule

**Never edit `mirror/` or `dist/` by hand.**

- `mirror/` is the byte-for-byte archived source of truth.
- `dist/` is generated and gitignored.

All fixes are expressed as code in `scripts/build_site.py`, so they're
repeatable and reviewable.

## Common contributions

**Fix a broken internal link** (a page that exists under a different name):
add an entry to `REDIRECTS` in `scripts/build_site.py`.

**Repair a malformed link in the source** (missing quote, typo'd tag): add an
exact literal `(bad, good)` entry to `HTML_FIXES`, keyed by the file.

**Offer an alternative for a dead external link**: add an entry to `ALT_LINKS`
— it renders on the generated `link-gone.html` page. Please link to a real,
still-working resource (e.g. a Wikipedia article), not a guess.

After any change: `make dist && make audit` must pass (the audit is the CI gate).
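
For orientation, here is a rough sketch of how `HTML_FIXES` and `REDIRECTS` entries might be shaped and applied. The real data structures live in `scripts/build_site.py` and may differ; the file names, the sample entries, and the helpers `apply_html_fixes` and `resolve_link` are invented for this example.

```python
# Hypothetical shapes only; check scripts/build_site.py for the real ones.
HTML_FIXES = {
    # file name -> list of exact literal (bad, good) pairs
    "example.html": [('<a href="notes.html>', '<a href="notes.html">')],
}
REDIRECTS = {
    # dead internal link -> the name the page really lives under
    "oldname.html": "newname.html",
}


def apply_html_fixes(filename, html, fixes=HTML_FIXES):
    """Apply each literal (bad, good) repair registered for this file."""
    for bad, good in fixes.get(filename, []):
        html = html.replace(bad, good)
    return html


def resolve_link(href, redirects=REDIRECTS):
    """Map a dead internal link to its real target; leave other links alone."""
    return redirects.get(href, href)


if __name__ == "__main__":
    print(apply_html_fixes("example.html", '<p><a href="notes.html>Notes</a></p>'))
    print(resolve_link("oldname.html"))
```

Keeping the repairs as exact literal pairs keyed by file means a fix can only touch the one malformed tag it was written for.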

## Commit messages

We use [Conventional Commits](https://www.conventionalcommits.org/) — they
drive automatic versioning:

- `feat: …` → minor release (e.g. `feat: add alt link for the Grateful Dead Hour`)
- `fix: …` / `perf: …` → patch release (e.g. `fix: repair broken biblio anchor`)
- `feat!: …` or a `BREAKING CHANGE:` footer → major release
- `docs:`, `chore:`, `refactor:`, `ci:`, `build:`, `test:` → no release

## Pull requests

`main` is protected, so:

1. Branch off `main`.
2. Make your change and run `make dist && make audit`.
3. Open a PR. CI must pass before it can merge.
4. On merge, the site redeploys to GitHub Pages automatically.

## A note on faithfulness

This is a *preservation*. We fix links so the site is navigable, but we don't
rewrite the original authors' content, styling, or period HTML. When a link is
genuinely dead with no honest substitute, we leave it rather than invent a
destination.
8 changes: 7 additions & 1 deletion Makefile
@@ -1,4 +1,4 @@
.PHONY: install mirror mirror-retry dist audit serve-dist all clean help
.PHONY: install mirror mirror-retry dist audit serve-dist all release release-dryrun clean help

help: ## Show this help message
	@echo 'Usage: make [target]'
@@ -48,6 +48,12 @@ all: ## Full pipeline: mirror (only if missing) -> build -> audit -> serve
	@$(MAKE) audit
	@$(MAKE) serve-dist

release: ## Tag a semver release from conventional commits (pushes tag, triggers CI release)
	@./scripts/release.sh $(VERSION)

release-dryrun: ## Preview the next release version + changelog without tagging
	@./scripts/release.sh --dry-run $(VERSION)

clean: ## Remove build artifacts (dist/, logs, caches) — mirror/ is kept
	rm -rf dist/
	rm -f mirror.log
29 changes: 29 additions & 0 deletions README.md
@@ -5,6 +5,8 @@ A self-contained, offline, **faithful preservation** of David Dodd's
(`artsites.ucsc.edu/GDead/agdl/`), recovered from the Internet Archive and
made fully browsable on its own, with the period HTML preserved byte-for-byte.

**Live site: https://ds17f.github.io/annotatedDead/**

> The original site is frozen/offline. This project rebuilds it from a single
> archive.org snapshot (timestamp `20230806233010`) and fixes the links so it
> works without the dead live domain.
@@ -183,6 +185,8 @@ scripts/
  mirror.py # the raw crawler (make mirror / mirror-retry)
  build_site.py # the cleanup build (make dist)
  audit_links.py # the link auditor (make audit)
  release.sh # tag a semver release (make release)
.github/workflows/ # CI, Pages deploy, release automation
Makefile # all commands — run `make help`
.mirror_state/ # crawler resume state (gitignored)
```
@@ -191,6 +195,31 @@ Run `make help` for the full target list.

---

## Hosting & releases

The site is hosted on **GitHub Pages** and deploys automatically:

- **Every merge to `main`** triggers two workflows: CI (build + link audit) and
  `.github/workflows/deploy-pages.yml`, which rebuilds the site and publishes it
  to Pages. The deploy does not wait on the CI run; the audit gates changes at
  the pull-request stage. Both builds use the committed `mirror/`, so no
  archive.org crawl happens in CI.
- **Releases are semver-tagged.** `make release` reads
  [Conventional Commits](https://www.conventionalcommits.org/) since the last
  `v*` tag, picks the next version, and pushes a `vX.Y.Z` tag. That triggers
  `release.yml`, which builds the site, attaches a zip of `dist/`
  (`annotated-gd-lyrics-vX.Y.Z.zip`), and publishes a GitHub Release. Preview
  first with `make release-dryrun`.

## Contributing

`main` is protected — all changes go through a pull request with passing CI.
The cardinal rule: **never hand-edit `mirror/` or `dist/`** — express link and
content fixes as code in `scripts/build_site.py` (`HTML_FIXES`, `REDIRECTS`,
`ALT_LINKS`, anchor repair), so they're repeatable and reviewable.

See **[CONTRIBUTING.md](CONTRIBUTING.md)** for the workflow and
**[AGENTS.md](AGENTS.md)** for the full conventions (also what AI agents follow).

---

## Content source & credits

Content is *The Annotated Grateful Dead Lyrics* by **David Dodd**, originally