A read-only Go CLI over a local SQLite mirror of the Cuba City community-history corpus that volunteer admin "Historical Jottings" has built since 2010:
- 516 blog posts from cubacityhistory.blogspot.com
- 63 Facebook posts from facebook.com/cubacityhistory
- 2,469 images and 345 reader comments preserved offline
It does not hit Facebook or Blogger directly — refreshing the underlying
mirror is a separate workflow that lives in the companion research-wiki repo
(thetimechain/random → wiki/topics/cuba-city-wi-history/raw/). This CLI is
the query layer over what's already on disk.
Local-history corpora maintained by single volunteer admins are some of the most fragile bodies of work on the internet: one house move, one platform shutdown, or one ill-timed disk failure, and 15 years of community memory evaporate. This CLI is one piece of a two-piece preservation pact:
- The archive (in the wiki repo) — markdown + images + raw HTML snapshots on a local drive, mirrored to GitHub, with Wayback redundancy.
- The CLI (this repo) — a stable, agent-readable query surface that doesn't depend on the source platforms being alive.
go install github.com/thetimechain/cubacityhistory-pp-cli/cmd/cubacityhistory-pp-cli@latest
Or build from source:
git clone https://github.com/thetimechain/cubacityhistory-pp-cli
cd cubacityhistory-pp-cli
go install ./cmd/cubacityhistory-pp-cli
# One-time: parse the local archive into the SQLite store
cubacityhistory-pp-cli sync --archive ~/github/random/wiki/topics/cuba-city-wi-history/raw
# Verify health
cubacityhistory-pp-cli doctor
# Search across post bodies + titles
cubacityhistory-pp-cli posts search "tornado" --limit 5
# Search across comments
cubacityhistory-pp-cli comments search "school" --limit 3
# Top commenters — the community's living memory holders
cubacityhistory-pp-cli authors --limit 10
# List posts from a specific year
cubacityhistory-pp-cli posts list --year 2024 --source facebook
# Pull one post with its full comment tree
cubacityhistory-pp-cli posts get pfbid031F6NZhBBuzk3NtWATY3UMLQLdMMN59YSC8RPQZTZAhQ4Rhgm9aqjhBWv3UgDj3dCl

| Command | What it does |
|---|---|
| `sync --archive <dir>` | Parse the local archive into SQLite. Idempotent. |
| `doctor` | Verify the store, report counts. |
| `posts list [--year Y] [--source S] [--with-images-only]` | List posts, newest first. |
| `posts get <id-or-permalink>` | Fetch one post + comments. |
| `posts search <query> [--snippet]` | FTS5 search across bodies + titles. |
| `comments list [--post ID] [--author NAME]` | List comments. |
| `comments search <query>` | FTS5 across comment bodies. |
| `authors` | Top commenters by count. |
- `--json`: machine-readable output
- `--csv`: CSV output (where applicable)
- `--compact`: JSON output, high-gravity fields only
- `--select date,title,id`: pick specific fields (JSON only)
- `--limit N`: cap results (0 = unlimited; default 50)
- `--db <path>`: override the SQLite store location
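The `--limit` and `--select` semantics described above can be sketched in a few lines of Go. This is only an illustration, not the CLI's actual code: the `record` type, the function names `applyLimit` and `selectFields`, and the sample field names are all hypothetical, and treating a negative limit as unlimited is an assumption of the sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// record is a hypothetical row shape; the CLI's real JSON fields may differ.
type record map[string]any

// applyLimit mirrors the documented --limit behaviour: 0 means unlimited.
// Negative values are also treated as unlimited (an assumption of this sketch).
func applyLimit(rows []record, limit int) []record {
	if limit <= 0 || limit >= len(rows) {
		return rows
	}
	return rows[:limit]
}

// selectFields keeps only the comma-separated fields named in spec,
// in the spirit of --select date,title,id. Unknown fields are skipped.
func selectFields(rows []record, spec string) []record {
	if strings.TrimSpace(spec) == "" {
		return rows
	}
	fields := strings.Split(spec, ",")
	out := make([]record, 0, len(rows))
	for _, row := range rows {
		picked := record{}
		for _, f := range fields {
			f = strings.TrimSpace(f)
			if v, ok := row[f]; ok {
				picked[f] = v
			}
		}
		out = append(out, picked)
	}
	return out
}

func main() {
	// Made-up sample rows, not data from the archive.
	rows := []record{
		{"id": "p1", "title": "First sample", "date": "2024-01-02", "body": "..."},
		{"id": "p2", "title": "Second sample", "date": "2023-05-06", "body": "..."},
	}
	out, err := json.Marshal(selectFields(applyLimit(rows, 1), "date,title,id"))
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Applying the limit before the projection keeps the work proportional to the rows that are actually printed.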
Environment:
- `CUBACITY_DB`: alternate SQLite store path (default: `$HOME/.cubacityhistory/store.db`)
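One plausible way to resolve the store location from the `--db` flag, the `CUBACITY_DB` variable, and the default path is sketched below. The precedence (flag, then environment, then default) is an assumption and is not confirmed by this README; the function name `resolveDBPath` is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveDBPath picks the SQLite store location.
// Assumed precedence (not confirmed by the README): --db flag value,
// then CUBACITY_DB, then $HOME/.cubacityhistory/store.db.
func resolveDBPath(flagValue, envValue, home string) string {
	if flagValue != "" {
		return flagValue
	}
	if envValue != "" {
		return envValue
	}
	return filepath.Join(home, ".cubacityhistory", "store.db")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		home = "." // fall back to the working directory in this sketch
	}
	// An empty flag value stands in for "--db was not passed".
	fmt.Println(resolveDBPath("", os.Getenv("CUBACITY_DB"), home))
}
```

Passing the flag, environment value, and home directory as arguments keeps the function free of global state, which makes the precedence easy to test.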
Pure Go. Uses modernc.org/sqlite (no cgo, no compiler needed). SQLite's
built-in FTS5 powers full-text search. Schema in internal/store/store.go.
The sync command parses markdown files with YAML frontmatter (the wiki's
existing format) and populates five tables: three base tables (posts,
comments, images) and two FTS5 virtual tables (posts_fts, comments_fts).
Re-syncing is idempotent: it truncates the tables and re-inserts every row.
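To make the parsing step concrete, here is a minimal, standard-library-only sketch of splitting a markdown file into frontmatter and body. It is not the CLI's parser: it handles only flat `key: value` lines rather than full YAML, and the function name `splitFrontmatter` and the sample keys (`title`, `source`) are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontmatter separates a leading "---" delimited block from the
// markdown body. It understands only flat "key: value" lines; nested YAML
// would need a real YAML parser.
func splitFrontmatter(doc string) (map[string]string, string) {
	meta := map[string]string{}
	doc = strings.ReplaceAll(doc, "\r\n", "\n")
	if !strings.HasPrefix(doc, "---\n") {
		return meta, doc // no frontmatter at all
	}
	rest := doc[len("---\n"):]
	end := strings.Index(rest, "\n---\n")
	if end < 0 {
		return meta, doc // unterminated frontmatter: treat everything as body
	}
	for _, line := range strings.Split(rest[:end], "\n") {
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		// Strip surrounding whitespace and simple quotes from the value.
		meta[strings.TrimSpace(key)] = strings.Trim(strings.TrimSpace(value), `"'`)
	}
	body := strings.TrimLeft(rest[end+len("\n---\n"):], "\n")
	return meta, body
}

func main() {
	// Made-up sample document, not a file from the archive.
	doc := "---\ntitle: \"Sample post\"\nsource: facebook\n---\n\nBody text.\n"
	meta, body := splitFrontmatter(doc)
	fmt.Println(meta["title"], meta["source"])
	fmt.Print(body)
}
```

Because only the first colon splits a line, values that themselves contain colons (timestamps, URLs) survive intact.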
This CLI's data source is wiki/topics/cuba-city-wi-history/raw/ in
thetimechain/random. The two
together form the complete preservation copy.
- Blogspot comments use a `### Comment N — Author (date)` header format rather than FB's `### Author — _time_`, so the `authors` command shows `Comment 1` / `Comment 2` as top names for the blogspot corpus. The real author lives in the comment body. Fixing this in a future revision.
- Image counts in `doctor` reflect what was visible on disk at sync time for FB only; blogspot images live under a different slug pattern and are not yet cross-referenced. The images themselves are preserved in the archive; this is a query-layer gap, not a preservation gap.
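As an illustration of the first limitation, the sketch below distinguishes the two header formats with regular expressions. It assumes the placeholders in the headers mean what they say, so that the name follows the dash in the blogspot format; since the real author is noted as living in the comment body, the eventual fix may need to look there instead. The function name `parseCommentAuthor` and the sample names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Blogspot style: "### Comment N — Author (date)"
	blogspotHeader = regexp.MustCompile(`^### Comment (\d+) — (.+?) \((.+)\)\s*$`)
	// Facebook style: "### Author — _time_"
	facebookHeader = regexp.MustCompile(`^### (.+?) — _(.+)_\s*$`)
)

// parseCommentAuthor returns the author named in a comment header line.
// The blogspot pattern is tried first so that "Comment N" is never
// mistaken for an author name.
func parseCommentAuthor(line string) (string, bool) {
	if m := blogspotHeader.FindStringSubmatch(line); m != nil {
		return m[2], true
	}
	if m := facebookHeader.FindStringSubmatch(line); m != nil {
		return m[1], true
	}
	return "", false
}

func main() {
	// Made-up header lines, not comments from the archive.
	for _, line := range []string{
		"### Comment 1 — Jane Doe (March 3, 2015)",
		"### John Smith — _2y_",
		"Not a header",
	} {
		author, ok := parseCommentAuthor(line)
		fmt.Printf("%q ok=%v\n", author, ok)
	}
}
```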
MIT.