Releases · loremcc/paku
paku v1.1.0
New features
- Semantic recommendations: `GET /api/recommendations`, powered by a local Ollama LLM (gemma4:26b). Analyses your collection context (top rated, genres, studios, formats, watch status) and returns personalized suggestions resolved against AniList. A SHA256 cache avoids re-querying for unchanged state; `?refresh=true` forces regeneration (see the sketch after this list).
- Notion CSV import: `paku import-notion <csv>` parses your Notion anime database export, maps 12 status values to canonical user_statuses, strips Notion page URLs from genre/studio fields, and merges watch statuses + personal scores into the dashboard. `--dry-run` previews matches without writing.
- Dashboard Recs tab: "For You" (Ollama personalized) and "Similar to…" (AniList graph, seed selector). Skeleton loading, Refresh, and Add buttons. 5 tabs total.
- Branding: SVG logo, refined dark palette, status chip colors (green = Completed, blue = Watching, amber = Plan to Watch, red = Dropped, gray = On Hold).
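
The caching behaviour described above could look roughly like the sketch below; the function names, fields, and cache structure are illustrative assumptions, not paku's actual API.

```python
import hashlib
import json

def query_ollama(entries: list[dict]) -> list[str]:
    """Placeholder for the actual call to the local Ollama recs model."""
    return []

def collection_state_key(entries: list[dict]) -> str:
    # Hash only the fields that feed the recommendation prompt, so the key is
    # stable for an unchanged collection and changes whenever the context changes.
    state = [
        {
            "title": e.get("title"),
            "score": e.get("score"),
            "genres": sorted(e.get("genres", [])),
            "status": e.get("user_status"),
        }
        for e in entries
    ]
    payload = json.dumps(state, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def get_recommendations(entries: list[dict], cache: dict, refresh: bool = False) -> list[str]:
    key = collection_state_key(entries)
    if not refresh and key in cache:
        return cache[key]              # unchanged collection state: serve the cached result
    suggestions = query_ollama(entries)
    cache[key] = suggestions           # ?refresh=true lands here and overwrites the entry
    return suggestions
```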
Removed
- LangExtract: removed from config, CLI, docs, and tests. Ollama VLM proved sufficient on the 1287-image batch run.
Fixes
- Anime extractor: Instagram caption opinions ("NGL THIS ANIME IS UNDERRATED:") no longer extracted as titles. Sauce prefix stripping for comment-thread titles. Garbage filter runs before fallback ranking. "Join the conversation..." added to chrome filter.
- Ollama config split: `ocr_model` (gemma4-paku:latest, VLM) and `recs_model` (gemma4:26b, text LLM) with backward-compat fallback. Prevents the VLM system prompt from corrupting text generation (see the sketch after this list).
- Dashboard cache: `Cache-Control: no-cache` on the root page prevents stale inline JS after updates.
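
A rough sketch of the backward-compatible model resolution described above; the legacy key name and error message are assumptions, not paku's actual config schema.

```python
def resolve_ollama_models(ollama_cfg: dict) -> tuple[str, str]:
    # Pre-1.1.0 configs had a single model entry; assume it was called "model" here.
    legacy = ollama_cfg.get("model")
    ocr_model = ollama_cfg.get("ocr_model", legacy)    # VLM used for screenshot re-runs
    recs_model = ollama_cfg.get("recs_model", legacy)  # text LLM used for recommendations
    if not (ocr_model and recs_model):
        raise ValueError("config needs ocr_model and recs_model (or a legacy 'model')")
    return ocr_model, recs_model

# Example: an old single-model config still resolves for both roles.
print(resolve_ollama_models({"model": "gemma4-paku:latest"}))
# ('gemma4-paku:latest', 'gemma4-paku:latest')
```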
Full Changelog: v1.0.1...v1.1.0
Install
`pip install paku`. Requires a `config.yaml` with a Google Cloud Vision key (see `config.yaml.template`).
License
Mozilla Public License 2.0
paku v1.0.1
What's changed
- AniList recommendations panel in the Collection tab ("Recommended for you", seeded from the most-recently-added entry with an AniList ID)
- `GET /api/collection/{id}/recommendations` endpoint with `has_anilist_id()` saved-flag lookup
- Fixed `_detect_multi_titles()` to use `finditer` for wrapped carousel titles (e.g. "Si-Vis: The Sound of Heroes"); see the sketch after this list
- CI publish job: auto-publish to PyPI on `v*` tag push via OIDC Trusted Publishing
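
Illustrative only: `re.search` sees just the first title in a multi-title caption, while `finditer` walks every match, including titles on later lines. The pattern and caption below are made up for the example, not taken from paku.

```python
import re

TITLE_RE = re.compile(r"^\d+\.\s*(?P<title>.+?)\s*$", re.MULTILINE)

caption = "1. Si-Vis: The Sound of Heroes\n2. Another Carousel Title"

first_only = re.search(TITLE_RE, caption).group("title")
all_titles = [m.group("title") for m in TITLE_RE.finditer(caption)]

print(first_only)  # Si-Vis: The Sound of Heroes
print(all_titles)  # ['Si-Vis: The Sound of Heroes', 'Another Carousel Title']
```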
Full Changelog: v1.0.0...v1.0.1
paku v1.0.0
CLI pipeline that extracts anime titles, GitHub/web URLs, and recipes from Instagram screenshots into structured JSON, CSV, and a local dashboard.
What's in v1.0.0
Three extractors
- Anime — AniList GraphQL enrichment, Levenshtein gating (see the sketch after this list), multi-title carousel detection, 507/513 screenshots AniList-matched
- URL — direct URL + project keyword resolution, 4-tier confidence scoring
- Recipe — ingredient parsing with qty/unit split, structured JSON + CSV output
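
The Levenshtein gating mentioned above could be sketched as follows: accept an AniList candidate only when its title is within a small edit distance of the OCR'd title. The threshold and function names are assumptions for illustration, not paku's real implementation.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def passes_gate(ocr_title: str, candidate: str, max_ratio: float = 0.25) -> bool:
    # Accept the candidate only if the distance is small relative to the title length.
    dist = levenshtein(ocr_title.lower(), candidate.lower())
    return dist <= max_ratio * max(len(ocr_title), len(candidate))

print(passes_gate("Si-Vis The Sound of Heroes", "Si-Vis: The Sound of Heroes"))  # True
print(passes_gate("totally unrelated caption", "Si-Vis: The Sound of Heroes"))   # False
```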
Pipeline
- Batch processing with checkpoint/resume (`--resume`); see the sketch after this list
- Auto-classifier routes screenshots to the right extractor
- Ollama VLM smart re-run (`--smart`) for low-confidence OCR results
- `review_queue.json` captures everything below threshold; nothing is discarded
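
A minimal sketch of the checkpoint/resume idea behind `--resume`: record processed screenshots after each one so a re-run can skip them. The file name and format are assumptions, not paku's actual checkpoint layout.

```python
import json
from pathlib import Path

CHECKPOINT = Path("checkpoint.json")

def process(image: Path) -> None:
    """Placeholder for OCR + extraction of a single screenshot."""
    print(f"processing {image}")

def run_batch(images: list[Path], resume: bool = False) -> None:
    done: set[str] = set(json.loads(CHECKPOINT.read_text())) if resume and CHECKPOINT.exists() else set()
    for image in images:
        if str(image) in done:
            continue                                     # already handled in an earlier run
        process(image)
        done.add(str(image))
        CHECKPOINT.write_text(json.dumps(sorted(done)))  # checkpoint after every image
```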
Dashboard
- Local web app (`paku serve`) for browsing, filtering, and triaging extracted anime entries
- SQLite-backed, FastAPI + vanilla JS SPA (see the sketch after this list)
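
Roughly the shape of the `paku serve` stack described above (FastAPI reading from SQLite); the table, columns, and route below are illustrative, not paku's real schema.

```python
import sqlite3
from fastapi import FastAPI

app = FastAPI()

@app.get("/api/collection")
def list_entries(status: str | None = None) -> list[dict]:
    con = sqlite3.connect("paku.db")
    con.row_factory = sqlite3.Row
    query = "SELECT id, title, user_status, score FROM entries"
    params: tuple = ()
    if status is not None:
        query += " WHERE user_status = ?"
        params = (status,)
    rows = [dict(r) for r in con.execute(query, params).fetchall()]
    con.close()
    return rows

# Standalone, this would run with: uvicorn app:app
```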
Quality
- 513 unit tests, 2 skipped (integration)
- Ruff lint clean across all modules
- GitHub Actions CI: lint + test matrix (Python 3.11 & 3.12) + wheel build
Install
`pip install paku`. Requires a `config.yaml` with a Google Cloud Vision key (see `config.yaml.template`).
License
Mozilla Public License 2.0