Server-side source-manifest service (Lumbra's, Bundesliga) — decouple fragile URL discovery from app releases #94

Description

@jozef2svrcek

For sources whose file lists are fragile to resolve from the client — Lumbra's Gigabase (JS-rendered .7z download links) and Austrian Bundesliga (files scattered across pages, no system) — add a small server-side manifest to the existing LPDO data service (the one that already caches player normalization, normalise.lpdo.com).

Why

If scraping/URL-reconstruction lives in the shipped daemon, any source-site change breaks every installed copy until users update the app. Server-side, we fix the scraper once and all clients keep working; they just fetch a refreshed manifest. Also: politeness (we hit the source once and fan out a cached list), a place to validate URLs (HEAD checks, sizes, sha256, dates), and reuse of existing infra.
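
A rough sketch of what the server-side validation pass might look like, under assumptions not fixed by this issue: `validate_entry` and the `HeadProbe` callable are hypothetical names, the HEAD request is injected so the check can run without network access, and `covers` is assumed to hold ISO dates (which compare correctly as strings).

```python
from typing import Callable, Mapping, Optional

# Hypothetical probe: takes a URL, returns (status_code, content_length or None).
# In the service this would wrap a real HTTP HEAD request.
HeadProbe = Callable[[str], tuple[int, Optional[int]]]

HEX_DIGITS = set("0123456789abcdef")


def validate_entry(entry: Mapping, head: HeadProbe) -> list[str]:
    """Return a list of human-readable problems; an empty list means the entry looks sane."""
    problems = []

    # HEAD-check: the origin must answer, and its size must match if it reports one.
    status, length = head(entry["url"])
    if status != 200:
        problems.append(f"HEAD returned {status}")
    elif length is not None and length != entry["size"]:
        problems.append(f"size mismatch: manifest {entry['size']}, origin {length}")

    # sha256 is optional in the manifest; if present it must be 64 hex characters.
    sha = entry.get("sha256")
    if sha is not None and (len(sha) != 64 or not set(sha.lower()) <= HEX_DIGITS):
        problems.append("sha256 is not 64 hex characters")

    # Assumes ISO dates, so plain string comparison orders them correctly.
    start, end = entry["covers"]
    if start > end:
        problems.append("covers range is reversed")

    return problems
```

An entry that fails any check would presumably be held back from the published manifest rather than shipped to clients; that policy is not specified here.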

Shape

  • New endpoint(s): a JSON manifest per source, shaped as [{ label, url, covers: [from, to], size, sha256?, published }].
  • Server runs source-specific resolvers on a schedule: Lumbra's (resolve the real wp-content/uploads/.../OTB_<era>_v<date>.7z URLs, curate to complete-year OTB), Bundesliga AT (crawl the scattered pages into a clean per-season file list).
  • Client side: a generic "manifest-backed feed" driver — GET the manifest, map entries to FeedItems (with covers, reusing the B2 window file-skip), download each file straight from the origin, decompress, import. The service is a thin index, never a mirror.
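
The client-side mapping could be sketched roughly as follows. This is illustrative only: the `FeedItem` fields, the function names, and the ISO date format are assumptions, and the overlap test stands in for the B2 window file-skip, whose real rule may differ.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FeedItem:
    # Hypothetical client-side record; fields mirror the manifest entry shape.
    label: str
    url: str
    covers_from: date
    covers_to: date
    size: int
    sha256: Optional[str]
    published: date


def parse_manifest(text: str) -> list[FeedItem]:
    """Map a JSON manifest (a list of entries) to FeedItems."""
    items = []
    for entry in json.loads(text):
        start, end = entry["covers"]
        items.append(FeedItem(
            label=entry["label"],
            url=entry["url"],
            covers_from=date.fromisoformat(start),
            covers_to=date.fromisoformat(end),
            size=int(entry["size"]),
            sha256=entry.get("sha256"),  # optional in the manifest
            published=date.fromisoformat(entry["published"]),
        ))
    return items


def overlaps_window(item: FeedItem, window_from: date, window_to: date) -> bool:
    """True if the file's covered range intersects the import window."""
    return item.covers_from <= window_to and item.covers_to >= window_from


def select_for_window(items: list[FeedItem], window_from: date, window_to: date) -> list[FeedItem]:
    """Keep only files worth downloading for the window; the rest are skipped."""
    return [i for i in items if overlaps_window(i, window_from, window_to)]
```

Each selected item would then be downloaded from its `url` at the origin, decompressed, and imported; the service itself never serves the bytes.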

Guardrails

  • Thin index only — URLs + metadata, never proxy the bytes (esp. Lumbra's CC BY-NC-SA non-commercial + bandwidth).
  • Not a hard dependency — ship a bundled last-known-good manifest in the app and cache the last fetched one, so a service outage only blocks discovery of new files, not import of known ones.
  • Keep the high-level catalog compiled-in (defines what sources exist + how to acquire); only the volatile file list/URLs come from the manifest.
  • TWIC + Lichess stay self-resolving (they publish clean indexes) — not routed through the service.
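
The "not a hard dependency" guardrail might translate into a lookup order like the one below. This is a minimal sketch under stated assumptions: `load_manifest`, the injected `fetch` callable, and the two file paths are hypothetical, and the real client may order or store things differently.

```python
import json
from pathlib import Path
from typing import Callable


def load_manifest(fetch: Callable[[], str], cache_path: Path, bundled_path: Path) -> tuple[list, str]:
    """Return (entries, origin), where origin is 'fetched', 'cached' or 'bundled'."""
    entries = None
    text = ""
    try:
        # `fetch` is a hypothetical callable that GETs the manifest body as text.
        text = fetch()
        entries = json.loads(text)
        if not isinstance(entries, list):
            raise ValueError("manifest must be a JSON list")
    except Exception:
        # Any network or parse failure means the service is treated as unavailable.
        entries = None

    if entries is not None:
        try:
            # Remember the last good fetch for future outages.
            cache_path.write_text(text, encoding="utf-8")
        except OSError:
            pass  # a failed cache write should not discard a good manifest
        return entries, "fetched"

    # Fall back to the last fetched copy, then to the manifest shipped in the app.
    for path, origin in ((cache_path, "cached"), (bundled_path, "bundled")):
        try:
            return json.loads(path.read_text(encoding="utf-8")), origin
        except (OSError, ValueError):
            continue
    raise RuntimeError("no usable manifest")
```

With this ordering, a service outage only hides files published since the last successful fetch; everything already known stays importable.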

Sequencing

B3 ships Lumbra's with a bundled static manifest wired through the manifest-backed driver (proves 7z + bulk import). This issue then moves manifest generation to the service as a fast follow — the client code doesn't change, only where the manifest comes from (bundled → fetched-with-bundled-fallback). Relates to #40.

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request
