feat(directory): per-host dedup + featured-pinning (re-land #53 onto main) #65
Merged
…oga flood) (#53)

* feat: per-host dedup + featured-pinning for /directory (stop the aloyoga flood)

The directory read as spam: the 2h Shopify cron seeds hundreds of products from a few stores, KV.list returns ~insertion order, so one host (aloyoga) flooded the page before diverse hosts appeared. directory_rank.ts (pure + unit-tested) caps each host (default 3, featured bypasses) and pins featured; /api/v1/directory reads a wide 1000-key window then diversifies, returns distinct_hosts. +6 tests.

Stacked on fix/proxy-test-real-auth; base auto-retargets to main on merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(directory): count distinct_hosts after truncation; clarify per_host=0 + subdomain limits

Adversarial review of the ranking found 3 minor defects (all display/clarity, no functional/security impact):

1. distinct_hosts was counted over the pre-truncation deduped list, so the 'N stores' badge could overcount when deduped.length > limit (e.g. 20 hosts, limit 10 → badge said 20, showed 10). Now counted on the returned entries.
2. per_host=0 was labelled 'full list' but limit still truncates. Corrected the doc comment, handler comment, and test name; added an assertion that limit still applies when perHost=0.
3. hostOf keys on the full host, so shop.x.com vs x.com get separate quotas. Left as-is (eTLD+1 collapsing needs a public-suffix list; sub-stores can be distinct) but documented the limitation + locked in behavior with a test.

+2 tests (161 total). Verify green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: New1Direction <285551516+New1Direction@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
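For readers without the diff open, the ranking rules in the commit message can be sketched roughly as follows. This is not the actual directory_rank.ts. The entry shape, the option names, the default limit of 100, and the choice that featured entries do not consume their host's quota are all assumptions made for illustration. Only the default cap of 3, the featured bypass, the full-host keying, and counting distinct_hosts after truncation come from the text above.

```typescript
// Hypothetical entry shape; the real types in directory_rank.ts are not shown in this PR.
interface DirectoryEntry {
  url: string;
  featured?: boolean;
}

interface RankOptions {
  perHost?: number; // max non-featured entries per host; 0 disables the cap
  limit?: number; // max entries returned; applies even when perHost is 0
}

interface RankResult {
  entries: DirectoryEntry[];
  distinct_hosts: number;
}

// Keys on the full hostname, so shop.x.com and x.com get separate quotas.
function hostOf(url: string): string {
  try {
    return new URL(url).hostname.toLowerCase();
  } catch {
    return url; // assumption: an unparseable URL becomes its own bucket
  }
}

function rankDirectory(items: DirectoryEntry[], opts: RankOptions = {}): RankResult {
  const perHost = opts.perHost ?? 3; // default cap taken from the commit message
  const limit = opts.limit ?? 100; // assumed default, not stated in the PR

  // Pin featured entries first. Array.prototype.sort is stable, so the
  // original order is preserved within the featured and non-featured groups.
  const ordered = [...items].sort(
    (a, b) => Number(Boolean(b.featured)) - Number(Boolean(a.featured)),
  );

  const perHostCount = new Map<string, number>();
  const deduped: DirectoryEntry[] = [];
  for (const item of ordered) {
    if (item.featured) {
      deduped.push(item); // featured bypasses the per-host cap
      continue;
    }
    const host = hostOf(item.url);
    const count = perHostCount.get(host) ?? 0;
    if (perHost > 0 && count >= perHost) continue; // host quota exhausted
    perHostCount.set(host, count + 1);
    deduped.push(item);
  }

  // Truncate first, then count hosts, so the badge matches what is shown.
  const entries = deduped.slice(0, limit);
  const distinct_hosts = new Set(entries.map((e) => hostOf(e.url))).size;
  return { entries, distinct_hosts };
}
```

Keeping the function pure (array in, result out, no KV access) is what makes it easy to unit-test without a Worker runtime, which matches the "pure + unit-tested" description.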
Re-lands the directory per-host dedup from #53, which had been opened against a stale branch (fix/proxy-test-real-auth, 41 commits behind main) instead of main.

What it does: adds worker/src/directory_rank.ts, a pure, unit-tested per-host dedup + featured-first ordering, so one prolific Shopify-cron host (the "aloyoga flood") can't dominate /directory.

Conflict resolved in index.ts: kept main's deep seen: pagination (reads the full set, not just 1000 keys) AND fed it into rankDirectory (per-host cap + featured bypass + distinct_hosts), preserving main's cache-control headers.

Verified: 237/237 worker tests pass (incl. new directory_rank.test.ts), tsc --noEmit clean.

#54 (governance) and #55 (leads) were already in main via other commits; those PRs were stale duplicates, nothing to re-land.
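The "deep pagination" kept from main in the index.ts conflict resolution is not shown here. As a rough sketch of what reading the full key set (rather than a single 1000-key window) can look like, the loop below follows a cursor until the listing is complete. The KeyLister interface is declared locally and only modelled on the Cloudflare Workers KV list() result (keys, list_complete, cursor). The function name, the maxPages guard, and the "seen:" prefix used in the usage note are assumptions, not code from main.

```typescript
// Minimal local model of a cursor-paginated key listing, patterned on the
// Workers KV list() result. Declared here so the sketch is self-contained.
interface ListPage {
  keys: { name: string }[];
  list_complete: boolean;
  cursor?: string | undefined;
}

interface KeyLister {
  list(options: {
    prefix?: string;
    cursor?: string | undefined;
    limit?: number;
  }): Promise<ListPage>;
}

// Hypothetical helper: collects every key name under a prefix by following
// the cursor. maxPages is an assumed safety bound so a misbehaving backend
// cannot keep the request looping forever.
async function listAllKeys(
  kv: KeyLister,
  prefix: string,
  maxPages = 50,
): Promise<string[]> {
  const names: string[] = [];
  let cursor: string | undefined;
  for (let page = 0; page < maxPages; page++) {
    const res = await kv.list({ prefix, cursor, limit: 1000 });
    for (const key of res.keys) names.push(key.name);
    if (res.list_complete || !res.cursor) break; // no further pages
    cursor = res.cursor;
  }
  return names;
}
```

A handler along these lines would call something like listAllKeys(env.KV, "seen:"), load the matching entries, and pass the full list to the ranking step, so the per-host cap sees every host instead of only the first window.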