fix(database): disambiguate same-name media with comprehensive tag mapping#1036

Merged
wizzomafizzo merged 4 commits into
mainfrom
fix/mediadb-metadata-scraper-disambiguation
Jul 1, 2026
Conversation

@wizzomafizzo

@wizzomafizzo wizzomafizzo commented Jul 1, 2026

Member
  • Same-name media (e.g. three identical "Jackal" or "Finalizer" arcade entries) rendered without any distinguishing tags because their filename tokens were unparsed or ineligible for disambiguation; this maps every distinguishing token to a correctly-typed known tag (arcade boards, input devices, memory/compatibility, dump flags, protection chips, cabinet orientation, build dates, and more).
  • Adds two tag types: cabinet (upright/cocktail/cabaret/sitdown) and protection (fd1094/fd1089/8751/mc-8123/encrypted/decrypted).
  • Recomputes DisambiguationTypes over the full allowlist of eligible tag types, so both value differences and presence/absence across siblings qualify. The recompute is rewritten as a single-pass set-based update (reset, then set the qualifying rows via CTEs), parity-verified against the prior correlated-subquery form, with added benchmark coverage.
  • Resolves companion gamelist slug conflicts by selecting the name-consistent parent, fixing Phantasy Star IV showing Phantasy Star III metadata.
  • Maps the spelled-out "(System 16C version)" arcade board id that was previously dropped as version noise.

Summary by CodeRabbit

  • New Features
    • Expanded filename and metadata parsing for arcade, music, and archive formats, including cabinet orientation, protection states, input controls, track numbers, TOSEC-style dump flags, a new YYYY-MM build-date shape, and additional compatibility/distribution aliases.
    • Added smarter companion entry conflict resolution to route duplicate child slugs to the most consistent parent.
  • Bug Fixes
    • Improved system title disambiguation recomputation, including clearing stale results and updating which tag types qualify and their ordering.
    • Prevented an extra indexing status update after cancellation.
    • Ensured dynamically generated track:* tags (including track:-prefixed filename tags) are correctly persisted after indexing.
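
As an illustration of the YYYY-MM build-date shape mentioned in the summary above, a token check could look roughly like the following. `isBuildDate` is a hypothetical helper, and the month validation is an assumption: the PR does not show how strictly the real parser validates the token.

```go
package main

import (
	"fmt"
	"time"
)

// isBuildDate reports whether tok has the YYYY-MM shape with a valid month.
// Hypothetical helper; the real parser's validation rules are not shown in the PR.
func isBuildDate(tok string) bool {
	_, err := time.Parse("2006-01", tok)
	return err == nil
}

func main() {
	for _, tok := range []string{"1994-07", "1994-13", "v1.2"} {
		fmt.Printf("%-8s %v\n", tok, isBuildDate(tok))
	}
}
```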

…pping

Same-name media (e.g. three "Jackal" or "Finalizer" arcade entries) rendered
identically because their distinguishing filename tokens were unparsed or not
eligible for disambiguation.

- Map every distinguishing filename token to a correctly-typed known tag:
  Sega/Capcom/Irem arcade boards, input devices, memory/compatibility, dump
  flags, protection chips, cabinet orientation, build dates, and more.
- Add cabinet (upright/cocktail/cabaret/sitdown) and protection
  (fd1094/fd1089/8751/mc-8123/encrypted/decrypted) tag types.
- Recompute DisambiguationTypes over the full allowlist of eligible tag types
  so presence/absence and value differences across siblings both qualify.
- Rewrite the disambiguation recompute as a single-pass set-based update
  (reset then set-qualifying via CTEs), parity-verified against the prior
  correlated-subquery form; add benchmark coverage.
- Resolve companion gamelist slug conflicts by picking the name-consistent
  parent, fixing Phantasy Star IV showing Phantasy Star III metadata.
- Map spelled-out "(System 16C version)" board ids that were dropped as noise.
@coderabbitai

coderabbitai Bot commented Jul 1, 2026


Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 58056f8a-f540-4394-b8c6-ef7b0ccc4a14

📥 Commits

Reviewing files that changed from the base of the PR and between 949ba6a and c14aa4e.

📒 Files selected for processing (1)
  • pkg/database/mediadb/sql_media_titles.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/database/mediadb/sql_media_titles.go

📝 Walkthrough

Walkthrough

This PR updates disambiguation ordering and recomputation, adds companion child slug conflict handling, and expands filename tag parsing with new arcade, music, and format tag types and mappings.

Changes

Disambiguation, scraper, and tag parsing

| Layer / File(s) | Summary |
| --- | --- |
| Disambiguation ordering and recompute: pkg/database/database.go, pkg/database/mediadb/sql_media_titles.go, pkg/database/mediadb/* | TagTypeDisplayPriority and ZapScriptTagTypes are updated, DisambiguationTypes recomputation is rewritten, and new tests plus a benchmark cover the flow. |
| Companion slug conflict resolution: pkg/database/scraper/gamelistxml/scraper.go, pkg/database/scraper/gamelistxml/scraper_test.go | ZaparooCompanion child slug handling now filters conflicts by parent-name consistency, tracks dropped slugs, and is validated by new scraper tests. |
| Tag taxonomy and filename parsing: pkg/database/tags/*, pkg/database/mediascanner/indexing_pipeline.go, pkg/database/mediascanner/indexing_pipeline_test.go, pkg/api/methods/media.go | New cabinet/protection tag types and values are added, mappings expand for arcade and compatibility tokens, filename parsing now routes more bracketed/comma-separated tokens into canonical tags, track: indexing is persisted, and cancellation stops late status updates. |

Estimated code review effort: 4 (Complex) | ~60 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 70.21% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: same-name media disambiguation and expanded tag mapping in the database layer. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

The scanner status callback set indexing=true on every progress update, so a
callback firing after cancel() cleared the flag (but before the scanner
observed the cancelled context) briefly resurrected the running state. Skip
status updates once the context is cancelled. Fixes flaky
TestMediaIndexingCancellation_Integration.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (1)
pkg/database/scraper/gamelistxml/scraper_test.go (1)

2611-2636: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use afero for the gamelist fixture write.

This test writes directly with os.WriteFile; route the fixture through the scraper’s filesystem seam instead.

♻️ Proposed fix
 	root := t.TempDir()
+	fs := afero.NewOsFs()
 	// The same "phantasystar4" slug is listed under two parents. The WRONG parent
 	// (Phantasy Star 3) is listed first to prove file order does not decide the winner;
 	// name consistency must route the slug to the Phantasy Star 4 parent (id 30).
-	require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(`<gameList>
+	require.NoError(t, afero.WriteFile(fs, filepath.Join(root, "gamelist.xml"), []byte(`<gameList>
 ...
 </gameList>`), 0o600))
@@
-	s := &GamelistXMLScraper{db: mockDB}
+	s := &GamelistXMLScraper{db: mockDB, fs: fs}

As per coding guidelines, **/*.go: “Use afero for filesystem operations in testable code.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/database/scraper/gamelistxml/scraper_test.go` around lines 2611 - 2636,
The test in GamelistXMLScraper is writing the gamelist fixture directly with
os.WriteFile instead of using the filesystem abstraction. Update this fixture
setup to use the scraper’s afero-backed filesystem seam (the same test FS used
by GamelistXMLScraper) so the test stays consistent with the rest of the
filesystem-dependent code and guidelines.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/database/mediadb/disambiguation_bench_test.go`:
- Around line 85-96: The benchmark setup in disambiguation_bench_test.go is
interpolating titles into the SQL passed to conn.ExecContext, which triggers
SQL-injection checks even though the value is local. Update the recursive seed
inserts in the ExecContext call to use parameter placeholders instead of
fmt.Sprintf formatting, and pass titles as query arguments while keeping the
MediaTitles and Media insert behavior unchanged. Use the conn.ExecContext call
in the benchmark helper to locate and adjust the query string and arguments.

In `@pkg/database/mediadb/sql_media_titles.go`:
- Around line 377-382: The recompute logic in sql_media_titles.go currently
relies on db.sqlMu but still performs a reset and a set as separate statements,
so readers like attachZapScriptTags and GetZapScriptTagsBySystemAndPath can
observe cleared DisambiguationTypes between them. Update the recompute path to
use a transaction around each chunk, or refactor it into a single statement, and
keep the change localized around the recompute helper that runs the reset/set
sequence.

In `@pkg/database/mediascanner/indexing_pipeline.go`:
- Around line 344-345: Add a focused test for the `track:` dynamic tag path in
`AddMediaPathWithPrefixPolicy` or the nearest indexing test helper. Create a
case that feeds a filename tag starting with `track:` and assert the resulting
`tags.TagTypeTrack` tag is persisted with the expected value and linked to the
media item. Use the existing indexing pipeline symbols around `tagStr`,
`dynType`, and `tags.TagTypeTrack` so the test covers this branch directly and
guards against regressions in dynamic tag creation.

In `@pkg/database/scraper/gamelistxml/scraper.go`:
- Around line 1848-1857: The tie handling in the winner selection logic should
not treat every best-score tie as a valid unmanaged equal-name case. In the
resolution block that computes winner/winnerCount from scores, update the
ambiguity check so ties at score 1 between different parent slugs are discarded
instead of being left unresolved, and only assign conflicts[stem] when there is
a single unambiguous winner. Add a test covering the scraper resolution path
with two distinct parent names that both produce score 1 for the same child slug
to confirm the ambiguous tie is dropped.

In `@pkg/database/tags/filename_parser_test.go`:
- Around line 1467-1509: Add assertions in
TestParseFilenameToCanonicalTagsForMedia_CabinetAndProtection for the newly
introduced mappings so those branches are covered too: exercise
ParseFilenameToCanonicalTagsForMedia with representative filenames that should
produce cabaret, encrypted, decrypted, mc-8123, and lnx tags, and verify the
resulting CanonicalTag.String values contain each expected tag. Keep the
existing test structure and use the same helper patterns in
filename_parser_test.go so the new cases are easy to locate and maintain.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6d8b8b12-ed88-488d-a28d-2247702afbe6

📥 Commits

Reviewing files that changed from the base of the PR and between 3d88c5e and 58e8309.

📒 Files selected for processing (14)
  • pkg/database/database.go
  • pkg/database/mediadb/disambiguation_bench_test.go
  • pkg/database/mediadb/disambiguation_test.go
  • pkg/database/mediadb/sql_media_titles.go
  • pkg/database/mediascanner/indexing_pipeline.go
  • pkg/database/scraper/gamelistxml/scraper.go
  • pkg/database/scraper/gamelistxml/scraper_test.go
  • pkg/database/tags/filename_parser.go
  • pkg/database/tags/filename_parser_test.go
  • pkg/database/tags/string_parsers.go
  • pkg/database/tags/tag_mappings.go
  • pkg/database/tags/tag_values.go
  • pkg/database/tags/tags.go
  • pkg/database/tags/tagtypes.go

Comment on lines +85 to +96
_, err = conn.ExecContext(ctx, fmt.Sprintf(`
WITH RECURSIVE seq(i) AS (SELECT 1 UNION ALL SELECT i+1 FROM seq WHERE i < %d)
INSERT INTO MediaTitles (DBID, SystemDBID, Slug, Name)
SELECT i, 1, 'game-' || i, 'Game ' || i FROM seq;
WITH RECURSIVE seq(i) AS (SELECT 1 UNION ALL SELECT i+1 FROM seq WHERE i < %d)
INSERT INTO Media (DBID, MediaTitleDBID, SystemDBID, Path, IsMissing)
SELECT (i-1)*3 + j, i, 1, 'game-' || i || '-' || j, 0
FROM seq, (SELECT 1 AS j UNION SELECT 2 UNION SELECT 3);
INSERT INTO MediaTags (MediaDBID, TagDBID) SELECT DBID, 1 FROM Media;
INSERT INTO MediaTags (MediaDBID, TagDBID) SELECT DBID, 2 FROM Media WHERE (DBID %% 3) = 2;
INSERT INTO MediaTags (MediaDBID, TagDBID) SELECT DBID, 3 FROM Media WHERE (DBID %% 3) = 0;
`, titles, titles))


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Parameterize the recursive seed query.

titles is local here, but interpolating it into ExecContext is already tripping SQL-injection rules on this hunk. Swapping the two %d slots for ? placeholders keeps the benchmark behavior the same and avoids normalizing a pattern the repo's security checks are supposed to catch. Once fmt.Sprintf is removed, the escaped %% modulo operators must also become a single %, otherwise SQLite receives a literal %% and fails to parse the statement. As per coding guidelines, never disable security checks (gosec, govulncheck) or change the forbidigo rules for sync.Mutex/RWMutex.

Suggested change
-	_, err = conn.ExecContext(ctx, fmt.Sprintf(`
-		WITH RECURSIVE seq(i) AS (SELECT 1 UNION ALL SELECT i+1 FROM seq WHERE i < %d)
+	_, err = conn.ExecContext(ctx, `
+		WITH RECURSIVE seq(i) AS (SELECT 1 UNION ALL SELECT i+1 FROM seq WHERE i < ?)
 		INSERT INTO MediaTitles (DBID, SystemDBID, Slug, Name)
 			SELECT i, 1, 'game-' || i, 'Game ' || i FROM seq;
-		WITH RECURSIVE seq(i) AS (SELECT 1 UNION ALL SELECT i+1 FROM seq WHERE i < %d)
+		WITH RECURSIVE seq(i) AS (SELECT 1 UNION ALL SELECT i+1 FROM seq WHERE i < ?)
 		INSERT INTO Media (DBID, MediaTitleDBID, SystemDBID, Path, IsMissing)
 			SELECT (i-1)*3 + j, i, 1, 'game-' || i || '-' || j, 0
 			FROM seq, (SELECT 1 AS j UNION SELECT 2 UNION SELECT 3);
 		INSERT INTO MediaTags (MediaDBID, TagDBID) SELECT DBID, 1 FROM Media;
-		INSERT INTO MediaTags (MediaDBID, TagDBID) SELECT DBID, 2 FROM Media WHERE (DBID %% 3) = 2;
-		INSERT INTO MediaTags (MediaDBID, TagDBID) SELECT DBID, 3 FROM Media WHERE (DBID %% 3) = 0;
-	`, titles, titles))
+		INSERT INTO MediaTags (MediaDBID, TagDBID) SELECT DBID, 2 FROM Media WHERE (DBID % 3) = 2;
+		INSERT INTO MediaTags (MediaDBID, TagDBID) SELECT DBID, 3 FROM Media WHERE (DBID % 3) = 0;
+	`, titles, titles)
🧰 Tools
🪛 OpenGrep (1.23.0)

[ERROR] 85-96: SQL query built via fmt.Sprintf or string concatenation passed to a database method. Use parameterized queries with placeholder arguments.

(coderabbit.sql-injection.go-query-format)

Sources: Coding guidelines, Linters/SAST tools

Comment thread pkg/database/mediadb/sql_media_titles.go Outdated
Comment on lines +344 to +345
case strings.HasPrefix(tagStr, string(tags.TagTypeTrack)+":"):
dynType = string(tags.TagTypeTrack)


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Add a focused test for track: indexing.

This branch adds new persisted behavior, but the new tests in this cohort only exercise recompute logic. A small AddMediaPathWithPrefixPolicy case that feeds a track:* filename tag and asserts the track tag type/value is created and linked would catch regressions in the dynamic tag path. As per coding guidelines, **/*.go: Write tests for all new code — see TESTING.md and pkg/testing/README.md.

Source: Coding guidelines

Comment thread pkg/database/scraper/gamelistxml/scraper.go Outdated
Comment thread pkg/database/tags/filename_parser_test.go
@codecov

codecov Bot commented Jul 1, 2026


… handling

- Collapse the disambiguation recompute reset+set into one UPDATE...FROM so
  GetZapScriptTagsBySystemAndPath, which reads DisambiguationTypes without
  sqlMu, cannot observe the transient blank state between statements under the
  single-connection pool.
- Drop companion slug conflicts that tie on prefix-only (score 1) name matches:
  differently-named parents do not share metadata. Exact-name (score 2) ties
  stay first-wins.
- Add tests for the track dynamic tag type and the cabaret/encrypted/decrypted/
  mc-8123/lnx mappings.

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/database/mediadb/sql_media_titles.go (1)

423-470: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Make the group_concat order explicit
group_concat(tag) and group_concat(typ, ',') still depend on subquery order here; use SQLite’s ordered-aggregate syntax so vs and DisambiguationTypes stay stable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/database/mediadb/sql_media_titles.go` around lines 423 - 470, The
`setQuery` CTEs still rely on subquery ordering for `group_concat`, so the
values in `mvs` (`group_concat(tag) AS vs`) and `grp` (`group_concat(typ, ',')
AS types`) can vary across runs. Update the SQL in `sql_media_titles.go` to use
SQLite’s ordered-aggregate form inside the `mvs` and `grp` queries so the
concatenation order is explicit and `DisambiguationTypes` remains deterministic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c5ec5ac9-59fc-4474-a55f-719211cbee5e

📥 Commits

Reviewing files that changed from the base of the PR and between 58e8309 and 949ba6a.

📒 Files selected for processing (6)
  • pkg/api/methods/media.go
  • pkg/database/mediadb/sql_media_titles.go
  • pkg/database/mediascanner/indexing_pipeline_test.go
  • pkg/database/scraper/gamelistxml/scraper.go
  • pkg/database/scraper/gamelistxml/scraper_test.go
  • pkg/database/tags/filename_parser_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/database/scraper/gamelistxml/scraper.go
  • pkg/database/scraper/gamelistxml/scraper_test.go
  • pkg/database/tags/filename_parser_test.go

Use SQLite ordered-aggregate group_concat in the recompute CTEs so the
concatenation order no longer depends on subquery ORDER BY propagating into
the aggregate (which SQLite does not guarantee). mvs.vs orders by tag, keeping
COUNT(DISTINCT vs) from treating an identical tag set as distinct; grp.types
orders by type so the stored DisambiguationTypes string stays deterministic.
@wizzomafizzo wizzomafizzo merged commit 71fa917 into main Jul 1, 2026
15 checks passed
@wizzomafizzo wizzomafizzo deleted the fix/mediadb-metadata-scraper-disambiguation branch July 1, 2026 08:11