fix: Reduce synthetic single-parent FAMS in GEDCOM roundtrip#581
fix: Reduce synthetic single-parent FAMS in GEDCOM roundtrip#581isaacschepp wants to merge 26 commits intomainfrom
Conversation
…thetic ones When a child has only one parent-child relationship and that parent's families all have pair-matched children, the export algorithm now uses the first family as best-effort placement instead of creating a synthetic single-parent FAM. This fixes the +124 FAMS surplus in the queen test file roundtrip. Trade-off: children from genuinely different unions (father has children outside marriage) will now be placed in the marriage FAM rather than a separate synthetic FAM. This is less accurate for that edge case but eliminates spurious FAMS references that break roundtrip fidelity for the common case (missing parent-link, not different union). Fixes #486
Deploying genealogix with
|
| Latest commit: |
4295423
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://6683e075.genealogix.pages.dev |
| Branch Preview URL: | https://fix-fams-synthetic-family-ro.genealogix.pages.dev |
There was a problem hiding this comment.
Pull request overview
Reduces spurious synthetic single-parent FAM records produced during GLX → GEDCOM export, improving GED → GLX → GED roundtrip fidelity by preferentially placing single-parent children into an existing spouse family when appropriate (fix for #486).
Changes:
- Adjusted
reconstructFamiliesfallback logic to use an existing family (best-effort) instead of creating a synthetic single-parent family in the “all families skipped due to paired children” scenario. - Updated an existing single-parent reconstruction test to reflect the new (roundtrip-fidelity-biased) behavior.
- Added a new regression test covering the missing-second-parent-link case that previously triggered synthetic family creation.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| go-glx/gedcom_export_family.go | Updates family matching fallback to avoid creating synthetic single-parent families when a suitable existing family is available. |
| go-glx/gedcom_export_family_test.go | Updates expectations for the changed behavior and adds a regression test for #486. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Copilot correctly noted the skip-then-fallback was redundant. Simplified to gate the continue on len(familyIndices) > 1 — when a parent has only one family, always place the child there. Comment updated to match.
|
Both Copilot items addressed in d6289c4:
|
|
The queen.ged test file is at The "+124 FAMS" number came from the original todo.md — it was a manual measurement from an earlier version of the import/export code. There is no automated E2E test that roundtrips queen.ged and compares FAMS counts. The unit test in this PR validates the algorithm fix at the glx import glx/testdata/gedcom/5.5.1/large-files/queen.ged -o /tmp/queen.glx
glx export /tmp/queen.glx -o /tmp/queen-rt.ged
# Compare FAMS counts
grep -c '1 FAMS' glx/testdata/gedcom/5.5.1/large-files/queen.ged
grep -c '1 FAMS' /tmp/queen-rt.gedIf the counts don't match, the remaining gap may be from a different root cause than the one this PR fixes (e.g., the ASSO import in #589 may also affect participant reconstruction). Happy to add an E2E roundtrip FAMS count assertion if you want. |
Review — tested against all GEDCOM test files + westeros archiveI rebuilt both
The PR produces byte-identical output for every GEDCOM roundtrip test. The +124 FAMS number from the todo.md appears to be stale — the current import/export code doesn't exhibit that surplus on any of these files. Where it does have an effect: hand-built GLX archivesExporting the westeros archive (not a GEDCOM roundtrip — entities were created directly in GLX):
The 2 eliminated families are Oberyn Martell's Sand Snakes. On main, Oberyn has two FAMs:
On the PR branch, all 8 children are merged into Why the roundtrips are unaffectedThe GEDCOM importer creates two parent-child relationships per child when the FAM has both HUSB and WIFE. So after a roundtrip, every child that was in a two-parent FAM has two parent links, and the pair-matching in RecommendationI'd hold this PR. The bug it targets doesn't reproduce on current code, and the behavioral change it introduces is a net negative for archives with legitimate single-parent children (like Oberyn's bastards). The +124 number was probably valid against an older version of the import/export code but has since been fixed by other changes. |
What and why
Fixes the +124 FAMS surplus in the queen test file and +22 in british-royalty during GED → GLX → GED roundtrip. The root cause: when a child has only one parent-child relationship (the other parent's link wasn't imported), the GEDCOM export created a synthetic single-parent FAM instead of placing the child in the existing marriage FAM.
The fix
In
buildFamilyGroups, when a single-parent child's parent has families but all were skipped (because they have pair-matched children), use the first family as best-effort placement instead of falling through tocreateSyntheticFamily.Before: child with missing father-link → skip marriage FAM (has paired children) → create synthetic FAM → extra FAMS on mother
After: child with missing father-link → skip marriage FAM → no unpaired family found → use first family (the marriage) → no extra FAMS
Trade-off
Children from genuinely different unions (e.g., father has children outside marriage) will now be placed in the marriage FAM rather than a separate synthetic FAM. The algorithm cannot distinguish "missing parent-link" from "different union" without metadata. This trade-off favors roundtrip fidelity (the common case) over the edge case.
Related issues
Fixes #486
Testing
TestReconstructFamilies_SingleParentChildJoinsExistingFamily— verifies princess joins existing marriage FAM instead of getting synthetic oneTestReconstructFamilies_SingleParentWithExistingFamily— expectations updated to match new behaviorBreaking changes
None. GEDCOM export produces fewer spurious FAM records, which is strictly an improvement for roundtrip fidelity.