fix: add_term_to_value_set creates duplicate ValueSet nodes when called before batch_upsert_value_sets #111

@deanban

Describe the bug
upsert_value_set MERGEs :ValueSet by column_ref, while add_term_to_value_set MERGEs by name only. If add_term_to_value_set runs before batch_upsert_value_sets, the first MERGE creates a :ValueSet that has a name but no column_ref. The later MERGE by column_ref cannot match that node, so it creates a second node with the same name. The graph ends up with duplicate :ValueSet nodes.
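
The key mismatch can be illustrated with a toy in-memory model of MERGE. This is only a sketch: the function bodies are assumptions about the shape of the two upserts, not the actual code in materializer_utils.py, and the names "status_codes" and "orders.status" are made up for illustration.

```python
from collections import Counter


def merge(nodes, label, **key):
    """Toy MERGE: return the first node matching label and all key props, else create one."""
    for node in nodes:
        if node["label"] == label and all(
            node["props"].get(k) == v for k, v in key.items()
        ):
            return node
    node = {"label": label, "props": dict(key)}
    nodes.append(node)
    return node


def add_term_to_value_set(nodes, name, term):
    # Assumed shape: MERGE keyed on name only.
    vs = merge(nodes, "ValueSet", name=name)
    vs["props"].setdefault("terms", []).append(term)
    return vs


def upsert_value_set(nodes, name, column_ref):
    # Assumed shape: MERGE keyed on column_ref, then SET name.
    vs = merge(nodes, "ValueSet", column_ref=column_ref)
    vs["props"]["name"] = name
    return vs


def value_set_name_counts(nodes):
    """Count :ValueSet nodes per name, like RETURN vs.name, count(*)."""
    return Counter(
        n["props"].get("name") for n in nodes if n["label"] == "ValueSet"
    )


graph = []
add_term_to_value_set(graph, "status_codes", "A")          # creates a node with no column_ref
upsert_value_set(graph, "status_codes", "orders.status")   # no match on column_ref, creates a second node
print(value_set_name_counts(graph))
```

In this model the second call finds no node with the given column_ref, so two nodes named "status_codes" exist afterwards. Reversing the call order yields a single node, which is why the bug depends on ordering.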

To reproduce

  1. Run upsert_decoded_values on assertions that include HAS_DECODED_VALUE.
  2. Query: MATCH (vs:ValueSet) RETURN vs.name, count(*). Any ValueSet that had decoded terms shows a count greater than 1.
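
To list only the duplicated names, the check in step 2 could be narrowed with a WITH ... WHERE filter. The sketch below is driver-agnostic: run_query is a hypothetical callable that takes Cypher text and returns row dicts, and fake_run_query returns a made-up row purely to show the shape of the result.

```python
# Cypher that returns only names carried by more than one :ValueSet node.
DUPLICATE_VALUE_SETS = """
MATCH (vs:ValueSet)
WITH vs.name AS name, count(*) AS n
WHERE n > 1
RETURN name, n
ORDER BY n DESC
"""


def find_duplicate_value_sets(run_query):
    """Map each duplicated ValueSet name to its node count.

    `run_query` is a hypothetical callable: Cypher text in, iterable of row dicts out.
    """
    return {row["name"]: row["n"] for row in run_query(DUPLICATE_VALUE_SETS)}


def fake_run_query(cypher):
    # Stand-in for a real database call; the row is illustrative, not real data.
    return [{"name": "status_codes", "n": 2}]


duplicates = find_duplicate_value_sets(fake_run_query)
print(duplicates)
```

An empty result from the real database would mean no name is shared by more than one :ValueSet node.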

Expected behavior
One :ValueSet per (name, column_ref) after a build.
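
One way to get this behavior, sketched in a toy in-memory model of MERGE, is to have both code paths MERGE on the same (name, column_ref) key. This assumes add_term_to_value_set can be given the column_ref, which the issue does not confirm. The signatures and sample values below are hypothetical.

```python
def merge(nodes, label, **key):
    """Toy MERGE: return the first node matching label and all key props, else create one."""
    for node in nodes:
        if node["label"] == label and all(
            node["props"].get(k) == v for k, v in key.items()
        ):
            return node
    node = {"label": label, "props": dict(key)}
    nodes.append(node)
    return node


def add_term_to_value_set(nodes, name, column_ref, term):
    # Hypothetical signature: column_ref is passed in so the key matches the upsert.
    vs = merge(nodes, "ValueSet", name=name, column_ref=column_ref)
    vs["props"].setdefault("terms", []).append(term)
    return vs


def upsert_value_set(nodes, name, column_ref):
    # Same (name, column_ref) key as add_term_to_value_set.
    return merge(nodes, "ValueSet", name=name, column_ref=column_ref)


graph = []
add_term_to_value_set(graph, "status_codes", "orders.status", "A")
upsert_value_set(graph, "status_codes", "orders.status")
value_sets = [n for n in graph if n["label"] == "ValueSet"]
print(len(value_sets))
```

With a shared key, the call order no longer matters: whichever function runs first creates the node and the other one matches it.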

Environment

  • Affects all environments
  • src/sema/graph/materializer_utils.py:upsert_decoded_values

Labels: bug