feat: support omop emb 0.5.0 #9
The README mentions ClassIDEnum.HIERARCHICAL, but it should be ClassIDEnum.HIERARCHY. It also contains stale references: rank_paths, kg.find_shortest_paths, and kg.rank_paths.
In paths.py, reconstruct_paths sets standard=False on every node, so the path metadata is wrong.
EdgeView.from_query() depends on positional column order rather than column names.
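To illustrate the name-based alternative, here is a minimal sketch using only the standard library. EdgeRow and from_row are hypothetical stand-ins, not the actual EdgeView API, and the column names are assumed from the OMOP concept_relationship table. With SQLAlchemy, the analogous access would go through the row's name-based mapping rather than tuple indices.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeRow:
    """Hypothetical stand-in for an edge view; not the omop-graph class."""

    concept_id_1: int
    concept_id_2: int
    relationship_id: str

    @classmethod
    def from_row(cls, row: sqlite3.Row) -> "EdgeRow":
        # Look up each field by column name, so reordering the SELECT
        # list cannot silently swap values.
        return cls(
            concept_id_1=row["concept_id_1"],
            concept_id_2=row["concept_id_2"],
            relationship_id=row["relationship_id"],
        )


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # enables access by column name
conn.execute(
    "CREATE TABLE concept_relationship "
    "(concept_id_1 INTEGER, concept_id_2 INTEGER, relationship_id TEXT)"
)
conn.execute("INSERT INTO concept_relationship VALUES (1, 2, 'Maps to')")

# The same row selected with two different column orders.
a = EdgeRow.from_row(
    conn.execute(
        "SELECT concept_id_1, concept_id_2, relationship_id "
        "FROM concept_relationship"
    ).fetchone()
)
b = EdgeRow.from_row(
    conn.execute(
        "SELECT relationship_id, concept_id_2, concept_id_1 "
        "FROM concept_relationship"
    ).fetchone()
)
```

Both queries build the same object, which is the property a positional constructor cannot guarantee.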
If you add a synonym (bool) field to LabelMatch and populate it from KnowledgeGraph.concept_lookup(), LabelMatchGroupView can faithfully report direct vs. synonym matches. At the moment it only returns exact/partial/fulltext/embedding.
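A rough sketch of what the extra field could look like. LabelMatchSketch, MatchKind, and describe are hypothetical names for illustration only, and the real LabelMatch may carry different fields.

```python
from dataclasses import dataclass
from enum import Enum


class MatchKind(Enum):
    # The four kinds named in the comment above.
    EXACT = "exact"
    PARTIAL = "partial"
    FULLTEXT = "fulltext"
    EMBEDDING = "embedding"


@dataclass(frozen=True)
class LabelMatchSketch:
    """Hypothetical shape; the real LabelMatch may differ."""

    concept_id: int
    label: str
    kind: MatchKind
    synonym: bool = False  # True when the hit came from a synonym label


def describe(match: LabelMatchSketch) -> str:
    # Report the label source (direct vs. synonym) alongside the match kind.
    source = "synonym" if match.synonym else "direct"
    return f"{source}/{match.kind.value}"
```

Defaulting synonym to False keeps existing call sites working while the lookup code is updated to set it.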
LabelMatchGroupView.from_matches assumes its input is pre-sorted instead of applying an explicit ranking rule. Either sort explicitly, or stop treating ms[0] as the 'best' match by definition.
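One way to make the rule explicit is to pick the best match with a key function instead of relying on input order. The priority order in KIND_RANK, the score field, and the tie-break on concept_id are all assumptions for this sketch, not the project's actual ranking.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumed priority: a lower rank is a better match kind.
KIND_RANK = {"exact": 0, "partial": 1, "fulltext": 2, "embedding": 3}


@dataclass(frozen=True)
class Match:
    """Hypothetical match record used only for this illustration."""

    concept_id: int
    kind: str
    score: float


def best_match(matches: Sequence[Match]) -> Match:
    if not matches:
        raise ValueError("best_match() needs at least one match")
    # Explicit rule: best kind first, then highest score, then lowest
    # concept_id as a deterministic tie-break.
    return min(
        matches,
        key=lambda m: (KIND_RANK[m.kind], -m.score, m.concept_id),
    )
```

With this in place the result no longer depends on the order in which the caller happens to supply the matches.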
In paths.py, should we rename num_hops in find_standard_paths to max_standard_hops or similar? The function specifically traverses only standard-to-standard relationships. Alternatively, we could add a parameter allow_non_standard_intermediates: bool = False. The reason is that for the MCP exploration/view use case, users will likely expect broader graph navigation.
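For discussion, a small sketch of how such a flag could gate a breadth-first search. This is not the find_standard_paths implementation: the graph is a plain dict, find_paths is a hypothetical name, and the filter is applied to intermediate nodes only (the endpoints are not checked).

```python
from collections import deque
from typing import Dict, List, Set


def find_paths(
    edges: Dict[int, List[int]],
    standard: Set[int],
    source: int,
    target: int,
    max_hops: int,
    allow_non_standard_intermediates: bool = False,
) -> List[List[int]]:
    """Return all simple paths from source to target with at most max_hops edges."""
    paths: List[List[int]] = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop budget used up
        for nxt in edges.get(node, []):
            if nxt in path:
                continue  # keep paths simple (no cycles)
            is_intermediate = nxt != target
            if (
                is_intermediate
                and nxt not in standard
                and not allow_non_standard_intermediates
            ):
                continue  # default: only standard intermediates
            queue.append(path + [nxt])
    return paths
```

Keeping the default at False preserves the current standard-only behaviour, while exploration tools can opt in to the broader traversal.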
find_standard_paths drops edges when multiple outgoing edges point to the same ID. This is an issue for concepts joined by more than one valid relationship: select * from (select concept_id_1, concept_id_2, count(distinct(relationship_id)) as c fr
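One common way this kind of loss happens is an adjacency structure keyed by target ID, where a second relationship to the same target overwrites the first. Whether that is the actual cause in find_standard_paths is an assumption; the sketch below only demonstrates the failure mode and a lossless alternative on made-up rows.

```python
from collections import defaultdict

# Made-up rows: (concept_id_1, concept_id_2, relationship_id).
rows = [
    (1, 2, "Maps to"),
    (1, 2, "Is a"),  # second relationship between the same pair
    (1, 3, "Is a"),
]

# Lossy: keyed by target ID, so the later edge to 2 overwrites the earlier one.
lossy = {}
for src, dst, rel in rows:
    lossy.setdefault(src, {})[dst] = rel

# Lossless: keep one (target, relationship) entry per edge.
adjacency = defaultdict(list)
for src, dst, rel in rows:
    adjacency[src].append((dst, rel))
```

Here lossy[1] keeps only two of the three edges, while adjacency[1] retains all of them.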
PathProfile.from_path() cannot handle the valid empty-path (zero-hop) case. Profiling or explaining a zero-hop path will fail at runtime instead of returning a sensible self-match profile.
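A sketch of the kind of guard this suggests. ProfileSketch and profile_from_path are hypothetical, and the real PathProfile presumably has more fields; the point is only that an empty path maps to a well-defined self-match result.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Assumed step shape: (source_id, relationship_id, target_id).
Step = Tuple[int, str, int]


@dataclass(frozen=True)
class ProfileSketch:
    """Hypothetical, reduced profile; not the real PathProfile."""

    hops: int
    relationships: Tuple[str, ...]
    is_self_match: bool


def profile_from_path(steps: Iterable[Step]) -> ProfileSketch:
    steps = tuple(steps)
    if not steps:
        # Zero-hop path: source and target are the same concept.
        return ProfileSketch(hops=0, relationships=(), is_self_match=True)
    return ProfileSketch(
        hops=len(steps),
        relationships=tuple(rel for _, rel, _ in steps),
        is_self_match=False,
    )
```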
traverse()
The scoring docs say relevance is composite. I think the current behaviour is intentional, but perhaps a parameter to override it would be worthwhile?
import pandas as pd
import os
from pathlib import Path
This requires OMOP_VOCABULARY_DIR to be set even if you have an existing DB and don't want to import a new vocabulary.
This requires OMOP_CDM_DB_DRIVER etc. even if you have the full engine string.
Regarding the first comment: yes, as we store the new data there to have it all in one place. But maybe there is a better way, where we either don't need to store it on file or store it in a temporary file. Open for discussion.
Regarding the second: has this been tested? I have not experienced this, but I probably need to investigate it properly.
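To make the two requests in this thread concrete, here is a sketch of a more lenient configuration lookup. Only OMOP_VOCABULARY_DIR and OMOP_CDM_DB_DRIVER appear in this thread; OMOP_CDM_DB_URL, the other OMOP_CDM_DB_* names, and both function names are hypothetical, so treat this as an illustration of the fallback order rather than the project's actual settings.

```python
import os
from typing import Mapping, Optional


def resolve_engine_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer a full engine string; fall back to individual parts."""
    env = os.environ if env is None else env
    url = env.get("OMOP_CDM_DB_URL")  # hypothetical variable name
    if url:
        return url
    # Only demand the individual parts when no full URL was given.
    required = (
        "OMOP_CDM_DB_DRIVER",
        "OMOP_CDM_DB_USER",  # hypothetical
        "OMOP_CDM_DB_HOST",  # hypothetical
        "OMOP_CDM_DB_NAME",  # hypothetical
    )
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Set OMOP_CDM_DB_URL or all of: " + ", ".join(missing)
        )
    return "{}://{}@{}/{}".format(*(env[key] for key in required))


def resolve_vocabulary_dir(
    env: Optional[Mapping[str, str]] = None, importing: bool = False
) -> Optional[str]:
    """Require OMOP_VOCABULARY_DIR only when a vocabulary import is requested."""
    env = os.environ if env is None else env
    path = env.get("OMOP_VOCABULARY_DIR")
    if importing and not path:
        raise RuntimeError(
            "OMOP_VOCABULARY_DIR must be set to import a vocabulary"
        )
    return path or None
```

Passing env explicitly keeps the functions easy to test without touching the real environment.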
populate_test_data(session)

@app.command()
def relationship_classification(
If you use Postgres without installing pgvector (valid, as "psycopg[binary]>=3.1.0" is in the core deps), the psycopg binary does not support the current load functionality, which needs psycopg2:
uv run omop-graph relationship-classification --pred-class-dir ./docs
2026-05-13 10:23:04 | orm_loader.loaders.loading_helpers | ERROR | Error during bulk load via COPY: 'Cursor' object has no attribute 'copy_expert'
2026-05-13 10:23:04 | orm_loader.tables.loadable_table | WARNING | COPY failed for _staging_relationship_class: 'Cursor' object has no attribute 'copy_expert'
2026-05-13 10:23:04 | orm_loader.loaders.loading_helpers | ERROR | Error during bulk load via COPY: 'Cursor' object has no attribute 'copy_expert'
2026-05-13 10:23:04 | orm_loader.tables.loadable_table | WARNING | COPY failed for _staging_relationship_mapping: 'Cursor' object has no attribute 'copy_expert'
This requires further investigation on my side, as I have not yet encountered the error, but maybe that is because I accidentally had the old psycopg installed.
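The log is consistent with a psycopg 3 cursor, which exposes COPY through cursor.copy() as a context manager, whereas psycopg2 cursors provide copy_expert(). A possible shape for a loader that supports both is sketched below; copy_from_buffer is a hypothetical helper, not the orm_loader function, and it assumes the caller passes a raw DBAPI cursor.

```python
from typing import IO


def copy_from_buffer(
    cursor, copy_sql: str, buffer: IO, chunk_size: int = 8192
) -> str:
    """Run a COPY ... FROM STDIN statement with either driver generation.

    Returns the name of the code path taken, for logging.
    """
    if hasattr(cursor, "copy_expert"):
        # psycopg2: the cursor reads from the file-like object itself.
        cursor.copy_expert(copy_sql, buffer)
        return "psycopg2"
    # psycopg 3: cursor.copy() returns a context manager to write into.
    with cursor.copy(copy_sql) as copy:
        while chunk := buffer.read(chunk_size):
            copy.write(chunk)
    return "psycopg3"
```

Dispatching on the attribute rather than on the installed package keeps the helper working when both drivers are present in the environment.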
self.session_factory = sessionmaker(bind=self.cdm_engine, future=True)

# Populate the relationshipcache
with self.session_factory() as session:
Either add a fallback for this compulsory import of the relationship tables, or document clearly that the graph doesn't support concept-only usage (I think the latter, but either way?).
I can't really think why you'd want to use concept_view(), concept_lookup(), or concept_id_by_code() without the relationships functionality. This is too heavy for that, I suspect?
Decided to give a better error message.
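For reference, an early check along these lines might look like the sketch below. The table names are taken from the OMOP vocabulary schema, but which tables the graph actually requires, as well as the exception and function names, are assumptions made for illustration.

```python
from typing import Iterable

# Assumed requirement: the two OMOP relationship tables.
REQUIRED_TABLES = ("concept_relationship", "relationship")


class MissingRelationshipTablesError(RuntimeError):
    """Raised when the relationship cache cannot be populated."""


def check_relationship_tables(available: Iterable[str]) -> None:
    missing = sorted(set(REQUIRED_TABLES) - set(available))
    if missing:
        # Name the missing tables and state the limitation explicitly.
        raise MissingRelationshipTablesError(
            "The graph needs the relationship tables to build its cache; "
            "missing: " + ", ".join(missing) + ". "
            "Concept-only usage is not supported."
        )
```

Failing before the cache is populated gives the user the missing table names up front instead of a database error from deep inside the loader.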
…flect its purpose
Motivation
omop-emb introduces a new storage and DB concept with significant, breaking changes to support a local-first, backend-agnostic storage solution for the embeddings. To include these changes and the fixes that come with a successful PR, we need to prepare omop-graph for the new incoming interfaces.

Closes #8