Question regarding the implementation of Hierarchical Clustering and U-Retrieval algorithm #26

@zhangjh00914

Hello, thank you very much for your work on Medical Graph RAG; the paper is excellent. I am reading the source code and trying to reproduce the experimental results in the paper, but I have found some inconsistencies when comparing the paper's description and the code implementation, and I would like to ask you about them.
1. Regarding hierarchical clustering: the repository contains the nano_graphrag code (which implements Leiden clustering), but the main logic of run.py (Standard Mode, the else branch) does not appear to call these clustering algorithms to build the semantic tree.
2. Regarding the retrieval logic (U-Retrieval): the paper describes the core of U-Retrieval as top-down navigation over a tree structure. However, the seq_ret function in retrieve.py retrieves all Summary nodes from the database, then loops over them and calls the LLM to score each one. This looks more like a full linear scan driven by the LLM than the tree-based retrieval algorithm described in the paper.
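To make the distinction in point 2 concrete, here is a minimal sketch of the two strategies as I understand them. It is not code from the repository or the paper: the `Node` class, `linear_scan`, `top_down`, and the `word_overlap` scorer (a cheap stand-in for the LLM scoring call) are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical scorer signature: (query, summary) -> relevance score.
# In the real pipeline this would presumably be an LLM call.
Scorer = Callable[[str, str], float]


@dataclass
class Node:
    """Hypothetical tree node: a summary plus child nodes (empty for a leaf)."""
    summary: str
    children: List["Node"] = field(default_factory=list)


def linear_scan(summaries: List[str], query: str, score: Scorer) -> str:
    """Score every summary once and return the best one (what seq_ret appears to do)."""
    return max(summaries, key=lambda s: score(query, s))


def top_down(root: Node, query: str, score: Scorer) -> str:
    """Walk from the root to a leaf, following the best-scoring child at each level."""
    node = root
    while node.children:
        node = max(node.children, key=lambda c: score(query, c.summary))
    return node.summary


def word_overlap(query: str, summary: str) -> float:
    """Toy scorer: number of lowercase words shared by query and summary."""
    return float(len(set(query.lower().split()) & set(summary.lower().split())))


# Tiny two-level example tree (made-up summaries).
tree = Node("root", [
    Node("heart disease", [
        Node("heart failure treatment"),
        Node("heart attack symptoms"),
    ]),
    Node("lung disease", [
        Node("lung cancer screening"),
        Node("asthma inhaler dosage"),
    ]),
])
leaves = [leaf.summary for group in tree.children for leaf in group.children]

query = "heart attack symptoms"
print(linear_scan(leaves, query, word_overlap))
print(top_down(tree, query, word_overlap))
```

With a linear scan, the number of scoring calls grows with the total number of summaries, whereas the top-down walk only scores the children along one root-to-leaf path. That difference is what I expected to find in the released code.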
I would like to ask: Is the currently open-source code a simplified demo? Are there plans to open-source the "12-layer dynamic clustering tree" construction and the actual U-Retrieval navigation code described in the paper? Or how can I reproduce the efficiency described in the paper based on the current code?
I look forward to your reply. Thank you!
