Add FalkorDB Knowledge Graph with GraphRAG for LEGI, JADE, and BOFIP#22

Closed
Copilot wants to merge 2 commits into main from copilot/setup-falkordb-graphrag-knowledge-graph

Copilot AI commented Mar 16, 2026

Adds FalkorDB as a graph database to build a Knowledge Graph over processed legal data (LEGI, JADE, BOFIP), with GraphRAG SDK integration for entity-aware Q&A.

Design decision: Option B — parallel population during processing

Graph nodes and relationships are written to FalkorDB in the same pass as PostgreSQL inserts, while the raw XML is being parsed. This avoids a second ETL pass and preserves structural metadata (e.g. LIENS cross-references) only available at parse time. All graph ops are wrapped in try/except — FalkorDB unavailability never interrupts the main pipeline.
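The "never interrupts the main pipeline" guarantee can be illustrated with a small decorator. This is a minimal sketch, not the PR's code: `best_effort` and `upsert_node_stub` are hypothetical names, and the stub stands in for a real FalkorDB write.

```python
import functools
import logging

logger = logging.getLogger("graph")


def best_effort(func):
    """Run a graph operation; log any failure and return None instead of raising."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # The graph is treated as optional: record the error and carry on.
            logger.exception("graph operation %s failed; continuing", func.__name__)
            return None
    return wrapper


@best_effort
def upsert_node_stub(doc_id, fail=False):
    # Hypothetical stand-in for a FalkorDB write; `fail` simulates an outage.
    if fail:
        raise ConnectionError("FalkorDB unreachable")
    return doc_id
```

With this shape, the PostgreSQL insert path calls the wrapped function unconditionally and a graph outage only shows up in the logs. The trade-off is that missed writes have to be recovered later, for example by a back-fill from PostgreSQL.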

Infrastructure

  • docker-compose.yml: Add falkordb service (falkordb/falkordb:v4.4.4) with healthcheck, named volume, port 6379, and Airflow depends_on.
  • .env.example / config/config.py / config/__init__.py: Add FALKORDB_HOST, FALKORDB_PORT, FALKORDB_USERNAME, FALKORDB_PASSWORD, FALKORDB_GRAPH_NAME.
  • pyproject.toml: Add falkordb==1.6.0 and graphrag-sdk==0.8.2.
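As an illustration of how the five `FALKORDB_*` variables might be read, here is a minimal sketch. The `FalkorDBSettings` class, the `load_falkordb_settings` function, and the defaults (other than port 6379, which the compose service exposes) are assumptions for illustration, not the contents of config/config.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class FalkorDBSettings:
    host: str
    port: int
    username: Optional[str]
    password: Optional[str]
    graph_name: str


def load_falkordb_settings(env: Mapping[str, str] = os.environ) -> FalkorDBSettings:
    """Build settings from environment variables (defaults are illustrative)."""
    return FalkorDBSettings(
        host=env.get("FALKORDB_HOST", "localhost"),
        port=int(env.get("FALKORDB_PORT", "6379")),
        # Empty strings are treated as "no credentials configured".
        username=env.get("FALKORDB_USERNAME") or None,
        password=env.get("FALKORDB_PASSWORD") or None,
        graph_name=env.get("FALKORDB_GRAPH_NAME", "legal_kg"),  # assumed default
    )
```

Passing the environment as a mapping keeps the loader easy to exercise in tests without touching the real process environment.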

Graph ontology (database/graph_manage.py)

| Node | Source |
| --- | --- |
| LegalText | LEGI articles |
| JudicialDecision | JADE decisions |
| TaxGuidance | BOFIP guidance |
| LegalCode | Code civil, Code pénal, … |
| Ministry | Issuing ministries |
| Jurisdiction | Administrative courts |
| Chunk | Text chunks (GraphRAG) |

Relationships: BELONGS_TO_CODE, ISSUED_BY, REFERENCES (from LIENS metadata), DECIDED_BY, PART_OF.
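To make the ontology concrete, the sketch below builds an idempotent Cypher `MERGE` statement for one edge. The labels and relationship types come from the lists above; the `id` property key, the helper name `merge_edge_query`, and the Chunk-to-LegalText direction used in the test are assumptions, since the description does not say which node pairs each relationship connects.

```python
NODE_LABELS = {
    "LegalText", "JudicialDecision", "TaxGuidance",
    "LegalCode", "Ministry", "Jurisdiction", "Chunk",
}
REL_TYPES = {"BELONGS_TO_CODE", "ISSUED_BY", "REFERENCES", "DECIDED_BY", "PART_OF"}


def merge_edge_query(src_label: str, rel_type: str, dst_label: str) -> str:
    """Return a parameterized Cypher query that upserts both nodes and the edge.

    Labels and relationship types cannot be passed as query parameters in
    Cypher, so they are validated against the ontology before interpolation.
    """
    if src_label not in NODE_LABELS or dst_label not in NODE_LABELS:
        raise ValueError("unknown node label")
    if rel_type not in REL_TYPES:
        raise ValueError("unknown relationship type")
    return (
        f"MERGE (a:{src_label} {{id: $src_id}}) "  # `id` is an assumed key
        f"MERGE (b:{dst_label} {{id: $dst_id}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )
```

Because every clause is a `MERGE`, re-running the same statement for a document that was already processed does not create duplicate nodes or edges.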

Key functions

  • init_graph_schema() — creates FalkorDB indexes
  • upsert_{legi,jade,bofip}_{node,chunk}() — parallel-population upserts
  • populate_graph_from_postgres() — streaming back-fill from existing PostgreSQL data (server-side cursors)
  • build_graphrag_knowledge_graph() — instantiates a graphrag_sdk.KnowledgeGraph with the full ontology for entity extraction / Q&A
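The streaming back-fill can be sketched independently of any database driver. The helpers below (`iter_batches` and `backfill`, both hypothetical names) consume rows lazily in fixed-size batches; with psycopg2, for instance, a named cursor is server-side and could serve as the `rows` iterable, though whether the PR does exactly this is not shown here.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

Row = TypeVar("Row")


def iter_batches(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield lists of at most `batch_size` rows without loading everything."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def backfill(rows: Iterable[Row], upsert: Callable[[Row], object],
             batch_size: int = 500) -> int:
    """Apply `upsert` to every row, batch by batch; return the row count."""
    total = 0
    for batch in iter_batches(rows, batch_size):
        for row in batch:
            upsert(row)
        total += len(batch)
    return total
```

Only one batch is held in memory at a time, which is the point of pairing the back-fill with a server-side cursor on large tables.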

Processing pipeline (files_processing.py)

Graph upserts added after each PostgreSQL insert for LEGI and JADE handlers. Added a new BOFIPTEXT*.xml handler (matching the existing DILA XML pattern) and a "bofip" type branch in process_data() to handle the BOFIP tgz archives.

# After PostgreSQL insert in LEGI handler
upsert_legi_node(doc_id=cid, nature=nature, category=category,
                 ministry=ministry, links=links, ...)
for chunk_id, *_ in data_to_insert:
    upsert_legi_chunk(chunk_id=chunk_id, doc_id=cid)
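Routing files to the new BOFIP handler could look roughly like the following. Only the `BOFIPTEXT*.xml` pattern comes from the description above; the LEGI and JADE patterns, the `HANDLER_PATTERNS` table, and the `classify` function are hypothetical placeholders for whatever files_processing.py actually uses.

```python
from fnmatch import fnmatchcase
from typing import Optional

HANDLER_PATTERNS = [
    ("BOFIPTEXT*.xml", "bofip"),  # pattern named in the PR description
    ("LEGIARTI*.xml", "legi"),    # hypothetical pattern, for illustration
    ("JADETEXT*.xml", "jade"),    # hypothetical pattern, for illustration
]


def classify(filename: str) -> Optional[str]:
    """Return the handler key for a file name, or None if nothing matches."""
    for pattern, kind in HANDLER_PATTERNS:
        # fnmatchcase is case-sensitive on every platform.
        if fnmatchcase(filename, pattern):
            return kind
    return None
```

Keeping the patterns in one table means adding a source like BOFIP is a one-line change plus its handler.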

BOFIP PostgreSQL table

Added BOFIP table schema to database_manage.py (mirrors the LEGI/JADE pattern with chunk_id, doc_id, nature, category, title, date, text, chunk_text, vector embedding column).

Airflow DAG (airflow_config/dags/knowledge_graph.py)

New KNOWLEDGE_GRAPH DAG with two tasks: init_graph_schema → populate_graph_from_postgres, for bootstrapping or back-filling existing deployments.

Original prompt

This section details the original issue you should resolve

<issue_title>Falkdor DB setup with GraphRAG and Knowledge graph</issue_title>
<issue_description>This is a critical setup so be accurate and analyze well before.
We are going to add FalkorDB to the docker compose first.

Then, based on a deep understanding of how JADE, BOFIP and LEGI are processed, we will add nodes and relationships to a FalkorDB graph.

We then aim to build a Knowledge Graph based on the data we processed. Here you need to make a judgment call: either we leverage data already processed from the database, or we populate the graph while we are processing the raw data, in parallel with populating the PostgreSQL database.

Once you have the ontology and graph, you will design a Knowledge Graph. I suggest this SDK, but you are free to choose another:
https://github.com/FalkorDB/GraphRAG-SDK/

Take all your time for the design.</issue_description>

Comments on the Issue (you are @copilot in this section)


Co-authored-by: yawo <981127+yawo@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add FalkorDB setup with GraphRAG and knowledge graph" to "Add FalkorDB Knowledge Graph with GraphRAG for LEGI, JADE, and BOFIP" Mar 16, 2026
Copilot AI requested a review from yawo March 16, 2026 17:44
Copilot finished work on behalf of yawo March 16, 2026 17:44
Owner

@yawo yawo left a comment

Close this

Author

Copilot AI commented Mar 19, 2026

Close this

I don't have the ability to close pull requests directly. You can close it from the GitHub UI by clicking "Close pull request" at the bottom of the PR page.

@yawo yawo closed this Mar 19, 2026
Copilot AI requested a review from yawo March 19, 2026 08:51
Copilot stopped work on behalf of yawo due to an error March 19, 2026 08:51

Development

Successfully merging this pull request may close these issues.

Falkdor DB setup with GraphRAG and Knowledge graph

2 participants