Medical RAG Pipeline with Multi-Hop Search by iberi22 · Pull Request #44 · iberi22/isar_agent_memory

iberi22 · 2026-04-16T22:32:39Z

Implemented a specialized medical RAG pipeline to address complex medical questions and multi-hop retrieval.

Key features:

- Query Normalization: Expands common medical abbreviations (e.g., "TA" -> "tensión arterial").
- Query Decomposition: Uses an LLM to split compound queries into sub-questions for more accurate multi-hop search.
- Hybrid Retrieval: Merges vector and text search results across all sub-queries.
- Re-Ranking Support: Integrates a re-ranking stage into the pipeline.
- Evidence Citation: Automatically identifies and cites the source nodes used in the generated response.
- Safety: Includes mandatory medical disclaimers in the final output.

All new components are thoroughly tested and follow the existing design patterns in the repository.

Fixes #39

PR created automatically by Jules for task 11294999718864016196 started by @iberi22

- Added `MemoryPipeline` and `PipelineStage` base classes for modular RAG flows.
- Implemented `MedicalRagPipeline` with stages for normalization, decomposition, hybrid retrieval, re-ranking, and citation.
- Created `MedicalPromptBuilder` for medical-specific prompts and safety disclaimers in Spanish.
- Added comprehensive unit and integration tests in `test/medical_rag_pipeline_test.dart`.
- Exported new components in `lib/isar_agent_memory.dart`.
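For readers without the diff at hand, the stage-based design described in the commit message can be sketched roughly as follows. Only the names `MemoryPipeline`, `PipelineStage`, `RagContext`, `currentQuery`, `decomposedQueries`, and the `process` signature appear in this PR; the `run` method, the constructor shapes, and `TrimStage` are assumptions made for illustration, not the repository's actual code.

```dart
// Minimal stand-in for the PR's RagContext (the real class has more fields).
class RagContext {
  String currentQuery;
  List<String> decomposedQueries = [];
  RagContext(this.currentQuery);
}

// Each stage receives the context, may mutate it, and hands it on.
abstract class PipelineStage {
  Future<RagContext> process(RagContext context);
}

// Hypothetical runner: applies the stages in order.
class MemoryPipeline {
  final List<PipelineStage> stages;
  MemoryPipeline(this.stages);

  Future<RagContext> run(String query) async {
    var context = RagContext(query);
    for (final stage in stages) {
      context = await stage.process(context);
    }
    return context;
  }
}

// Hypothetical domain-neutral stage used only to exercise the runner.
class TrimStage implements PipelineStage {
  @override
  Future<RagContext> process(RagContext context) async {
    context.currentQuery = context.currentQuery.trim();
    return context;
  }
}

Future<void> main() async {
  final result = await MemoryPipeline([TrimStage()]).run('  dolor torácico  ');
  print(result.currentQuery);
}
```

Keeping stages behind one small interface is what would let domain-specific stages live in a separate package while the runner stays generic.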

coderabbitai · 2026-04-16T22:32:47Z

Review skipped

Draft detected.

gemini-code-assist

Code Review

This pull request introduces a medical-specific RAG pipeline, including stages for query normalization, decomposition, hybrid retrieval, and evidence citation. Feedback focuses on improving the robustness of citation matching to avoid false positives, optimizing performance by pre-compiling regex patterns and parallelizing retrieval tasks, and enhancing multi-language support by removing hardcoded language triggers in the decomposition stage.

gemini-code-assist · 2026-04-16T22:34:35Z

+    for (var i = 0; i < context.retrievedNodes.length; i++) {
+      final node = context.retrievedNodes[i].node;
+      final citation = '[${i + 1}]';
+      if (context.generatedResponse!.contains(citation)) {
+        cited.add(node);
+      }
+    }


The citation matching logic using String.contains is prone to false positives. For example, a check for [1] will return true if the response contains [10], [11], etc. This results in incorrect evidence attribution. Using a regular expression with a negative lookahead for digits ensures that only the exact citation is matched.

Suggested change

-    for (var i = 0; i < context.retrievedNodes.length; i++) {
-      final node = context.retrievedNodes[i].node;
-      final citation = '[${i + 1}]';
-      if (context.generatedResponse!.contains(citation)) {
-        cited.add(node);
-      }
-    }
+    for (var i = 0; i < context.retrievedNodes.length; i++) {
+      final node = context.retrievedNodes[i].node;
+      final citation = '[${i + 1}]';
+      // Use regex with negative lookahead to ensure exact match (e.g., avoid matching [1] in [10])
+      final pattern = RegExp(RegExp.escape(citation) + r'(?!\d)');
+      if (context.generatedResponse!.contains(pattern)) {
+        cited.add(node);
+      }
+    }
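Before adopting the suggestion, it may be worth checking the claim directly. Because the marker built in the diff includes the closing bracket, `[1]` is not a substring of `[10]`, so the plain `contains` check does not produce that particular false positive; the lookahead only rejects a digit placed immediately after the closing bracket. The sketch below shows this and one alternative that extracts every bracketed number in a single pass. `citedIndices` and its parameters are hypothetical names, not part of the PR.

```dart
// Hypothetical helper: returns zero-based indices of nodes cited as [n].
Set<int> citedIndices(String response, int nodeCount) {
  final marker = RegExp(r'\[(\d+)\]');
  final cited = <int>{};
  for (final match in marker.allMatches(response)) {
    final n = int.tryParse(match.group(1)!);
    // Ignore unparsable numbers and markers outside the retrieved range.
    if (n != null && n >= 1 && n <= nodeCount) cited.add(n - 1);
  }
  return cited;
}

void main() {
  const response = 'See [10] and [12].';

  // The bracketed marker does not match inside [10].
  print(response.contains('[1]')); // false

  // The suggested pattern gives the same answer for this input.
  final pattern = RegExp(RegExp.escape('[1]') + r'(?!\d)');
  print(response.contains(pattern)); // false

  // Single-pass extraction: only [2] is in range for three nodes.
  print(citedIndices('See [2] and [10].', 3)); // {1}
}
```

The single-pass version scans the response once instead of once per retrieved node, which matters only for long responses or large result sets.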

gemini-code-assist · 2026-04-16T22:34:35Z

+  @override
+  Future<RagContext> process(RagContext context) async {
+    String normalized = context.currentQuery;
+    abbreviations.forEach((abbrev, expansion) {
+      // Simple regex-like replacement for whole words
+      final regex = RegExp('\\b$abbrev\\b', caseSensitive: false);
+      normalized = normalized.replaceAll(regex, expansion);
+    });
+    context.currentQuery = normalized;
+    return context;
+  }
+}


Creating a new RegExp instance for every abbreviation on every call to process is inefficient. Since the abbreviations are static, these patterns should be pre-compiled to improve performance, especially if this stage is called frequently.

    static final Map<String, RegExp> _regexCache = abbreviations.map(
      (k, v) => MapEntry(k, RegExp(r'\b' + k + r'\b', caseSensitive: false)),
    );

    @override
    Future<RagContext> process(RagContext context) async {
      String normalized = context.currentQuery;
      _regexCache.forEach((abbrev, regex) {
        normalized = normalized.replaceAll(regex, abbreviations[abbrev]!);
      });
      context.currentQuery = normalized;
      return context;
    }
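A variation on the same idea, offered only as a sketch: compile one alternation pattern for all abbreviations and substitute in a single pass with `replaceAllMapped`. The map below is hypothetical; only the "TA" expansion comes from the PR description, and the sketch assumes the keys are stored in upper case. Keys are passed through `RegExp.escape` so that an abbreviation containing a regex metacharacter cannot change the pattern.

```dart
// Hypothetical abbreviation table; 'FC' is an assumed entry for illustration.
const abbreviations = <String, String>{
  'TA': 'tensión arterial',
  'FC': 'frecuencia cardíaca',
};

// Compiled once, on first use (top-level finals are lazily initialized).
final RegExp _abbrevPattern = RegExp(
  r'\b(' + abbreviations.keys.map(RegExp.escape).join('|') + r')\b',
  caseSensitive: false,
);

// Replaces every whole-word abbreviation in one pass over the query.
String normalize(String query) => query.replaceAllMapped(
      _abbrevPattern,
      // Matching is case-insensitive, so look the key up in upper case.
      (match) => abbreviations[match.group(1)!.toUpperCase()]!,
    );

void main() {
  print(normalize('paciente con ta alta'));
  // paciente con tensión arterial alta
}
```

Note that the word boundary keeps the `ta` inside `alta` untouched, while the standalone `ta` is expanded.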

gemini-code-assist · 2026-04-16T22:34:35Z

+        context.currentQuery.contains(' y ') ||
+        context.currentQuery.contains(' e ')) {


The decomposition logic relies on hardcoded Spanish conjunctions (' y ', ' e '). This limits the pipeline's effectiveness in other languages, despite other parts of the code attempting to support English. Consider making these triggers language-aware.
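One way to make the trigger language-aware, sketched under the assumption that the caller knows the query's language code. The `conjunctionsByLanguage` table and `looksCompound` function are hypothetical names and do not exist in the PR.

```dart
// Hypothetical per-language conjunction table; extend as needed.
const conjunctionsByLanguage = <String, List<String>>{
  'es': ['y', 'e'],
  'en': ['and'],
};

// Returns true when the query contains a standalone conjunction
// for the given language, surrounded by whitespace.
bool looksCompound(String query, {String languageCode = 'es'}) {
  final words = conjunctionsByLanguage[languageCode];
  if (words == null) return false; // Unknown language: do not decompose.
  final pattern = RegExp(
    r'\s(' + words.map(RegExp.escape).join('|') + r')\s',
    caseSensitive: false,
  );
  return pattern.hasMatch(query);
}

void main() {
  print(looksCompound('fiebre y tos')); // true
  print(looksCompound('fever and cough', languageCode: 'en')); // true
  print(looksCompound('fever and cough')); // false
}
```

In a generic package the table would more plausibly be injected by the caller than hardcoded, which also fits the scope feedback later in this thread.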

gemini-code-assist · 2026-04-16T22:34:35Z

+    for (final q in context.decomposedQueries) {
+      final results = await memoryGraph.hybridSearch(q, topK: 5, alpha: 0.5);
+      for (final res in results) {
+        final existing = allResults[res.node.id];
+        if (existing == null || res.score > existing.score) {
+          allResults[res.node.id] = res;
+        }
+      }
+    }


The sub-queries generated during decomposition are processed sequentially. This can lead to high latency in the retrieval stage. Using Future.wait to execute these searches in parallel would significantly improve performance.

Suggested change

-    for (final q in context.decomposedQueries) {
-      final results = await memoryGraph.hybridSearch(q, topK: 5, alpha: 0.5);
-      for (final res in results) {
-        final existing = allResults[res.node.id];
-        if (existing == null || res.score > existing.score) {
-          allResults[res.node.id] = res;
-        }
-      }
-    }
+    final searchFutures = context.decomposedQueries.map(
+      (q) => memoryGraph.hybridSearch(q, topK: 5, alpha: 0.5),
+    );
+    final resultsList = await Future.wait(searchFutures);
+    for (final results in resultsList) {
+      for (final res in results) {
+        final existing = allResults[res.node.id];
+        if (existing == null || res.score > existing.score) {
+          allResults[res.node.id] = res;
+        }
+      }
+    }
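The suggestion above depends on `memoryGraph`, so here is a self-contained sketch of the same pattern with a stubbed search function. `Hit`, `Search`, `searchAll`, and `fakeSearch` are hypothetical stand-ins; only `Future.wait` and the best-score merge mirror the review comment.

```dart
// Hypothetical stand-in for a search result.
class Hit {
  final int nodeId;
  final double score;
  const Hit(this.nodeId, this.score);
}

typedef Search = Future<List<Hit>> Function(String query);

// Runs all sub-queries concurrently, then keeps the best score per node.
Future<Map<int, Hit>> searchAll(List<String> queries, Search search) async {
  final resultsList = await Future.wait(queries.map(search));
  final best = <int, Hit>{};
  for (final results in resultsList) {
    for (final hit in results) {
      final existing = best[hit.nodeId];
      if (existing == null || hit.score > existing.score) {
        best[hit.nodeId] = hit;
      }
    }
  }
  return best;
}

Future<void> main() async {
  // Stub search: node 1 is returned by both sub-queries with different scores.
  Future<List<Hit>> fakeSearch(String q) async =>
      q == 'a' ? [const Hit(1, 0.4), const Hit(2, 0.9)] : [const Hit(1, 0.7)];

  final merged = await searchAll(['a', 'b'], fakeSearch);
  print(merged[1]!.score); // 0.7
  print(merged.length); // 2
}
```

One trade-off to keep in mind: `Future.wait` completes with an error as soon as any one search fails, so a single failing sub-query discards the results of the others unless each future is wrapped with its own error handling.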

iberi22 · 2026-04-16T22:41:48Z

SCOPE CORRECTION: Please remove the file medical_rag_pipeline.dart from the PR. In its place, create lib/src/rag/pipeline_hooks.dart with generic interfaces:

- ReRankingHook (abstract class, no medical implementation)
- QueryExpansionHook (abstract class)
- CitationHook (abstract class)

The PR may use MemoryPipeline as a base, but without hardcoded medical logic.
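The requested file could look roughly like the sketch below. Only the three class names and the fact that they are abstract come from the comment above; every method name, signature, and type parameter is an assumption, as is the `BracketCitationHook` example implementation.

```dart
// Sketch of lib/src/rag/pipeline_hooks.dart; all signatures are assumed.

// Turns one query into one or more queries to search with.
abstract class QueryExpansionHook {
  Future<List<String>> expand(String query);
}

// Reorders retrieved candidates for a query.
abstract class ReRankingHook<T> {
  Future<List<T>> rerank(String query, List<T> candidates);
}

// Decides which candidates a generated response actually cites.
abstract class CitationHook<T> {
  List<T> cite(String response, List<T> candidates);
}

// Hypothetical domain-neutral implementation: cites candidates
// referenced by bracketed one-based markers such as [2].
class BracketCitationHook<T> implements CitationHook<T> {
  static final RegExp _marker = RegExp(r'\[(\d+)\]');

  @override
  List<T> cite(String response, List<T> candidates) {
    final indices = <int>{};
    for (final match in _marker.allMatches(response)) {
      final n = int.tryParse(match.group(1)!);
      if (n != null && n >= 1 && n <= candidates.length) indices.add(n - 1);
    }
    return [for (final i in indices) candidates[i]];
  }
}

void main() {
  final hook = BracketCitationHook<String>();
  print(hook.cite('Supported by [2].', ['doc-a', 'doc-b', 'doc-c']));
  // [doc-b]
}
```

With hooks like these, a downstream application could supply its own abbreviation expansion or disclaimer logic without this package knowing anything about the domain.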

iberi22 · 2026-04-16T23:40:59Z

SCOPE ISSUE: This PR contains medical logic that does NOT belong in isar_agent_memory (a generic package).

WHAT MUST BE REMOVED:

- lib/src/rag/medical_rag_pipeline.dart → remove
- lib/src/rag/medical_prompt_builder.dart → remove
- QueryNormalizationStage with hardcoded 'TA', 'FC', 'DM' → remove
- MedicalDisclaimer in the response → remove

WHAT MUST REMAIN:

- lib/src/rag/memory_pipeline.dart (generic, with NO medical logic)
- The associated tests

The medical-specific logic (RAG pipeline, prompt builder, abbreviations) must live in OrionHealth, not here.

Please remove the medical files and update the PR.

iberi22 · 2026-04-17T00:13:41Z

SCOPE VIOLATION: This PR contains hardcoded medical logic that does NOT belong in a generic package. Please create a new PR containing only the generic memory_pipeline.dart.

google-labs-jules Bot mentioned this pull request Apr 16, 2026

Feat: Medical RAG Pipeline with Multi-Hop Search #39

Open

iberi22 closed this Apr 17, 2026

iberi22 wants to merge 1 commit into main from feat/medical-rag-pipeline-11294999718864016196
