feat(rag): expose reasoning trace via public run_with_trace() #39
✅ DCO Check Passed. Thanks @pjmalandrino, all your commits are properly signed off. 🎉
Merge Protections
🟢 All 2 merge protections satisfied, ready to merge.
🟢 Enforce conventional commit: Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
🟢 Require two reviewer for test updates: When test data is updated, we require two reviewers
Codecov Report
✅ All modified and coverable lines are covered by tests.
Add RAGTrace model (query, per_document, final_answer) and a public run_with_trace() method on DoclingRAGAgent returning the full per-iteration trace that was previously discarded inside run().

run() becomes a thin wrapper around run_with_trace(): single source of truth for the loop, no behavior change for existing callers.

RAGTrace, RAGResult and RAGIteration are re-exported at package level so downstream consumers can type-hint against the trace.

Closes docling-project#26

Signed-off-by: Pier-Jean Malandrino <pierjean.malandrino@scub.net>
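For readers skimming the thread, here is a minimal sketch of the wrapper pattern that commit message describes. It is not the code from this PR. Only the names RAGTrace, RAGResult, RAGIteration, run(), run_with_trace() and the three RAGTrace field names come from the message. `SketchRAGAgent`, every other field, the method signatures and the string return type of run() are assumptions made for illustration, and stdlib dataclasses stand in for whatever model base class the project uses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RAGIteration:
    """One pass of the loop. Both fields are hypothetical."""
    index: int
    note: str


@dataclass
class RAGResult:
    """Outcome for one document. All fields are hypothetical."""
    document: str
    answer: str
    iterations: List[RAGIteration] = field(default_factory=list)


@dataclass
class RAGTrace:
    """Field names follow the commit message; the types are assumptions."""
    query: str
    per_document: List[RAGResult]
    final_answer: str


class SketchRAGAgent:
    """Stand-in for DoclingRAGAgent, with a fake loop instead of model calls."""

    def __init__(self, max_iterations: int = 2) -> None:
        self.max_iterations = max_iterations

    def run_with_trace(self, query: str, documents: List[str]) -> RAGTrace:
        # The loop lives here and nowhere else; every iteration is recorded.
        per_document: List[RAGResult] = []
        for doc in documents:
            iterations = [
                RAGIteration(index=i, note=f"searched {doc!r} for {query!r}")
                for i in range(self.max_iterations)
            ]
            per_document.append(
                RAGResult(document=doc, answer=f"answer from {doc}", iterations=iterations)
            )
        final_answer = "; ".join(result.answer for result in per_document)
        return RAGTrace(query=query, per_document=per_document, final_answer=final_answer)

    def run(self, query: str, documents: List[str]) -> str:
        # Thin wrapper: same loop, the trace is dropped, callers see no change.
        return self.run_with_trace(query, documents).final_answer
```

Because run() delegates, a caller that needs only the answer keeps calling run(), while a visualization layer can call run_with_trace() and walk `per_document` and each result's `iterations`.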
32622b4 to f72f877
Hey @PeterStaar-IBM / @ceberam, any update on this? This is necessary for my visualization layer on Docling-Studio.
ceberam left a comment
Thanks @pjmalandrino for suggesting this PR.
I'm happy with the design and implementation of this feature. Actually, we thought of adding a similar tracing capability across all agents that could be exported to a file for debugging purposes. However, I'm fine with this solution for the RAG case for now, especially if it helps with your integration.
…a file

Generalize the run_with_trace pattern from docling-project#39 into a generic, typed AgentTrace that every agent produces, and let the orchestrator compose the traces of the sub-agents it dispatches to into a tree that can be exported to a single JSON file for debugging.

- New AgentStep / AgentTrace value objects (agent_trace.py). AgentTrace carries ordered steps, nested children (sub-agent traces), timing, model id and the produced document on `output` (excluded from serialization; result_name is the persisted pointer). SerializeAsAny on children preserves subclass fields.
- BaseDoclingAgent.run_with_trace(): concrete default so every agent exposes a trace (timing + model + result) with no bespoke code; agents override to add steps. run() stays the source of truth for the document.
- RAGTrace becomes a subclass of AgentTrace, so docling-project#39 is preserved (same fields, same construction, covariant return) and a RAG run nests into the tree. DoclingRAGAgent builds the answer doc onto output; run() returns it.
- DoclingOrchestratorAgent.run_task_with_trace() composes the tree by recording each dispatched sub-agent trace; run_task() is unchanged and incurs no overhead.
- LoggingConfig.trace_path + CLI export.

Public re-exports: AgentTrace, AgentStep. No global state, no logging coupling: the trace is a returned value object.

Design doc: docs/design/37-export-session-trace.md.

Closes docling-project#37

Signed-off-by: Pier-Jean Malandrino <pierjean.malandrino@scub.net>
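The trace tree that commit message outlines could look roughly like the sketch below: an orchestrator records each sub-agent trace as a child, and the in-memory `output` is left out of the exported JSON while `result_name` is kept. This is an illustration, not the contents of agent_trace.py. The mention of SerializeAsAny suggests the real classes are Pydantic models; stdlib dataclasses with a hand-written `to_dict()` are swapped in here to keep the example dependency-free. The field names `agent`, `model_id` and `duration_s` and the helper `run_sub_agent` are hypothetical, and `run_task_with_trace` is a free function here rather than the orchestrator method.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class AgentStep:
    """One ordered step inside an agent run (hypothetical fields)."""
    name: str
    detail: str = ""


@dataclass
class AgentTrace:
    """Trace node: steps, children, timing, model id, result pointer."""
    agent: str
    model_id: Optional[str] = None
    duration_s: float = 0.0
    steps: List[AgentStep] = field(default_factory=list)
    children: List["AgentTrace"] = field(default_factory=list)
    result_name: Optional[str] = None  # persisted pointer to the result
    output: Any = None                 # produced document, in memory only

    def to_dict(self) -> Dict[str, Any]:
        # `output` is deliberately omitted so the export stays small.
        return {
            "agent": self.agent,
            "model_id": self.model_id,
            "duration_s": self.duration_s,
            "steps": [{"name": s.name, "detail": s.detail} for s in self.steps],
            "children": [child.to_dict() for child in self.children],
            "result_name": self.result_name,
        }


def run_sub_agent(name: str) -> AgentTrace:
    """Hypothetical sub-agent: does fake work and returns its own trace."""
    start = time.perf_counter()
    output = {"text": f"document produced by {name}"}
    return AgentTrace(
        agent=name,
        steps=[AgentStep("run", "produced a document")],
        duration_s=time.perf_counter() - start,
        result_name=f"{name}-result",
        output=output,
    )


def run_task_with_trace(sub_agents: List[str]) -> AgentTrace:
    """Compose the tree by recording each dispatched sub-agent trace."""
    start = time.perf_counter()
    root = AgentTrace(agent="orchestrator")
    for name in sub_agents:
        root.steps.append(AgentStep("dispatch", name))
        root.children.append(run_sub_agent(name))
    root.duration_s = time.perf_counter() - start
    return root


tree = run_task_with_trace(["rag", "writer"])
exported = json.dumps(tree.to_dict(), indent=2)  # single JSON document
```

Returning the trace as a plain value keeps the sketch free of global state: whoever calls `run_task_with_trace` decides whether to write `exported` to a file or discard it.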