Skip to content

fix: resolve confirmed tools by direct execution to prevent LLM argument drift#972

Draft
mkoruszowic wants to merge 8 commits into
mainfrom
mk/fix-confirmation-flow-969
Draft

fix: resolve confirmed tools by direct execution to prevent LLM argument drift#972
mkoruszowic wants to merge 8 commits into
mainfrom
mk/fix-confirmation-flow-969

Conversation

@mkoruszowic
Copy link
Copy Markdown
Collaborator

@mkoruszowic mkoruszowic commented Apr 19, 2026

Summary

Closes #969.

The existing confirmation flow asks the LLM to regenerate the tool call on the continuation turn after a user approves it. Because the LLM often produces semantically-identical but textually-different arguments (reordered nested keys, whitespace, rephrased free-text, 1.0 vs 1), the freshly computed confirmation_id hash no longer matches the one the user approved — causing an infinite confirmation loop or a silent termination.

This PR adds two complementary fixes:

1. Tool-name fallback in HookManager (short-term band-aid) — when the exact confirmation_id match fails, fall back to matching by tool_name. Keeps existing flows unblocked while the proper fix rolls out. Will be removed in a follow-up PR once consumers adopt (2).

2. Direct tool execution from the chat layer (proper fix) — instead of the LLM re-emitting the tool call, the chat layer:

  • Persists the pending confirmation (tool_name, arguments, tool_call_id) in ChatContext.state["pending_confirmations"] when it streams a ConfirmationRequest.
  • On the continuation turn, calls the new Agent.execute_tool_directly(...) with the stored original arguments, bypassing LLM regeneration entirely.
  • Injects a synthetic (assistant tool_use, tool result) pair into the same agent's history (mutated in place by resolve_pending_confirmations — no second agent, no extra run), so the LLM continues from the executed result without ever re-deciding the call.
  • For declined confirmations, injects a "❌ User declined this action" tool result so the LLM can respond appropriately.

Because we replay the original arguments verbatim, the PRE_TOOL confirmation hook's confirmation_id hash matches by construction — no special bypass logic needed. Non-confirmation PRE_TOOL hooks (validation, access control) still run as before, and a deny decision still blocks execution.

Chat-layer usage (one agent, minimal glue)

agent = Agent(llm=..., tools=..., hooks=..., history=history)

# Mutates agent.history in place with (tool_use, tool_result) pairs for any
# confirmed/declined entries from the previous turn. Returns UI responses to yield.
for response in await self.resolve_pending_confirmations(agent, context):
    yield response

# Same agent, same run — LLM continues from the injected tool results.
async for response in agent.run_streaming(message, context=agent_context):
    ...
    case ConfirmationRequest():
        # Persist pending so the next turn can resolve it directly.
        yield self.create_state_update(self.create_pending_confirmation_state(response))
        yield ConfirmationRequestResponse(content=...)

New public API

  • Agent.execute_tool_directly(tool_call_id, tool_name, arguments, context) -> ToolCallResult
  • inject_tool_call(history, tool_call_id, tool_name, arguments, result) -> ChatFormat (in ragbits.agents.history)
  • ChatInterface.create_pending_confirmation_state(request) -> dict[str, Any]
  • ChatInterface.resolve_pending_confirmations(agent, context) -> list[ChatResponseUnion] — mutates agent.history in place
  • ConfirmationRequest.tool_call_id field (new, required — only constructed internally by HookManager)

Example

examples/chat/file_explorer_agent.py updated to demonstrate the new pattern using a single Agent instance.

Test plan

  • Unit tests for Agent.execute_tool_directly covering happy path, deny-by-hook, unknown tool, POST_TOOL hook behavior, and prior-confirmation match
  • Unit tests for inject_tool_call covering shape, immutability, and result coercion
  • Unit tests for ChatInterface.create_pending_confirmation_state and resolve_pending_confirmations covering confirmed, declined, no-pending, and unknown-id paths; verify agent.history is mutated in place
  • Regression tests pass for existing HookManager.execute_pre_tool with both new and legacy match paths (260 tests green across ragbits-agents and ragbits-chat)
  • Manual end-to-end test in file_explorer_agent example (reviewer should verify no loop, tool runs once with original args)

Follow-ups (not in this PR)

  • Remove the tool_name fallback from HookManager._find_confirmation once internal consumers migrate to the direct-execution pattern.
  • Consider argument canonicalization (recursive sort_keys, numeric normalization) as defense-in-depth for any remaining hash-matched flows.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 19, 2026

badge

Code Coverage Summary

Details
Filename                                                                                                         Stmts    Miss  Cover    Missing
-------------------------------------------------------------------------------------------------------------  -------  ------  -------  -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
packages/ragbits-agents/src/ragbits/agents/__init__.py                                                               5       0  100.00%
packages/ragbits-agents/src/ragbits/agents/_main.py                                                                578     126  78.20%   58-60, 155-159, 163, 167-169, 172-173, 176-180, 183-184, 225, 315-318, 333-339, 415, 417, 421, 545, 569, 705, 821, 850, 876, 889-907, 921-925, 930, 932, 934, 949-950, 962-967, 1017, 1021, 1045-1053, 1055, 1057-1058, 1060-1061, 1113, 1175, 1181, 1184, 1190-1195, 1246, 1261-1262, 1283-1316, 1335-1368, 1386, 1392, 1397, 1437, 1440, 1444
packages/ragbits-agents/src/ragbits/agents/cli.py                                                                  197     179  9.14%    32-70, 83-106, 117-409, 427-437, 450-483
packages/ragbits-agents/src/ragbits/agents/confirmation.py                                                          13       0  100.00%
packages/ragbits-agents/src/ragbits/agents/exceptions.py                                                            48      16  66.67%   40-42, 51-52, 86-91, 100-103, 114-122
packages/ragbits-agents/src/ragbits/agents/history.py                                                                5       0  100.00%
packages/ragbits-agents/src/ragbits/agents/tool.py                                                                  90      32  64.44%   15, 124, 149-193
packages/ragbits-agents/src/ragbits/agents/types.py                                                                 15       0  100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/__init__.py                                                         5       0  100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/base.py                                                            13       0  100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/confirmation.py                                                     7       0  100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/manager.py                                                         96       2  97.92%   30, 186
packages/ragbits-agents/src/ragbits/agents/hooks/types.py                                                           23       1  95.65%   19
packages/ragbits-agents/src/ragbits/agents/mcp/__init__.py                                                           2       0  100.00%
packages/ragbits-agents/src/ragbits/agents/mcp/server.py                                                           143      14  90.21%   174, 183-184, 281, 332-335, 349, 361, 417-420, 434, 447
packages/ragbits-agents/src/ragbits/agents/mcp/utils.py                                                             13       0  100.00%
packages/ragbits-agents/src/ragbits/agents/tools/__init__.py                                                         4       0  100.00%
packages/ragbits-agents/src/ragbits/agents/tools/memory.py                                                          66       0  100.00%
packages/ragbits-agents/src/ragbits/agents/tools/openai.py                                                          47      10  78.72%   27-29, 46-50, 67-69, 92
packages/ragbits-agents/src/ragbits/agents/tools/planning.py                                                       100      64  36.00%   37-38, 43, 48, 53, 58-59, 70, 75, 79-83, 87, 101-238
packages/ragbits-agents/src/ragbits/agents/tools/types.py                                                            6       0  100.00%
packages/ragbits-agents/tests/__init__.py                                                                            0       0  100.00%
packages/ragbits-agents/tests/unit/__init__.py                                                                       0       0  100.00%
packages/ragbits-agents/tests/unit/conftest.py                                                                      75       1  98.67%   26
packages/ragbits-agents/tests/unit/test_agent.py                                                                   516       1  99.81%   1085
packages/ragbits-agents/tests/unit/test_history.py                                                                  20       0  100.00%
packages/ragbits-agents/tests/unit/hooks/__init__.py                                                                 0       0  100.00%
packages/ragbits-agents/tests/unit/hooks/conftest.py                                                                 5       0  100.00%
packages/ragbits-agents/tests/unit/hooks/test_base.py                                                               23       0  100.00%
packages/ragbits-agents/tests/unit/hooks/test_confirmation.py                                                       22       0  100.00%
packages/ragbits-agents/tests/unit/hooks/test_manager.py                                                           268      11  95.90%   98-99, 342, 367, 372, 393-394, 429, 443, 448, 467
packages/ragbits-agents/tests/unit/mcp/helpers.py                                                                   36       3  91.67%   21, 26, 61
packages/ragbits-agents/tests/unit/mcp/test_caching.py                                                              21       0  100.00%
packages/ragbits-agents/tests/unit/mcp/test_connect_disconnect.py                                                   28       0  100.00%
packages/ragbits-agents/tests/unit/mcp/test_exceptions.py                                                           25       1  96.00%   20
packages/ragbits-agents/tests/unit/mcp/test_mcp_utils.py                                                            41       0  100.00%
packages/ragbits-agents/tests/unit/tools/test_memory.py                                                             94       0  100.00%
packages/ragbits-agents/tests/unit/tools/test_openai.py                                                             65       0  100.00%
packages/ragbits-chat/src/ragbits/chat/__init__.py                                                                   4       0  100.00%
packages/ragbits-chat/src/ragbits/chat/_utils.py                                                                    23       5  78.26%   17, 32, 38-40
packages/ragbits-chat/src/ragbits/chat/api.py                                                                      470     199  57.66%   88-103, 126, 156, 158, 162-183, 214-222, 234-236, 288-302, 359-367, 402, 408, 454-473, 493-501, 542-561, 581, 584-586, 599, 609, 646, 653-658, 663, 700-702, 714, 722-724, 734-736, 757-800, 822, 830-833, 866-881, 923-943, 1013-1066, 1071-1083, 1086-1100, 1103-1122
packages/ragbits-chat/src/ragbits/chat/cli.py                                                                       11       4  63.64%   45-65
packages/ragbits-chat/src/ragbits/chat/metrics.py                                                                   44       0  100.00%
packages/ragbits-chat/src/ragbits/chat/auth/__init__.py                                                              4       0  100.00%
packages/ragbits-chat/src/ragbits/chat/auth/backends.py                                                            211     135  36.02%   191-222, 231-244, 256-270, 297-312, 324-353, 365-414, 429, 444-455, 467-476, 499-506, 510, 514, 526-544, 556-570, 583-589, 601-611
packages/ragbits-chat/src/ragbits/chat/auth/base.py                                                                 30       4  86.67%   46, 59, 72, 85
packages/ragbits-chat/src/ragbits/chat/auth/oauth2_providers.py                                                     35       7  80.00%   35, 40, 45, 50, 55, 60, 72
packages/ragbits-chat/src/ragbits/chat/auth/provider_config.py                                                      18       6  66.67%   38, 69-75
packages/ragbits-chat/src/ragbits/chat/auth/session_store.py                                                        55       5  90.91%   161-162, 176-178
packages/ragbits-chat/src/ragbits/chat/auth/types.py                                                                43       3  93.02%   84, 97, 110
packages/ragbits-chat/src/ragbits/chat/client/__init__.py                                                            4       0  100.00%
packages/ragbits-chat/src/ragbits/chat/client/client.py                                                             46      21  54.35%   29-30, 34, 38, 47-48, 57-59, 63, 72, 90-91, 95, 99, 108-109, 118-119, 123, 132
packages/ragbits-chat/src/ragbits/chat/client/conversation.py                                                      136      13  90.44%   65, 67, 69, 83-84, 92, 95, 98-99, 121, 200, 203, 230
packages/ragbits-chat/src/ragbits/chat/client/exceptions.py                                                          4       0  100.00%
packages/ragbits-chat/src/ragbits/chat/history/__init__.py                                                           0       0  100.00%
packages/ragbits-chat/src/ragbits/chat/history/compressors/__init__.py                                               3       0  100.00%
packages/ragbits-chat/src/ragbits/chat/history/compressors/base.py                                                  10       0  100.00%
packages/ragbits-chat/src/ragbits/chat/history/compressors/llm.py                                                   29       1  96.55%   79
packages/ragbits-chat/src/ragbits/chat/interface/__init__.py                                                         2       0  100.00%
packages/ragbits-chat/src/ragbits/chat/interface/_interface.py                                                     165      17  89.70%   118-119, 166-175, 189, 254-255, 271, 276, 281, 285, 293, 357, 445, 483-484
packages/ragbits-chat/src/ragbits/chat/interface/forms.py                                                           50      13  74.00%   59, 64, 79-98, 117
packages/ragbits-chat/src/ragbits/chat/interface/summary.py                                                         50      23  54.00%   18, 35, 50-52, 56-66, 73-74, 78-82
packages/ragbits-chat/src/ragbits/chat/interface/types.py                                                          267      57  78.65%   124, 134, 148, 171, 223, 232, 239, 248, 257, 464-473, 495-504, 526-535, 547-556, 578-587, 609-618, 630-639, 651-660, 672-681, 693-702, 714-724, 736-745
packages/ragbits-chat/src/ragbits/chat/interface/ui_customization.py                                                21       0  100.00%
packages/ragbits-chat/src/ragbits/chat/persistence/__init__.py                                                       2       0  100.00%
packages/ragbits-chat/src/ragbits/chat/persistence/base.py                                                           7       1  85.71%   29
packages/ragbits-chat/src/ragbits/chat/persistence/sql.py                                                           93       3  96.77%   296-298
packages/ragbits-chat/tests/unit/test_api.py                                                                       232       1  99.57%   268
packages/ragbits-chat/tests/unit/test_chat_client.py                                                               105       2  98.10%   67, 87
packages/ragbits-chat/tests/unit/test_confirmation_resolution.py                                                    62       1  98.39%   19
packages/ragbits-chat/tests/unit/test_conversation.py                                                              122       1  99.18%   64
packages/ragbits-chat/tests/unit/test_error_response.py                                                             52       0  100.00%
packages/ragbits-chat/tests/unit/test_generic_custom_response.py                                                   192       6  96.88%   132, 165, 184, 199, 227, 252
packages/ragbits-chat/tests/unit/test_types.py                                                                      14       0  100.00%
packages/ragbits-chat/tests/unit/test_upload.py                                                                     33       1  96.97%   18
packages/ragbits-chat/tests/unit/auth/test_list_auth_backend.py                                                    251       0  100.00%
packages/ragbits-chat/tests/unit/auth/test_session_store.py                                                         94       0  100.00%
packages/ragbits-chat/tests/unit/history/test_llm_compressor.py                                                     64       0  100.00%
packages/ragbits-chat/tests/unit/persistence/test_sql.py                                                            74       0  100.00%
packages/ragbits-cli/src/ragbits/cli/__init__.py                                                                    35       4  88.57%   80-81, 88-89
packages/ragbits-cli/src/ragbits/cli/_utils.py                                                                      23       4  82.61%   47, 65-67
packages/ragbits-cli/src/ragbits/cli/state.py                                                                       79       3  96.20%   50-51, 61
packages/ragbits-cli/tests/unit/test_state.py                                                                       72       2  97.22%   103-104
packages/ragbits-cli/tests/unit/test_utils.py                                                                       23       0  100.00%
packages/ragbits-core/src/ragbits/core/__init__.py                                                                  16       4  75.00%   20-21, 25-26
packages/ragbits-core/src/ragbits/core/cli.py                                                                        6       0  100.00%
packages/ragbits-core/src/ragbits/core/options.py                                                                   17       0  100.00%
packages/ragbits-core/src/ragbits/core/types.py                                                                      9       0  100.00%
packages/ragbits-core/src/ragbits/core/audit/__init__.py                                                             5       0  100.00%
packages/ragbits-core/src/ragbits/core/audit/metrics/__init__.py                                                    30      14  53.33%   39-56, 64
packages/ragbits-core/src/ragbits/core/audit/metrics/base.py                                                        49       0  100.00%
packages/ragbits-core/src/ragbits/core/audit/traces/__init__.py                                                     80       9  88.75%   49-52, 55-58, 66-69
packages/ragbits-core/src/ragbits/core/audit/traces/base.py                                                        187      60  67.91%   156-165, 178-179, 200, 215-216, 220, 230, 233-234, 249, 256, 258-260, 266-268, 275-278, 338-349, 356-364, 378, 394-413
packages/ragbits-core/src/ragbits/core/audit/traces/cli.py                                                         133      29  78.20%   89-94, 113-140, 157, 164, 173-174, 177-178
packages/ragbits-core/src/ragbits/core/embeddings/__init__.py                                                        4       0  100.00%
packages/ragbits-core/src/ragbits/core/embeddings/base.py                                                           32       5  84.38%   20-21, 24, 77, 92
packages/ragbits-core/src/ragbits/core/embeddings/exceptions.py                                                     17       7  58.82%   7-8, 17, 26-27, 36, 45
packages/ragbits-core/src/ragbits/core/embeddings/dense/__init__.py                                                  4       0  100.00%
packages/ragbits-core/src/ragbits/core/embeddings/dense/base.py                                                      9       1  88.89%   44
packages/ragbits-core/src/ragbits/core/embeddings/dense/fastembed.py                                                35       3  91.43%   34, 62-63
packages/ragbits-core/src/ragbits/core/embeddings/dense/litellm.py                                                  58      12  79.31%   19, 134-139, 142, 146-148, 169
packages/ragbits-core/src/ragbits/core/embeddings/dense/local.py                                                    32       5  84.38%   13-14, 52, 68-69
packages/ragbits-core/src/ragbits/core/embeddings/dense/noop.py                                                     32       1  96.88%   99
packages/ragbits-core/src/ragbits/core/embeddings/dense/vertex_multimodal.py                                        60      24  60.00%   13-14, 57, 62, 102-123, 139-148, 175, 194-198
packages/ragbits-core/src/ragbits/core/embeddings/sparse/__init__.py                                                 4       0  100.00%
packages/ragbits-core/src/ragbits/core/embeddings/sparse/bag_of_tokens.py                                           43       1  97.67%   53
packages/ragbits-core/src/ragbits/core/embeddings/sparse/base.py                                                    12       1  91.67%   48
packages/ragbits-core/src/ragbits/core/embeddings/sparse/fastembed.py                                               31       2  93.55%   25, 52
packages/ragbits-core/src/ragbits/core/llms/__init__.py                                                              4       0  100.00%
packages/ragbits-core/src/ragbits/core/llms/base.py                                                                261      23  91.19%   163-170, 173-181, 188-192, 251, 253-256, 287, 318, 499
packages/ragbits-core/src/ragbits/core/llms/exceptions.py                                                           29       6  79.31%   17, 26-27, 36, 45, 63
packages/ragbits-core/src/ragbits/core/llms/factory.py                                                              12       2  83.33%   30, 51
packages/ragbits-core/src/ragbits/core/llms/litellm.py                                                             242     108  55.37%   28, 141, 159-160, 197, 226, 248, 292-407, 440, 468, 472-477, 506-515, 525-573, 583, 612
packages/ragbits-core/src/ragbits/core/llms/local.py                                                               111      37  66.67%   14, 69, 79-80, 94-95, 101, 107, 119-120, 212-279, 294-295
packages/ragbits-core/src/ragbits/core/llms/mock.py                                                                 50       2  96.00%   126, 130
packages/ragbits-core/src/ragbits/core/prompt/__init__.py                                                            2       0  100.00%
packages/ragbits-core/src/ragbits/core/prompt/_cli.py                                                               53      22  58.49%   37-45, 59-61, 69-80, 88-90, 102-110
packages/ragbits-core/src/ragbits/core/prompt/base.py                                                               45       1  97.78%   26
packages/ragbits-core/src/ragbits/core/prompt/discovery.py                                                          36       2  94.44%   55-56
packages/ragbits-core/src/ragbits/core/prompt/exceptions.py                                                         13       1  92.31%   17
packages/ragbits-core/src/ragbits/core/prompt/parsers.py                                                            35       0  100.00%
packages/ragbits-core/src/ragbits/core/prompt/prompt.py                                                            189       7  96.30%   105-107, 178, 181, 257, 361
packages/ragbits-core/src/ragbits/core/sources/__init__.py                                                          10       0  100.00%
packages/ragbits-core/src/ragbits/core/sources/azure.py                                                             95      13  86.32%   65-66, 92-102, 189-190
packages/ragbits-core/src/ragbits/core/sources/base.py                                                              74       3  95.95%   46, 185-186
packages/ragbits-core/src/ragbits/core/sources/exceptions.py                                                        16       0  100.00%
packages/ragbits-core/src/ragbits/core/sources/gcs.py                                                               63       0  100.00%
packages/ragbits-core/src/ragbits/core/sources/git.py                                                               94       3  96.81%   188, 195, 211
packages/ragbits-core/src/ragbits/core/sources/google_drive.py                                                     285     169  40.70%   109-112, 128-143, 157-180, 187, 198-217, 263, 276-282, 298-333, 346-406, 416-473, 490-509, 536, 539-542, 545-552, 555-558, 575-576, 583-585, 589-593
packages/ragbits-core/src/ragbits/core/sources/hf.py                                                                72      17  76.39%   55-58, 62-63, 83-86, 106-110, 138, 145-146
packages/ragbits-core/src/ragbits/core/sources/local.py                                                             41       2  95.12%   39, 80
packages/ragbits-core/src/ragbits/core/sources/s3.py                                                               105      17  83.81%   54-57, 75, 88-93, 117, 128-131, 162, 179
packages/ragbits-core/src/ragbits/core/sources/web.py                                                               41       1  97.56%   75
packages/ragbits-core/src/ragbits/core/utils/__init__.py                                                             2       0  100.00%
packages/ragbits-core/src/ragbits/core/utils/_pyproject.py                                                          38       1  97.37%   113
packages/ragbits-core/src/ragbits/core/utils/config_handling.py                                                     79       9  88.61%   17, 55-56, 63-64, 133, 163-165
packages/ragbits-core/src/ragbits/core/utils/decorators.py                                                          29       0  100.00%
packages/ragbits-core/src/ragbits/core/utils/dict_transformations.py                                               143      35  75.52%   24, 27, 80, 90, 110-115, 126-133, 147-151, 166-167, 173, 185-191, 195, 254
packages/ragbits-core/src/ragbits/core/utils/function_schema.py                                                     90      19  78.89%   105, 113-127, 134-147, 160, 205, 210, 213-215
packages/ragbits-core/src/ragbits/core/utils/helpers.py                                                             11       0  100.00%
packages/ragbits-core/src/ragbits/core/utils/lazy_litellm.py                                                        30       1  96.67%   38
packages/ragbits-core/src/ragbits/core/utils/pydantic.py                                                            13       2  84.62%   13, 16
packages/ragbits-core/src/ragbits/core/utils/secrets.py                                                             18       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/__init__.py                                                     3       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/_cli.py                                                        50       4  92.00%   67, 89, 95, 119
packages/ragbits-core/src/ragbits/core/vector_stores/base.py                                                       103       3  97.09%   53, 214, 286
packages/ragbits-core/src/ragbits/core/vector_stores/chroma.py                                                      91       2  97.80%   74, 112
packages/ragbits-core/src/ragbits/core/vector_stores/hybrid.py                                                      34       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/hybrid_strategies.py                                           65       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/in_memory.py                                                   59       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/pgvector.py                                                   190      15  92.11%   97, 106-109, 125, 168, 312-313, 338-340, 373-375
packages/ragbits-core/src/ragbits/core/vector_stores/qdrant.py                                                      97       5  94.85%   80-95, 160, 181
packages/ragbits-core/src/ragbits/core/vector_stores/weaviate.py                                                   127       5  96.06%   104-132, 271
packages/ragbits-core/tests/conftest.py                                                                             12       0  100.00%
packages/ragbits-core/tests/cli/__init__.py                                                                          0       0  100.00%
packages/ragbits-core/tests/cli/test_cli_trace_handler.py                                                           47       3  93.62%   29, 42, 55
packages/ragbits-core/tests/cli/test_vector_store.py                                                               115       0  100.00%
packages/ragbits-core/tests/integration/sources/test_git.py                                                         68       6  91.18%   147-156
packages/ragbits-core/tests/integration/sources/test_hf.py                                                          19       9  52.63%   16-21, 32-37
packages/ragbits-core/tests/integration/sources/test_s3.py                                                          42       0  100.00%
packages/ragbits-core/tests/integration/vector_stores/__init__.py                                                    0       0  100.00%
packages/ragbits-core/tests/integration/vector_stores/test_keyword_search.py                                        79       0  100.00%
packages/ragbits-core/tests/integration/vector_stores/test_vector_store.py                                         140       1  99.29%   51
packages/ragbits-core/tests/integration/vector_stores/test_vector_store_sparse.py                                   63       0  100.00%
packages/ragbits-core/tests/unit/__init__.py                                                                         0       0  100.00%
packages/ragbits-core/tests/unit/test_options.py                                                                    21       0  100.00%
packages/ragbits-core/tests/unit/audit/test_cli.py                                                                 107       0  100.00%
packages/ragbits-core/tests/unit/audit/test_metrics.py                                                              35       7  80.00%   14-19, 23
packages/ragbits-core/tests/unit/audit/test_trace.py                                                                98       3  96.94%   17, 20, 23
packages/ragbits-core/tests/unit/embeddings/test_bag_of_tokens.py                                                   52       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_fastembed.py                                                       50       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_from_config.py                                                     39       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_litellm.py                                                         84       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_local.py                                                           42       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_noop.py                                                            26       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_vector_size.py                                                     33       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_vertex_multimodal.py                                               39       0  100.00%
packages/ragbits-core/tests/unit/llms/__init__.py                                                                    0       0  100.00%
packages/ragbits-core/tests/unit/llms/test_base.py                                                                 196       3  98.47%   77-80
packages/ragbits-core/tests/unit/llms/test_from_config.py                                                           27       0  100.00%
packages/ragbits-core/tests/unit/llms/test_litellm.py                                                              216       3  98.61%   170-173
packages/ragbits-core/tests/unit/llms/test_local.py                                                                 74       0  100.00%
packages/ragbits-core/tests/unit/llms/factory/__init__.py                                                            0       0  100.00%
packages/ragbits-core/tests/unit/llms/factory/test_get_preferred_llm.py                                             12       0  100.00%
packages/ragbits-core/tests/unit/prompts/__init__.py                                                                 0       0  100.00%
packages/ragbits-core/tests/unit/prompts/test_parsers.py                                                            65       0  100.00%
packages/ragbits-core/tests/unit/prompts/test_prompt.py                                                            334       1  99.70%   777
packages/ragbits-core/tests/unit/prompts/discovery/__init__.py                                                       0       0  100.00%
packages/ragbits-core/tests/unit/prompts/discovery/prompt_classes_for_tests.py                                      30       0  100.00%
packages/ragbits-core/tests/unit/prompts/discovery/test_prompt_discovery.py                                         18       0  100.00%
packages/ragbits-core/tests/unit/prompts/discovery/ragbits_tests_pkg_with_prompts/__init__.py                        2       1  50.00%   3
packages/ragbits-core/tests/unit/prompts/discovery/ragbits_tests_pkg_with_prompts/prompts/__init__.py                3       2  33.33%   2-4
packages/ragbits-core/tests/unit/prompts/discovery/ragbits_tests_pkg_with_prompts/prompts/temp_prompt1.py           14       0  100.00%
packages/ragbits-core/tests/unit/prompts/discovery/ragbits_tests_pkg_with_prompts/prompts/temp_prompt2.py           14       0  100.00%
packages/ragbits-core/tests/unit/sources/test_aws.py                                                                23       0  100.00%
packages/ragbits-core/tests/unit/sources/test_azure.py                                                              70       0  100.00%
packages/ragbits-core/tests/unit/sources/test_exceptions.py                                                         22       0  100.00%
packages/ragbits-core/tests/unit/sources/test_gcs.py                                                                33       6  81.82%   42-47
packages/ragbits-core/tests/unit/sources/test_git.py                                                               110       0  100.00%
packages/ragbits-core/tests/unit/sources/test_google_drive.py                                                      135      50  62.96%   27-32, 50, 64-102, 187-227
packages/ragbits-core/tests/unit/sources/test_hf.py                                                                 12       0  100.00%
packages/ragbits-core/tests/unit/sources/test_local.py                                                              13       0  100.00%
packages/ragbits-core/tests/unit/sources/test_source_discriminator.py                                               36       0  100.00%
packages/ragbits-core/tests/unit/sources/test_web.py                                                                43       0  100.00%
packages/ragbits-core/tests/unit/utils/__init__.py                                                                   0       0  100.00%
packages/ragbits-core/tests/unit/utils/test_config_handling.py                                                      76       2  97.37%   27-28
packages/ragbits-core/tests/unit/utils/test_decorators.py                                                           26       2  92.31%   17, 39
packages/ragbits-core/tests/unit/utils/test_dict_transformations.py                                                 98       0  100.00%
packages/ragbits-core/tests/unit/utils/test_function_schema.py                                                      16       2  87.50%   19, 32
packages/ragbits-core/tests/unit/utils/test_helpers.py                                                               6       0  100.00%
packages/ragbits-core/tests/unit/utils/test_secrets.py                                                              24       0  100.00%
packages/ragbits-core/tests/unit/utils/pyproject/test_find.py                                                       13       0  100.00%
packages/ragbits-core/tests/unit/utils/pyproject/test_get_config.py                                                  9       0  100.00%
packages/ragbits-core/tests/unit/utils/pyproject/test_get_instace.py                                                37       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_base.py                                                          6       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_chroma.py                                                       81       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_from_config.py                                                  55       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_hybrid.py                                                       74       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_hybrid_strategies.py                                            31       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_in_memory.py                                                   102       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_pgvector.py                                                    262       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_qdrant.py                                                      100       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_weaviate.py                                                    142       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/__init__.py                                             2       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/_main.py                                               91       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/cli.py                                                 40       2  95.00%   86, 105
packages/ragbits-document-search/src/ragbits/document_search/documents/__init__.py                                   0       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/documents/document.py                                  78       2  97.44%   49, 93
packages/ragbits-document-search/src/ragbits/document_search/documents/element.py                                   86      14  83.72%   97, 115, 179-187, 197, 206-208
packages/ragbits-document-search/src/ragbits/document_search/ingestion/__init__.py                                   0       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/__init__.py                         4       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/base.py                            21       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/exceptions.py                      14       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/image.py                           30       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/router.py                          25       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/__init__.py                           3       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/base.py                              28       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/docling.py                           48       4  91.67%   12-13, 100, 161
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/exceptions.py                        14       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/router.py                            27       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/unstructured.py                      66      24  63.64%   102, 121-123, 135-156, 176-190, 212-213, 233-248
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/__init__.py                      8       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/callbacks.py                    10       1  90.00%   32
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/exceptions.py                   16      10  37.50%   25-33, 49-52
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/hyperlink_callback.py           38      12  68.42%   44-69, 72, 81, 84
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/metadata_callback.py            29       9  68.97%   52-71, 74
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/parser.py                       43       6  86.05%   60-62, 71-73
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/speaker_notes_callback.py       31      13  58.06%   41-68, 71
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/__init__.py                        5       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/base.py                          102      21  79.41%   152-156, 212-242, 284
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/batched.py                        69       8  88.41%   172, 200-215, 255-256
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/ray.py                            32       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/sequential.py                      4       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/__init__.py                                   0       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rephrasers/__init__.py                        4       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rephrasers/base.py                           14       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rephrasers/llm.py                            40       5  87.50%   51, 115-118
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rephrasers/noop.py                            8       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/__init__.py                         3       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/answerai.py                        29       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/base.py                            19       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/litellm.py                         27       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/llm.py                             59       1  98.31%   173
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/noop.py                            10       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/rrf.py                             28       2  92.86%   50, 60
packages/ragbits-document-search/tests/cli/custom_cli_source.py                                                     22       1  95.45%   32
packages/ragbits-document-search/tests/cli/test_ingest.py                                                           56       0  100.00%
packages/ragbits-document-search/tests/cli/test_search.py                                                           71       0  100.00%
packages/ragbits-document-search/tests/integration/__init__.py                                                       0       0  100.00%
packages/ragbits-document-search/tests/integration/test_docling.py                                                  10       0  100.00%
packages/ragbits-document-search/tests/integration/test_pptx_parser.py                                              54       9  83.33%   32-34, 52, 71, 74-75, 78-79
packages/ragbits-document-search/tests/integration/test_rerankers.py                                                32       9  71.88%   32-39, 59-64
packages/ragbits-document-search/tests/integration/test_unstructured.py                                             12       4  66.67%   62-67
packages/ragbits-document-search/tests/unit/test_config.py                                                          63       0  100.00%
packages/ragbits-document-search/tests/unit/test_document_parser_router.py                                          24       0  100.00%
packages/ragbits-document-search/tests/unit/test_document_parsers.py                                                47       0  100.00%
packages/ragbits-document-search/tests/unit/test_document_search.py                                                238       1  99.58%   480
packages/ragbits-document-search/tests/unit/test_document_search_ingest_errors.py                                   38       0  100.00%
packages/ragbits-document-search/tests/unit/test_documents.py                                                       13       0  100.00%
packages/ragbits-document-search/tests/unit/test_element_enricher_router.py                                         23       0  100.00%
packages/ragbits-document-search/tests/unit/test_element_enrichers.py                                               56       1  98.21%   25
packages/ragbits-document-search/tests/unit/test_elements.py                                                        21       0  100.00%
packages/ragbits-document-search/tests/unit/test_ingest_strategies.py                                               43       0  100.00%
packages/ragbits-document-search/tests/unit/test_llm_reranker.py                                                    43       0  100.00%
packages/ragbits-document-search/tests/unit/test_rephrasers.py                                                      26       0  100.00%
packages/ragbits-document-search/tests/unit/test_rerankers.py                                                       80       1  98.75%   25
packages/ragbits-document-search/tests/unit/testprojects/project_with_instance_factory/__init__.py                   0       0  100.00%
packages/ragbits-document-search/tests/unit/testprojects/project_with_instance_factory/factories.py                 22       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/__init__.py                                                           0       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/cli.py                                                               46       3  93.48%   133, 135, 137
packages/ragbits-evaluate/src/ragbits/evaluate/evaluator.py                                                         92       1  98.91%   221
packages/ragbits-evaluate/src/ragbits/evaluate/optimizer.py                                                         92      18  80.43%   162-168, 187, 190-191, 194, 198-204, 207-210
packages/ragbits-evaluate/src/ragbits/evaluate/utils.py                                                             58      37  36.21%   31-50, 62-69, 98-101, 117-129, 140-149, 159-160
packages/ragbits-evaluate/src/ragbits/evaluate/dataloaders/__init__.py                                               2       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/dataloaders/base.py                                                  34       4  88.24%   58-60, 79
packages/ragbits-evaluate/src/ragbits/evaluate/dataloaders/document_search.py                                       13       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/dataloaders/exceptions.py                                            10       5  50.00%   10-12, 21-25
packages/ragbits-evaluate/src/ragbits/evaluate/metrics/__init__.py                                                   2       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/metrics/base.py                                                      27       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/metrics/document_search.py                                           23       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/pipelines/__init__.py                                                17       3  82.35%   19-20, 36
packages/ragbits-evaluate/src/ragbits/evaluate/pipelines/base.py                                                    24       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/pipelines/document_search.py                                         38       6  84.21%   68-71, 80-84
packages/ragbits-evaluate/tests/cli/test_run_evaluation.py                                                          25       0  100.00%
packages/ragbits-evaluate/tests/unit/test_evaluator.py                                                             103       0  100.00%
packages/ragbits-evaluate/tests/unit/test_metrics.py                                                                77       0  100.00%
packages/ragbits-evaluate/tests/unit/test_optimizer.py                                                              68       0  100.00%
packages/ragbits-guardrails/src/ragbits/guardrails/__init__.py                                                       0       0  100.00%
packages/ragbits-guardrails/src/ragbits/guardrails/base.py                                                          15       0  100.00%
packages/ragbits-guardrails/src/ragbits/guardrails/openai_moderation.py                                             19       6  68.42%   27-34
packages/ragbits-guardrails/tests/unit/test_openai_moderation.py                                                    35       0  100.00%
TOTAL                                                                                                            17124    2062  87.96%

Diff against main

Filename                                                                                     Stmts    Miss  Cover
-----------------------------------------------------------------------------------------  -------  ------  --------
packages/ragbits-agents/src/ragbits/agents/__init__.py                                          +1       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/_main.py                                            +11      +1  +0.25%
packages/ragbits-agents/src/ragbits/agents/confirmation.py                                      +2       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/history.py                                           +5       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/manager.py                                    +10       0  +0.25%
packages/ragbits-agents/src/ragbits/agents/tools/__init__.py                                    +2       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/tools/memory.py                                     +66       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/tools/planning.py                                  +100     +64  +36.00%
packages/ragbits-agents/tests/unit/test_agent.py                                               +30       0  +0.02%
packages/ragbits-agents/tests/unit/test_history.py                                             +20       0  +100.00%
packages/ragbits-agents/tests/unit/hooks/test_manager.py                                       +29       0  +0.50%
packages/ragbits-agents/tests/unit/tools/test_memory.py                                        +94       0  +100.00%
packages/ragbits-chat/src/ragbits/chat/interface/_interface.py                                 +33      +1  +1.82%
packages/ragbits-chat/src/ragbits/chat/interface/types.py                                      +12      +5  -0.96%
packages/ragbits-chat/src/ragbits/chat/interface/ui_customization.py                            +2       0  +100.00%
packages/ragbits-chat/tests/unit/test_confirmation_resolution.py                               +62      +1  +98.39%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/docling.py        0      +2  -4.16%
packages/ragbits-evaluate/src/ragbits/evaluate/pipelines/__init__.py                            +6      +2  -8.56%
TOTAL                                                                                         +485     +76  -0.10%

Results for commit: 1b3ae55

Minimum allowed coverage is 60%

♻️ This comment has been updated with latest results

@mkoruszowic mkoruszowic marked this pull request as draft April 19, 2026 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: agent hooks regarding tool confirmation

1 participant