fix: resolve confirmed tools by direct execution to prevent LLM argument drift by mkoruszowic · Pull Request #972 · deepsense-ai/ragbits

mkoruszowic · 2026-04-19T15:56:14Z

Summary

Closes #969.

The existing confirmation flow asks the LLM to regenerate the tool call on the continuation turn after a user approves it. Because the LLM often produces semantically-identical but textually-different arguments (reordered nested keys, whitespace, rephrased free-text, 1.0 vs 1), the freshly computed confirmation_id hash no longer matches the one the user approved — causing an infinite confirmation loop or a silent termination.

This PR adds two complementary fixes:

1. Tool-name fallback in HookManager (short-term band-aid) — when the exact confirmation_id match fails, fall back to matching by tool_name. Keeps existing flows unblocked while the proper fix rolls out. Will be removed in a follow-up PR once consumers adopt (2).

2. Direct tool execution from the chat layer (proper fix) — instead of the LLM re-emitting the tool call, the chat layer:

Persists the pending confirmation (tool_name, arguments, tool_call_id) in ChatContext.state["pending_confirmations"] when it streams a ConfirmationRequest.
On the continuation turn, calls the new Agent.execute_tool_directly(...) with the stored original arguments, bypassing LLM regeneration entirely.
Injects a synthetic (assistant tool_use, tool result) pair into the same agent's history (mutated in place by resolve_pending_confirmations — no second agent, no extra run), so the LLM continues from the executed result without ever re-deciding the call.
For declined confirmations, injects a "❌ User declined this action" tool result so the LLM can respond appropriately.

Because we replay the original arguments verbatim, the PRE_TOOL confirmation hook's confirmation_id hash matches by construction — no special bypass logic needed. Non-confirmation PRE_TOOL hooks (validation, access control) still run as before, and a deny decision still blocks execution.

Chat-layer usage (one agent, minimal glue)

agent = Agent(llm=..., tools=..., hooks=..., history=history)

# Mutates agent.history in place with (tool_use, tool_result) pairs for any
# confirmed/declined entries from the previous turn. Returns UI responses to yield.
for response in await self.resolve_pending_confirmations(agent, context):
    yield response

# Same agent, same run — LLM continues from the injected tool results.
async for response in agent.run_streaming(message, context=agent_context):
    ...
    case ConfirmationRequest():
        # Persist pending so the next turn can resolve it directly.
        yield self.create_state_update(self.create_pending_confirmation_state(response))
        yield ConfirmationRequestResponse(content=...)

New public API

Agent.execute_tool_directly(tool_call_id, tool_name, arguments, context) -> ToolCallResult
inject_tool_call(history, tool_call_id, tool_name, arguments, result) -> ChatFormat (in ragbits.agents.history)
ChatInterface.create_pending_confirmation_state(request) -> dict[str, Any]
ChatInterface.resolve_pending_confirmations(agent, context) -> list[ChatResponseUnion] — mutates agent.history in place
ConfirmationRequest.tool_call_id field (new, required — only constructed internally by HookManager)

Example

examples/chat/file_explorer_agent.py updated to demonstrate the new pattern using a single Agent instance.

Test plan

Unit tests for Agent.execute_tool_directly covering happy path, deny-by-hook, unknown tool, POST_TOOL hook behavior, and prior-confirmation match
Unit tests for inject_tool_call covering shape, immutability, and result coercion
Unit tests for ChatInterface.create_pending_confirmation_state and resolve_pending_confirmations covering confirmed, declined, no-pending, and unknown-id paths; verify agent.history is mutated in place
Regression tests pass for existing HookManager.execute_pre_tool with both new and legacy match paths (260 tests green across ragbits-agents and ragbits-chat)
Manual end-to-end test in file_explorer_agent example (reviewer should verify no loop, tool runs once with original args)

Follow-ups (not in this PR)

Remove the tool_name fallback from HookManager._find_confirmation once internal consumers migrate to the direct-execution pattern.
Consider argument canonicalization (recursive sort_keys, numeric normalization) as defense-in-depth for any remaining hash-matched flows.

…me fallback

…tion resume

…ory injection

…s via direct tool execution

…ia direct execution

github-actions · 2026-04-19T16:02:34Z

Code Coverage Summary

Details

Filename                                                                                                         Stmts    Miss  Cover    Missing
-------------------------------------------------------------------------------------------------------------  -------  ------  -------  -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
packages/ragbits-agents/src/ragbits/agents/__init__.py                                                               5       0  100.00%
packages/ragbits-agents/src/ragbits/agents/_main.py                                                                578     126  78.20%   58-60, 155-159, 163, 167-169, 172-173, 176-180, 183-184, 225, 315-318, 333-339, 415, 417, 421, 545, 569, 705, 821, 850, 876, 889-907, 921-925, 930, 932, 934, 949-950, 962-967, 1017, 1021, 1045-1053, 1055, 1057-1058, 1060-1061, 1113, 1175, 1181, 1184, 1190-1195, 1246, 1261-1262, 1283-1316, 1335-1368, 1386, 1392, 1397, 1437, 1440, 1444
packages/ragbits-agents/src/ragbits/agents/cli.py                                                                  197     179  9.14%    32-70, 83-106, 117-409, 427-437, 450-483
packages/ragbits-agents/src/ragbits/agents/confirmation.py                                                          13       0  100.00%
packages/ragbits-agents/src/ragbits/agents/exceptions.py                                                            48      16  66.67%   40-42, 51-52, 86-91, 100-103, 114-122
packages/ragbits-agents/src/ragbits/agents/history.py                                                                5       0  100.00%
packages/ragbits-agents/src/ragbits/agents/tool.py                                                                  90      32  64.44%   15, 124, 149-193
packages/ragbits-agents/src/ragbits/agents/types.py                                                                 15       0  100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/__init__.py                                                         5       0  100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/base.py                                                            13       0  100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/confirmation.py                                                     7       0  100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/manager.py                                                         96       2  97.92%   30, 186
packages/ragbits-agents/src/ragbits/agents/hooks/types.py                                                           23       1  95.65%   19
packages/ragbits-agents/src/ragbits/agents/mcp/__init__.py                                                           2       0  100.00%
packages/ragbits-agents/src/ragbits/agents/mcp/server.py                                                           143      14  90.21%   174, 183-184, 281, 332-335, 349, 361, 417-420, 434, 447
packages/ragbits-agents/src/ragbits/agents/mcp/utils.py                                                             13       0  100.00%
packages/ragbits-agents/src/ragbits/agents/tools/__init__.py                                                         4       0  100.00%
packages/ragbits-agents/src/ragbits/agents/tools/memory.py                                                          66       0  100.00%
packages/ragbits-agents/src/ragbits/agents/tools/openai.py                                                          47      10  78.72%   27-29, 46-50, 67-69, 92
packages/ragbits-agents/src/ragbits/agents/tools/planning.py                                                       100      64  36.00%   37-38, 43, 48, 53, 58-59, 70, 75, 79-83, 87, 101-238
packages/ragbits-agents/src/ragbits/agents/tools/types.py                                                            6       0  100.00%
packages/ragbits-agents/tests/__init__.py                                                                            0       0  100.00%
packages/ragbits-agents/tests/unit/__init__.py                                                                       0       0  100.00%
packages/ragbits-agents/tests/unit/conftest.py                                                                      75       1  98.67%   26
packages/ragbits-agents/tests/unit/test_agent.py                                                                   516       1  99.81%   1085
packages/ragbits-agents/tests/unit/test_history.py                                                                  20       0  100.00%
packages/ragbits-agents/tests/unit/hooks/__init__.py                                                                 0       0  100.00%
packages/ragbits-agents/tests/unit/hooks/conftest.py                                                                 5       0  100.00%
packages/ragbits-agents/tests/unit/hooks/test_base.py                                                               23       0  100.00%
packages/ragbits-agents/tests/unit/hooks/test_confirmation.py                                                       22       0  100.00%
packages/ragbits-agents/tests/unit/hooks/test_manager.py                                                           268      11  95.90%   98-99, 342, 367, 372, 393-394, 429, 443, 448, 467
packages/ragbits-agents/tests/unit/mcp/helpers.py                                                                   36       3  91.67%   21, 26, 61
packages/ragbits-agents/tests/unit/mcp/test_caching.py                                                              21       0  100.00%
packages/ragbits-agents/tests/unit/mcp/test_connect_disconnect.py                                                   28       0  100.00%
packages/ragbits-agents/tests/unit/mcp/test_exceptions.py                                                           25       1  96.00%   20
packages/ragbits-agents/tests/unit/mcp/test_mcp_utils.py                                                            41       0  100.00%
packages/ragbits-agents/tests/unit/tools/test_memory.py                                                             94       0  100.00%
packages/ragbits-agents/tests/unit/tools/test_openai.py                                                             65       0  100.00%
packages/ragbits-chat/src/ragbits/chat/__init__.py                                                                   4       0  100.00%
packages/ragbits-chat/src/ragbits/chat/_utils.py                                                                    23       5  78.26%   17, 32, 38-40
packages/ragbits-chat/src/ragbits/chat/api.py                                                                      470     199  57.66%   88-103, 126, 156, 158, 162-183, 214-222, 234-236, 288-302, 359-367, 402, 408, 454-473, 493-501, 542-561, 581, 584-586, 599, 609, 646, 653-658, 663, 700-702, 714, 722-724, 734-736, 757-800, 822, 830-833, 866-881, 923-943, 1013-1066, 1071-1083, 1086-1100, 1103-1122
packages/ragbits-chat/src/ragbits/chat/cli.py                                                                       11       4  63.64%   45-65
packages/ragbits-chat/src/ragbits/chat/metrics.py                                                                   44       0  100.00%
packages/ragbits-chat/src/ragbits/chat/auth/__init__.py                                                              4       0  100.00%
packages/ragbits-chat/src/ragbits/chat/auth/backends.py                                                            211     135  36.02%   191-222, 231-244, 256-270, 297-312, 324-353, 365-414, 429, 444-455, 467-476, 499-506, 510, 514, 526-544, 556-570, 583-589, 601-611
packages/ragbits-chat/src/ragbits/chat/auth/base.py                                                                 30       4  86.67%   46, 59, 72, 85
packages/ragbits-chat/src/ragbits/chat/auth/oauth2_providers.py                                                     35       7  80.00%   35, 40, 45, 50, 55, 60, 72
packages/ragbits-chat/src/ragbits/chat/auth/provider_config.py                                                      18       6  66.67%   38, 69-75
packages/ragbits-chat/src/ragbits/chat/auth/session_store.py                                                        55       5  90.91%   161-162, 176-178
packages/ragbits-chat/src/ragbits/chat/auth/types.py                                                                43       3  93.02%   84, 97, 110
packages/ragbits-chat/src/ragbits/chat/client/__init__.py                                                            4       0  100.00%
packages/ragbits-chat/src/ragbits/chat/client/client.py                                                             46      21  54.35%   29-30, 34, 38, 47-48, 57-59, 63, 72, 90-91, 95, 99, 108-109, 118-119, 123, 132
packages/ragbits-chat/src/ragbits/chat/client/conversation.py                                                      136      13  90.44%   65, 67, 69, 83-84, 92, 95, 98-99, 121, 200, 203, 230
packages/ragbits-chat/src/ragbits/chat/client/exceptions.py                                                          4       0  100.00%
packages/ragbits-chat/src/ragbits/chat/history/__init__.py                                                           0       0  100.00%
packages/ragbits-chat/src/ragbits/chat/history/compressors/__init__.py                                               3       0  100.00%
packages/ragbits-chat/src/ragbits/chat/history/compressors/base.py                                                  10       0  100.00%
packages/ragbits-chat/src/ragbits/chat/history/compressors/llm.py                                                   29       1  96.55%   79
packages/ragbits-chat/src/ragbits/chat/interface/__init__.py                                                         2       0  100.00%
packages/ragbits-chat/src/ragbits/chat/interface/_interface.py                                                     165      17  89.70%   118-119, 166-175, 189, 254-255, 271, 276, 281, 285, 293, 357, 445, 483-484
packages/ragbits-chat/src/ragbits/chat/interface/forms.py                                                           50      13  74.00%   59, 64, 79-98, 117
packages/ragbits-chat/src/ragbits/chat/interface/summary.py                                                         50      23  54.00%   18, 35, 50-52, 56-66, 73-74, 78-82
packages/ragbits-chat/src/ragbits/chat/interface/types.py                                                          267      57  78.65%   124, 134, 148, 171, 223, 232, 239, 248, 257, 464-473, 495-504, 526-535, 547-556, 578-587, 609-618, 630-639, 651-660, 672-681, 693-702, 714-724, 736-745
packages/ragbits-chat/src/ragbits/chat/interface/ui_customization.py                                                21       0  100.00%
packages/ragbits-chat/src/ragbits/chat/persistence/__init__.py                                                       2       0  100.00%
packages/ragbits-chat/src/ragbits/chat/persistence/base.py                                                           7       1  85.71%   29
packages/ragbits-chat/src/ragbits/chat/persistence/sql.py                                                           93       3  96.77%   296-298
packages/ragbits-chat/tests/unit/test_api.py                                                                       232       1  99.57%   268
packages/ragbits-chat/tests/unit/test_chat_client.py                                                               105       2  98.10%   67, 87
packages/ragbits-chat/tests/unit/test_confirmation_resolution.py                                                    62       1  98.39%   19
packages/ragbits-chat/tests/unit/test_conversation.py                                                              122       1  99.18%   64
packages/ragbits-chat/tests/unit/test_error_response.py                                                             52       0  100.00%
packages/ragbits-chat/tests/unit/test_generic_custom_response.py                                                   192       6  96.88%   132, 165, 184, 199, 227, 252
packages/ragbits-chat/tests/unit/test_types.py                                                                      14       0  100.00%
packages/ragbits-chat/tests/unit/test_upload.py                                                                     33       1  96.97%   18
packages/ragbits-chat/tests/unit/auth/test_list_auth_backend.py                                                    251       0  100.00%
packages/ragbits-chat/tests/unit/auth/test_session_store.py                                                         94       0  100.00%
packages/ragbits-chat/tests/unit/history/test_llm_compressor.py                                                     64       0  100.00%
packages/ragbits-chat/tests/unit/persistence/test_sql.py                                                            74       0  100.00%
packages/ragbits-cli/src/ragbits/cli/__init__.py                                                                    35       4  88.57%   80-81, 88-89
packages/ragbits-cli/src/ragbits/cli/_utils.py                                                                      23       4  82.61%   47, 65-67
packages/ragbits-cli/src/ragbits/cli/state.py                                                                       79       3  96.20%   50-51, 61
packages/ragbits-cli/tests/unit/test_state.py                                                                       72       2  97.22%   103-104
packages/ragbits-cli/tests/unit/test_utils.py                                                                       23       0  100.00%
packages/ragbits-core/src/ragbits/core/__init__.py                                                                  16       4  75.00%   20-21, 25-26
packages/ragbits-core/src/ragbits/core/cli.py                                                                        6       0  100.00%
packages/ragbits-core/src/ragbits/core/options.py                                                                   17       0  100.00%
packages/ragbits-core/src/ragbits/core/types.py                                                                      9       0  100.00%
packages/ragbits-core/src/ragbits/core/audit/__init__.py                                                             5       0  100.00%
packages/ragbits-core/src/ragbits/core/audit/metrics/__init__.py                                                    30      14  53.33%   39-56, 64
packages/ragbits-core/src/ragbits/core/audit/metrics/base.py                                                        49       0  100.00%
packages/ragbits-core/src/ragbits/core/audit/traces/__init__.py                                                     80       9  88.75%   49-52, 55-58, 66-69
packages/ragbits-core/src/ragbits/core/audit/traces/base.py                                                        187      60  67.91%   156-165, 178-179, 200, 215-216, 220, 230, 233-234, 249, 256, 258-260, 266-268, 275-278, 338-349, 356-364, 378, 394-413
packages/ragbits-core/src/ragbits/core/audit/traces/cli.py                                                         133      29  78.20%   89-94, 113-140, 157, 164, 173-174, 177-178
packages/ragbits-core/src/ragbits/core/embeddings/__init__.py                                                        4       0  100.00%
packages/ragbits-core/src/ragbits/core/embeddings/base.py                                                           32       5  84.38%   20-21, 24, 77, 92
packages/ragbits-core/src/ragbits/core/embeddings/exceptions.py                                                     17       7  58.82%   7-8, 17, 26-27, 36, 45
packages/ragbits-core/src/ragbits/core/embeddings/dense/__init__.py                                                  4       0  100.00%
packages/ragbits-core/src/ragbits/core/embeddings/dense/base.py                                                      9       1  88.89%   44
packages/ragbits-core/src/ragbits/core/embeddings/dense/fastembed.py                                                35       3  91.43%   34, 62-63
packages/ragbits-core/src/ragbits/core/embeddings/dense/litellm.py                                                  58      12  79.31%   19, 134-139, 142, 146-148, 169
packages/ragbits-core/src/ragbits/core/embeddings/dense/local.py                                                    32       5  84.38%   13-14, 52, 68-69
packages/ragbits-core/src/ragbits/core/embeddings/dense/noop.py                                                     32       1  96.88%   99
packages/ragbits-core/src/ragbits/core/embeddings/dense/vertex_multimodal.py                                        60      24  60.00%   13-14, 57, 62, 102-123, 139-148, 175, 194-198
packages/ragbits-core/src/ragbits/core/embeddings/sparse/__init__.py                                                 4       0  100.00%
packages/ragbits-core/src/ragbits/core/embeddings/sparse/bag_of_tokens.py                                           43       1  97.67%   53
packages/ragbits-core/src/ragbits/core/embeddings/sparse/base.py                                                    12       1  91.67%   48
packages/ragbits-core/src/ragbits/core/embeddings/sparse/fastembed.py                                               31       2  93.55%   25, 52
packages/ragbits-core/src/ragbits/core/llms/__init__.py                                                              4       0  100.00%
packages/ragbits-core/src/ragbits/core/llms/base.py                                                                261      23  91.19%   163-170, 173-181, 188-192, 251, 253-256, 287, 318, 499
packages/ragbits-core/src/ragbits/core/llms/exceptions.py                                                           29       6  79.31%   17, 26-27, 36, 45, 63
packages/ragbits-core/src/ragbits/core/llms/factory.py                                                              12       2  83.33%   30, 51
packages/ragbits-core/src/ragbits/core/llms/litellm.py                                                             242     108  55.37%   28, 141, 159-160, 197, 226, 248, 292-407, 440, 468, 472-477, 506-515, 525-573, 583, 612
packages/ragbits-core/src/ragbits/core/llms/local.py                                                               111      37  66.67%   14, 69, 79-80, 94-95, 101, 107, 119-120, 212-279, 294-295
packages/ragbits-core/src/ragbits/core/llms/mock.py                                                                 50       2  96.00%   126, 130
packages/ragbits-core/src/ragbits/core/prompt/__init__.py                                                            2       0  100.00%
packages/ragbits-core/src/ragbits/core/prompt/_cli.py                                                               53      22  58.49%   37-45, 59-61, 69-80, 88-90, 102-110
packages/ragbits-core/src/ragbits/core/prompt/base.py                                                               45       1  97.78%   26
packages/ragbits-core/src/ragbits/core/prompt/discovery.py                                                          36       2  94.44%   55-56
packages/ragbits-core/src/ragbits/core/prompt/exceptions.py                                                         13       1  92.31%   17
packages/ragbits-core/src/ragbits/core/prompt/parsers.py                                                            35       0  100.00%
packages/ragbits-core/src/ragbits/core/prompt/prompt.py                                                            189       7  96.30%   105-107, 178, 181, 257, 361
packages/ragbits-core/src/ragbits/core/sources/__init__.py                                                          10       0  100.00%
packages/ragbits-core/src/ragbits/core/sources/azure.py                                                             95      13  86.32%   65-66, 92-102, 189-190
packages/ragbits-core/src/ragbits/core/sources/base.py                                                              74       3  95.95%   46, 185-186
packages/ragbits-core/src/ragbits/core/sources/exceptions.py                                                        16       0  100.00%
packages/ragbits-core/src/ragbits/core/sources/gcs.py                                                               63       0  100.00%
packages/ragbits-core/src/ragbits/core/sources/git.py                                                               94       3  96.81%   188, 195, 211
packages/ragbits-core/src/ragbits/core/sources/google_drive.py                                                     285     169  40.70%   109-112, 128-143, 157-180, 187, 198-217, 263, 276-282, 298-333, 346-406, 416-473, 490-509, 536, 539-542, 545-552, 555-558, 575-576, 583-585, 589-593
packages/ragbits-core/src/ragbits/core/sources/hf.py                                                                72      17  76.39%   55-58, 62-63, 83-86, 106-110, 138, 145-146
packages/ragbits-core/src/ragbits/core/sources/local.py                                                             41       2  95.12%   39, 80
packages/ragbits-core/src/ragbits/core/sources/s3.py                                                               105      17  83.81%   54-57, 75, 88-93, 117, 128-131, 162, 179
packages/ragbits-core/src/ragbits/core/sources/web.py                                                               41       1  97.56%   75
packages/ragbits-core/src/ragbits/core/utils/__init__.py                                                             2       0  100.00%
packages/ragbits-core/src/ragbits/core/utils/_pyproject.py                                                          38       1  97.37%   113
packages/ragbits-core/src/ragbits/core/utils/config_handling.py                                                     79       9  88.61%   17, 55-56, 63-64, 133, 163-165
packages/ragbits-core/src/ragbits/core/utils/decorators.py                                                          29       0  100.00%
packages/ragbits-core/src/ragbits/core/utils/dict_transformations.py                                               143      35  75.52%   24, 27, 80, 90, 110-115, 126-133, 147-151, 166-167, 173, 185-191, 195, 254
packages/ragbits-core/src/ragbits/core/utils/function_schema.py                                                     90      19  78.89%   105, 113-127, 134-147, 160, 205, 210, 213-215
packages/ragbits-core/src/ragbits/core/utils/helpers.py                                                             11       0  100.00%
packages/ragbits-core/src/ragbits/core/utils/lazy_litellm.py                                                        30       1  96.67%   38
packages/ragbits-core/src/ragbits/core/utils/pydantic.py                                                            13       2  84.62%   13, 16
packages/ragbits-core/src/ragbits/core/utils/secrets.py                                                             18       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/__init__.py                                                     3       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/_cli.py                                                        50       4  92.00%   67, 89, 95, 119
packages/ragbits-core/src/ragbits/core/vector_stores/base.py                                                       103       3  97.09%   53, 214, 286
packages/ragbits-core/src/ragbits/core/vector_stores/chroma.py                                                      91       2  97.80%   74, 112
packages/ragbits-core/src/ragbits/core/vector_stores/hybrid.py                                                      34       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/hybrid_strategies.py                                           65       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/in_memory.py                                                   59       0  100.00%
packages/ragbits-core/src/ragbits/core/vector_stores/pgvector.py                                                   190      15  92.11%   97, 106-109, 125, 168, 312-313, 338-340, 373-375
packages/ragbits-core/src/ragbits/core/vector_stores/qdrant.py                                                      97       5  94.85%   80-95, 160, 181
packages/ragbits-core/src/ragbits/core/vector_stores/weaviate.py                                                   127       5  96.06%   104-132, 271
packages/ragbits-core/tests/conftest.py                                                                             12       0  100.00%
packages/ragbits-core/tests/cli/__init__.py                                                                          0       0  100.00%
packages/ragbits-core/tests/cli/test_cli_trace_handler.py                                                           47       3  93.62%   29, 42, 55
packages/ragbits-core/tests/cli/test_vector_store.py                                                               115       0  100.00%
packages/ragbits-core/tests/integration/sources/test_git.py                                                         68       6  91.18%   147-156
packages/ragbits-core/tests/integration/sources/test_hf.py                                                          19       9  52.63%   16-21, 32-37
packages/ragbits-core/tests/integration/sources/test_s3.py                                                          42       0  100.00%
packages/ragbits-core/tests/integration/vector_stores/__init__.py                                                    0       0  100.00%
packages/ragbits-core/tests/integration/vector_stores/test_keyword_search.py                                        79       0  100.00%
packages/ragbits-core/tests/integration/vector_stores/test_vector_store.py                                         140       1  99.29%   51
packages/ragbits-core/tests/integration/vector_stores/test_vector_store_sparse.py                                   63       0  100.00%
packages/ragbits-core/tests/unit/__init__.py                                                                         0       0  100.00%
packages/ragbits-core/tests/unit/test_options.py                                                                    21       0  100.00%
packages/ragbits-core/tests/unit/audit/test_cli.py                                                                 107       0  100.00%
packages/ragbits-core/tests/unit/audit/test_metrics.py                                                              35       7  80.00%   14-19, 23
packages/ragbits-core/tests/unit/audit/test_trace.py                                                                98       3  96.94%   17, 20, 23
packages/ragbits-core/tests/unit/embeddings/test_bag_of_tokens.py                                                   52       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_fastembed.py                                                       50       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_from_config.py                                                     39       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_litellm.py                                                         84       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_local.py                                                           42       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_noop.py                                                            26       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_vector_size.py                                                     33       0  100.00%
packages/ragbits-core/tests/unit/embeddings/test_vertex_multimodal.py                                               39       0  100.00%
packages/ragbits-core/tests/unit/llms/__init__.py                                                                    0       0  100.00%
packages/ragbits-core/tests/unit/llms/test_base.py                                                                 196       3  98.47%   77-80
packages/ragbits-core/tests/unit/llms/test_from_config.py                                                           27       0  100.00%
packages/ragbits-core/tests/unit/llms/test_litellm.py                                                              216       3  98.61%   170-173
packages/ragbits-core/tests/unit/llms/test_local.py                                                                 74       0  100.00%
packages/ragbits-core/tests/unit/llms/factory/__init__.py                                                            0       0  100.00%
packages/ragbits-core/tests/unit/llms/factory/test_get_preferred_llm.py                                             12       0  100.00%
packages/ragbits-core/tests/unit/prompts/__init__.py                                                                 0       0  100.00%
packages/ragbits-core/tests/unit/prompts/test_parsers.py                                                            65       0  100.00%
packages/ragbits-core/tests/unit/prompts/test_prompt.py                                                            334       1  99.70%   777
packages/ragbits-core/tests/unit/prompts/discovery/__init__.py                                                       0       0  100.00%
packages/ragbits-core/tests/unit/prompts/discovery/prompt_classes_for_tests.py                                      30       0  100.00%
packages/ragbits-core/tests/unit/prompts/discovery/test_prompt_discovery.py                                         18       0  100.00%
packages/ragbits-core/tests/unit/prompts/discovery/ragbits_tests_pkg_with_prompts/__init__.py                        2       1  50.00%   3
packages/ragbits-core/tests/unit/prompts/discovery/ragbits_tests_pkg_with_prompts/prompts/__init__.py                3       2  33.33%   2-4
packages/ragbits-core/tests/unit/prompts/discovery/ragbits_tests_pkg_with_prompts/prompts/temp_prompt1.py           14       0  100.00%
packages/ragbits-core/tests/unit/prompts/discovery/ragbits_tests_pkg_with_prompts/prompts/temp_prompt2.py           14       0  100.00%
packages/ragbits-core/tests/unit/sources/test_aws.py                                                                23       0  100.00%
packages/ragbits-core/tests/unit/sources/test_azure.py                                                              70       0  100.00%
packages/ragbits-core/tests/unit/sources/test_exceptions.py                                                         22       0  100.00%
packages/ragbits-core/tests/unit/sources/test_gcs.py                                                                33       6  81.82%   42-47
packages/ragbits-core/tests/unit/sources/test_git.py                                                               110       0  100.00%
packages/ragbits-core/tests/unit/sources/test_google_drive.py                                                      135      50  62.96%   27-32, 50, 64-102, 187-227
packages/ragbits-core/tests/unit/sources/test_hf.py                                                                 12       0  100.00%
packages/ragbits-core/tests/unit/sources/test_local.py                                                              13       0  100.00%
packages/ragbits-core/tests/unit/sources/test_source_discriminator.py                                               36       0  100.00%
packages/ragbits-core/tests/unit/sources/test_web.py                                                                43       0  100.00%
packages/ragbits-core/tests/unit/utils/__init__.py                                                                   0       0  100.00%
packages/ragbits-core/tests/unit/utils/test_config_handling.py                                                      76       2  97.37%   27-28
packages/ragbits-core/tests/unit/utils/test_decorators.py                                                           26       2  92.31%   17, 39
packages/ragbits-core/tests/unit/utils/test_dict_transformations.py                                                 98       0  100.00%
packages/ragbits-core/tests/unit/utils/test_function_schema.py                                                      16       2  87.50%   19, 32
packages/ragbits-core/tests/unit/utils/test_helpers.py                                                               6       0  100.00%
packages/ragbits-core/tests/unit/utils/test_secrets.py                                                              24       0  100.00%
packages/ragbits-core/tests/unit/utils/pyproject/test_find.py                                                       13       0  100.00%
packages/ragbits-core/tests/unit/utils/pyproject/test_get_config.py                                                  9       0  100.00%
packages/ragbits-core/tests/unit/utils/pyproject/test_get_instace.py                                                37       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_base.py                                                          6       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_chroma.py                                                       81       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_from_config.py                                                  55       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_hybrid.py                                                       74       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_hybrid_strategies.py                                            31       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_in_memory.py                                                   102       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_pgvector.py                                                    262       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_qdrant.py                                                      100       0  100.00%
packages/ragbits-core/tests/unit/vector_stores/test_weaviate.py                                                    142       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/__init__.py                                             2       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/_main.py                                               91       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/cli.py                                                 40       2  95.00%   86, 105
packages/ragbits-document-search/src/ragbits/document_search/documents/__init__.py                                   0       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/documents/document.py                                  78       2  97.44%   49, 93
packages/ragbits-document-search/src/ragbits/document_search/documents/element.py                                   86      14  83.72%   97, 115, 179-187, 197, 206-208
packages/ragbits-document-search/src/ragbits/document_search/ingestion/__init__.py                                   0       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/__init__.py                         4       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/base.py                            21       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/exceptions.py                      14       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/image.py                           30       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/enrichers/router.py                          25       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/__init__.py                           3       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/base.py                              28       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/docling.py                           48       4  91.67%   12-13, 100, 161
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/exceptions.py                        14       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/router.py                            27       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/unstructured.py                      66      24  63.64%   102, 121-123, 135-156, 176-190, 212-213, 233-248
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/__init__.py                      8       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/callbacks.py                    10       1  90.00%   32
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/exceptions.py                   16      10  37.50%   25-33, 49-52
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/hyperlink_callback.py           38      12  68.42%   44-69, 72, 81, 84
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/metadata_callback.py            29       9  68.97%   52-71, 74
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/parser.py                       43       6  86.05%   60-62, 71-73
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/pptx/speaker_notes_callback.py       31      13  58.06%   41-68, 71
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/__init__.py                        5       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/base.py                          102      21  79.41%   152-156, 212-242, 284
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/batched.py                        69       8  88.41%   172, 200-215, 255-256
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/ray.py                            32       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/strategies/sequential.py                      4       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/__init__.py                                   0       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rephrasers/__init__.py                        4       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rephrasers/base.py                           14       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rephrasers/llm.py                            40       5  87.50%   51, 115-118
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rephrasers/noop.py                            8       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/__init__.py                         3       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/answerai.py                        29       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/base.py                            19       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/litellm.py                         27       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/llm.py                             59       1  98.31%   173
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/noop.py                            10       0  100.00%
packages/ragbits-document-search/src/ragbits/document_search/retrieval/rerankers/rrf.py                             28       2  92.86%   50, 60
packages/ragbits-document-search/tests/cli/custom_cli_source.py                                                     22       1  95.45%   32
packages/ragbits-document-search/tests/cli/test_ingest.py                                                           56       0  100.00%
packages/ragbits-document-search/tests/cli/test_search.py                                                           71       0  100.00%
packages/ragbits-document-search/tests/integration/__init__.py                                                       0       0  100.00%
packages/ragbits-document-search/tests/integration/test_docling.py                                                  10       0  100.00%
packages/ragbits-document-search/tests/integration/test_pptx_parser.py                                              54       9  83.33%   32-34, 52, 71, 74-75, 78-79
packages/ragbits-document-search/tests/integration/test_rerankers.py                                                32       9  71.88%   32-39, 59-64
packages/ragbits-document-search/tests/integration/test_unstructured.py                                             12       4  66.67%   62-67
packages/ragbits-document-search/tests/unit/test_config.py                                                          63       0  100.00%
packages/ragbits-document-search/tests/unit/test_document_parser_router.py                                          24       0  100.00%
packages/ragbits-document-search/tests/unit/test_document_parsers.py                                                47       0  100.00%
packages/ragbits-document-search/tests/unit/test_document_search.py                                                238       1  99.58%   480
packages/ragbits-document-search/tests/unit/test_document_search_ingest_errors.py                                   38       0  100.00%
packages/ragbits-document-search/tests/unit/test_documents.py                                                       13       0  100.00%
packages/ragbits-document-search/tests/unit/test_element_enricher_router.py                                         23       0  100.00%
packages/ragbits-document-search/tests/unit/test_element_enrichers.py                                               56       1  98.21%   25
packages/ragbits-document-search/tests/unit/test_elements.py                                                        21       0  100.00%
packages/ragbits-document-search/tests/unit/test_ingest_strategies.py                                               43       0  100.00%
packages/ragbits-document-search/tests/unit/test_llm_reranker.py                                                    43       0  100.00%
packages/ragbits-document-search/tests/unit/test_rephrasers.py                                                      26       0  100.00%
packages/ragbits-document-search/tests/unit/test_rerankers.py                                                       80       1  98.75%   25
packages/ragbits-document-search/tests/unit/testprojects/project_with_instance_factory/__init__.py                   0       0  100.00%
packages/ragbits-document-search/tests/unit/testprojects/project_with_instance_factory/factories.py                 22       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/__init__.py                                                           0       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/cli.py                                                               46       3  93.48%   133, 135, 137
packages/ragbits-evaluate/src/ragbits/evaluate/evaluator.py                                                         92       1  98.91%   221
packages/ragbits-evaluate/src/ragbits/evaluate/optimizer.py                                                         92      18  80.43%   162-168, 187, 190-191, 194, 198-204, 207-210
packages/ragbits-evaluate/src/ragbits/evaluate/utils.py                                                             58      37  36.21%   31-50, 62-69, 98-101, 117-129, 140-149, 159-160
packages/ragbits-evaluate/src/ragbits/evaluate/dataloaders/__init__.py                                               2       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/dataloaders/base.py                                                  34       4  88.24%   58-60, 79
packages/ragbits-evaluate/src/ragbits/evaluate/dataloaders/document_search.py                                       13       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/dataloaders/exceptions.py                                            10       5  50.00%   10-12, 21-25
packages/ragbits-evaluate/src/ragbits/evaluate/metrics/__init__.py                                                   2       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/metrics/base.py                                                      27       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/metrics/document_search.py                                           23       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/pipelines/__init__.py                                                17       3  82.35%   19-20, 36
packages/ragbits-evaluate/src/ragbits/evaluate/pipelines/base.py                                                    24       0  100.00%
packages/ragbits-evaluate/src/ragbits/evaluate/pipelines/document_search.py                                         38       6  84.21%   68-71, 80-84
packages/ragbits-evaluate/tests/cli/test_run_evaluation.py                                                          25       0  100.00%
packages/ragbits-evaluate/tests/unit/test_evaluator.py                                                             103       0  100.00%
packages/ragbits-evaluate/tests/unit/test_metrics.py                                                                77       0  100.00%
packages/ragbits-evaluate/tests/unit/test_optimizer.py                                                              68       0  100.00%
packages/ragbits-guardrails/src/ragbits/guardrails/__init__.py                                                       0       0  100.00%
packages/ragbits-guardrails/src/ragbits/guardrails/base.py                                                          15       0  100.00%
packages/ragbits-guardrails/src/ragbits/guardrails/openai_moderation.py                                             19       6  68.42%   27-34
packages/ragbits-guardrails/tests/unit/test_openai_moderation.py                                                    35       0  100.00%
TOTAL                                                                                                            17124    2062  87.96%

Diff against main

Filename                                                                                     Stmts    Miss  Cover
-----------------------------------------------------------------------------------------  -------  ------  --------
packages/ragbits-agents/src/ragbits/agents/__init__.py                                          +1       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/_main.py                                            +11      +1  +0.25%
packages/ragbits-agents/src/ragbits/agents/confirmation.py                                      +2       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/history.py                                           +5       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/hooks/manager.py                                    +10       0  +0.25%
packages/ragbits-agents/src/ragbits/agents/tools/__init__.py                                    +2       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/tools/memory.py                                     +66       0  +100.00%
packages/ragbits-agents/src/ragbits/agents/tools/planning.py                                  +100     +64  +36.00%
packages/ragbits-agents/tests/unit/test_agent.py                                               +30       0  +0.02%
packages/ragbits-agents/tests/unit/test_history.py                                             +20       0  +100.00%
packages/ragbits-agents/tests/unit/hooks/test_manager.py                                       +29       0  +0.50%
packages/ragbits-agents/tests/unit/tools/test_memory.py                                        +94       0  +100.00%
packages/ragbits-chat/src/ragbits/chat/interface/_interface.py                                 +33      +1  +1.82%
packages/ragbits-chat/src/ragbits/chat/interface/types.py                                      +12      +5  -0.96%
packages/ragbits-chat/src/ragbits/chat/interface/ui_customization.py                            +2       0  +100.00%
packages/ragbits-chat/tests/unit/test_confirmation_resolution.py                               +62      +1  +98.39%
packages/ragbits-document-search/src/ragbits/document_search/ingestion/parsers/docling.py        0      +2  -4.16%
packages/ragbits-evaluate/src/ragbits/evaluate/pipelines/__init__.py                            +6      +2  -8.56%
TOTAL                                                                                         +485     +76  -0.10%

Results for commit: 1b3ae55

Minimum allowed coverage is 60%

♻️ This comment has been updated with latest results

…irmations

mkoruszowic added 6 commits April 19, 2026 17:42

fix confirmation hook matching across conversation turns with tool_na…

ea2972e

…me fallback

feat(agents): add Agent.execute_tool_directly for chat-layer confirma…

5d2fa2f

…tion resume

feat(agents): add inject_tool_call helper for synthetic tool-use hist…

1849030

…ory injection

feat(chat): add ChatInterface helpers to resolve pending confirmation…

2a41f77

…s via direct tool execution

docs(examples): update file_explorer_agent to resolve confirmations v…

41cceed

…ia direct execution

style: apply ruff formatting and lint fixes to new test files

c6526c2

mkoruszowic marked this pull request as draft April 19, 2026 16:29

mkoruszowic added 2 commits April 19, 2026 19:04

refactor(chat): mutate agent.history in place in resolve_pending_conf…

33ff7c2

…irmations

docs: add changelog entries for #969 confirmation flow fix

1b3ae55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: resolve confirmed tools by direct execution to prevent LLM argument drift#972

fix: resolve confirmed tools by direct execution to prevent LLM argument drift#972
mkoruszowic wants to merge 8 commits into
mainfrom
mk/fix-confirmation-flow-969

mkoruszowic commented Apr 19, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 19, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mkoruszowic commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Chat-layer usage (one agent, minimal glue)

New public API

Example

Test plan

Follow-ups (not in this PR)

Uh oh!

github-actions Bot commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage Summary

Diff against main

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mkoruszowic commented Apr 19, 2026 •

edited

Loading

github-actions Bot commented Apr 19, 2026 •

edited

Loading