feat(nodes): add ArangoDB db_arango node#1294
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughIntroduces a complete ChangesArangoDB Node (db_arango)
Sequence Diagram(s)sequenceDiagram
rect rgba(100, 100, 200, 0.5)
Note over Pipeline,ArangoDB: Pipeline startup
Pipeline->>IGlobal: beginGlobal()
IGlobal->>ArangoDB: _open_database() with auth
IGlobal->>ArangoDB: _reflect_schema()
ArangoDB-->>IGlobal: graph_schema cached
end
rect rgba(100, 200, 100, 0.5)
Note over User,ArangoDB: NL question flow
User->>IInstance: writeQuestions(NL question)
loop up to max_attempts
IInstance->>LLM: _buildAqlQueryOnce(schema_ctx)
LLM-->>IInstance: generated AQL (JSON)
IInstance->>IGlobal: _validate_query() via EXPLAIN
IGlobal->>ArangoDB: db.aql.explain(aql)
ArangoDB-->>IGlobal: execution plan
IGlobal-->>IInstance: (is_valid, message)
end
IInstance->>IGlobal: _run_query(safe AQL)
IGlobal->>ArangoDB: execute with runtime/memory caps
ArangoDB-->>IGlobal: cursor rows (≤ max_execute_rows)
IGlobal-->>IInstance: truncated result list
IInstance-->>User: emit answers + markdown table
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Biome (2.5.0)nodes/src/nodes/db_arango/services.jsonFile contains syntax errors that prevent linting: Line 2: Expected a property but instead found '//'.; Line 6: End of file expected; Line 6: End of file expected; Line 6: End of file expected; Line 6: End of file expected; Line 11: End of file expected; Line 11: End of file expected; Line 11: End of file expected; Line 11: End of file expected; Line 16: End of file expected; Line 16: End of file expected; Line 16: End of file expected; Line 16: End of file expected; Line 22: End of file expected; Line 22: End of file expected; Line 22: End of file expected; Line 22: End of file expected; Line 28: End of file expected; Line 28: End of file expected; Line 28: End of file expected; Line 28: End of file expected; Line 34: End of file expected; Line 34: End of file expected; Line 34: End of file expected; Line 34: End of file expected; Line 40: End of file expected; Line 40: End of file expected; Line 40: End of file expected; Line 40: End of file expected; Line 45: End of file expected; Li ... [truncated 850 characters] ... End of file expected; Line 76: End of file expected; Line 82: End of file expected; Line 82: End of file expected; Line 83: Expected a property but instead found '// Define the values that will be merged into any profile configuration'.; Line 82: End of file expected; Line 83: End of file expected; Line 85: End of file expected; Line 85: End of file expected; Line 85: End of file expected; Line 85: End of file expected; Line 87: End of file expected; Line 87: End of file expected; Line 87: End of file expected; Line 92: End of file expected; Line 99: End of file expected; Line 99: End of file expected; Line 99: End of file expected; Line 210: End of file expected; Line 216: End of file expected; Line 216: End of file expected; Line 216: End of file expected; Line 223: End of file expected 🔧 markdownlint-cli2 (0.22.1)nodes/src/nodes/db_arango/README.mdmarkdownlint-cli2 v0.22.1 (markdownlint v0.40.0) Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🤖 Internal: Discord sync markerAuto-managed by the Discord notification workflow. Stores the linked Discord message ID. Do not edit or delete. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nodes/src/nodes/db_arango/IInstance.py`:
- Around line 103-106: When `_buildAqlQuery` exhausts its validation retries, it
currently returns the last payload unchanged, which causes `get_aql` to treat a
previously rejected query as valid. Identify where `_buildAqlQuery` returns
after exhausting retries and explicitly mark the result with `valid=False`
before returning it. This ensures that downstream callers like `get_data`
properly recognize exhausted retries as invalid and handle them correctly
according to the tool contract.
In `@nodes/test/test_db_arango.py`:
- Around line 164-200: The cleanup in the finally block does not restore
pre-existing db_arango modules that may have been in sys.modules before this
test fixture ran. Before creating the new db_arango package and its submodules
(utils, IGlobal, IInstance), save any existing db_arango-related entries in
sys.modules to a separate dictionary. Then in the finally block, after removing
the scaffold modules, restore these saved db_arango entries just as the code
currently restores the stubbed modules in the saved dictionary. This ensures
that if db_arango modules existed before the fixture ran, they are properly
restored afterward, preventing import-state leakage between tests in a shared
pytest session.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 23705b92-1541-4e36-9272-46da2c150e3e
⛔ Files ignored due to path filters (1)
nodes/src/nodes/db_arango/arango.svgis excluded by!**/*.svg
📒 Files selected for processing (9)
examples/db_arango.pipenodes/src/nodes/db_arango/IGlobal.pynodes/src/nodes/db_arango/IInstance.pynodes/src/nodes/db_arango/README.mdnodes/src/nodes/db_arango/__init__.pynodes/src/nodes/db_arango/requirements.txtnodes/src/nodes/db_arango/services.jsonnodes/src/nodes/db_arango/utils.pynodes/test/test_db_arango.py
f1ab60c to
0ca0de3
Compare
|
Addressed both CodeRabbit suggestions in the latest push:
Added 2 unit tests for these paths. Green locally: 93 unit + 276 contract, |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nodes/src/nodes/db_arango/IGlobal.py`:
- Around line 189-192: The row limit check in the loop occurs after appending
the document to the rows list, which allows max_rows + 1 rows to be accumulated
before raising the error. Move the limit check before the rows.append(doc)
statement so that the check prevents exceeding the limit at exactly max_rows
rows, making it consistent with the _run_query method's behavior of checking
before appending.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a5cf48e8-9883-4359-af04-4108bfd796ab
⛔ Files ignored due to path filters (1)
nodes/src/nodes/db_arango/arango.svgis excluded by!**/*.svg
📒 Files selected for processing (9)
examples/db_arango.pipenodes/src/nodes/db_arango/IGlobal.pynodes/src/nodes/db_arango/IInstance.pynodes/src/nodes/db_arango/README.mdnodes/src/nodes/db_arango/__init__.pynodes/src/nodes/db_arango/requirements.txtnodes/src/nodes/db_arango/services.jsonnodes/src/nodes/db_arango/utils.pynodes/test/test_db_arango.py
0ca0de3 to
3939b24
Compare
There was a problem hiding this comment.
Actionable comments posted: 7
♻️ Duplicate comments (1)
nodes/test/test_db_arango.py (1)
779-781:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not codify fail-open EXPLAIN handling.
Line 779 currently treats unknown plan shapes as safe. Since
IGlobal._validate_queryreturns valid whenever_plan_is_modification(plan)is false, malformed or unrecognized EXPLAIN responses bypass the primary read-only gate. Keep_plan_nodes()returning[]if useful, but validation should reject empty/unrecognized plan shapes.Suggested regression
- def test_unknown_shape_is_safe(self): + def test_unknown_shape_has_no_plan_nodes(self): assert _plan_nodes('garbage') == [] assert _plan_is_modification(None) is False + + def test_unknown_explain_shape_is_rejected_by_validation(self): + ig = IGlobal.__new__(IGlobal) + ig.db = SimpleNamespace(aql=SimpleNamespace(explain=lambda aql: 'garbage')) + ok, msg = ig._validate_query('FOR u IN users RETURN u') + assert ok is False + assert 'plan' in msg.lower()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@nodes/test/test_db_arango.py` around lines 779 - 781, The test at test_unknown_shape_is_safe codifies a fail-open pattern where unrecognized or malformed EXPLAIN responses are treated as safe. Since IGlobal._validate_query uses _plan_is_modification to gate read-only access, returning False for unrecognized plans (like when _plan_is_modification receives None) bypasses the primary validation gate. Rather than treating empty or unrecognized plan shapes as safe defaults, update the validation logic in IGlobal._validate_query to explicitly reject empty or unrecognized plan shapes instead of passing them through as valid. You may keep _plan_nodes returning an empty list if useful, but ensure that the validation layer detects and rejects unrecognized EXPLAIN responses rather than defaulting to a permissive posture.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nodes/src/nodes/db_arango/IGlobal.py`:
- Around line 175-195: The _run_query_raw method is missing memory_limit
protection when executing AQL queries, which leaves the EXECUTE path vulnerable
to unbounded server-side memory consumption. Add a memory_limit parameter to the
_run_query_raw method signature (similar to the existing max_runtime parameter)
and pass it to the self.db.aql.execute call to mirror the memory protection
already implemented in the _run_query method and ensure consistent resource
limits across both query execution paths.
In `@nodes/src/nodes/db_arango/IInstance.py`:
- Around line 103-113: The limit parameter is passed to the LLM via get_aql()
but is not enforced on the actual query results returned by _run_query(). After
the query execution in the try block, the rows returned should be explicitly
limited to respect the requested limit. Slice the rows list to enforce the limit
before returning, so that the response dictionary containing rows never exceeds
the caller's requested row count regardless of what the generated AQL query
returns.
- Around line 467-469: The headers for the markdown table are being derived only
from the first row using first.keys(), which silently drops any fields that
appear in later rows since ArangoDB documents can be heterogeneous. Instead of
using only the keys from the first row, collect all unique keys from all rows in
the result by iterating through the entire result list and gathering a
comprehensive set of all field names, then convert this set to the headers list.
This ensures that all columns from all rows are included in the markdown table
output.
- Around line 323-327: When the AQL safety check via _is_aql_safe(aql) fails,
the result is returned unchanged, which can allow unsafe queries to pass through
to downstream callers if the LLM set isValid=true. Modify the return statement
in the condition that checks "if not is_valid or not aql or not
_is_aql_safe(aql)" to explicitly set result['isValid'] = False before returning
the result, ensuring unsafe AQL is properly marked as invalid for downstream
processing.
In `@nodes/src/nodes/db_arango/requirements.txt`:
- Line 1: Add a version pin constraint to the `python-arango` dependency in the
requirements.txt file to ensure deterministic builds. The `depends()` call in
`__init__.py` installs this requirement at import-time, and without a version
constraint, upstream releases can introduce behavior changes across deployments.
Review the version pinning patterns used in other db node requirements files
(clickhouse, mysql, postgres) and apply the same format to pin `python-arango`
to a tested stable version.
In `@nodes/src/nodes/db_arango/utils.py`:
- Around line 39-42: The _UNSAFE_AQL regex pattern and similar patterns at the
referenced locations do not account for keywords appearing inside string
literals or quoted identifiers, causing false rejections when safe keywords
appear in literal values and potential issues when comment markers appear in
strings. To fix this, either mask all string and quoted-identifier literals
before removing comments from the AQL query, ensuring the keyword detection runs
only on actual code, or alternatively run the EXPLAIN gate on every non-raw
execution path to validate query safety. This issue applies to both the
_UNSAFE_AQL pattern definition and the similar pattern at the other referenced
location.
In `@nodes/test/test_db_arango.py`:
- Around line 340-342: The test_modification_keyword_in_comment_is_ignored
method currently only covers write keywords appearing in comments marked with
//, but does not cover the case where // appears inside string literals (such as
in URLs like "http://example.com"). Add regression test cases to
test_modification_keyword_in_comment_is_ignored that verify _is_aql_safe
correctly handles scenarios where string literals containing // patterns appear
before actual write keywords (such as INSERT, UPDATE, or REMOVE). These tests
should ensure that queries like LET url = "http://example.com" INSERT {url} INTO
users are correctly identified as unsafe, not incorrectly stripped and marked as
safe.
---
Duplicate comments:
In `@nodes/test/test_db_arango.py`:
- Around line 779-781: The test at test_unknown_shape_is_safe codifies a
fail-open pattern where unrecognized or malformed EXPLAIN responses are treated
as safe. Since IGlobal._validate_query uses _plan_is_modification to gate
read-only access, returning False for unrecognized plans (like when
_plan_is_modification receives None) bypasses the primary validation gate.
Rather than treating empty or unrecognized plan shapes as safe defaults, update
the validation logic in IGlobal._validate_query to explicitly reject empty or
unrecognized plan shapes instead of passing them through as valid. You may keep
_plan_nodes returning an empty list if useful, but ensure that the validation
layer detects and rejects unrecognized EXPLAIN responses rather than defaulting
to a permissive posture.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4daf8fbb-1e46-4f36-bd43-cd71a58e02de
⛔ Files ignored due to path filters (1)
nodes/src/nodes/db_arango/arango.svgis excluded by!**/*.svg
📒 Files selected for processing (9)
examples/db_arango.pipenodes/src/nodes/db_arango/IGlobal.pynodes/src/nodes/db_arango/IInstance.pynodes/src/nodes/db_arango/README.mdnodes/src/nodes/db_arango/__init__.pynodes/src/nodes/db_arango/requirements.txtnodes/src/nodes/db_arango/services.jsonnodes/src/nodes/db_arango/utils.pynodes/test/test_db_arango.py
| @@ -0,0 +1 @@ | |||
| python-arango | |||
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== db_arango requirement =="
sed -n '1,40p' nodes/src/nodes/db_arango/requirements.txt
echo
echo "== import-time dependency installs in db nodes =="
rg -n --type=py 'depends\(' nodes/src/nodes -g '*/__init__.py' -C1 | rg 'db_'
echo
echo "== version spec style across db node requirements =="
fd -a requirements.txt nodes/src/nodes | rg '/db_' | while read -r f; do
echo "---- $f ----"
sed -n '1,80p' "$f"
echo
doneRepository: rocketride-org/rocketride-server
Length of output: 169
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the actual structure of db_arango
echo "=== db_arango directory structure ==="
find nodes/src/nodes/db_arango -type f | head -20
echo
echo "=== db_arango __init__.py content ==="
if [ -f nodes/src/nodes/db_arango/__init__.py ]; then
cat -n nodes/src/nodes/db_arango/__init__.py
fi
echo
echo "=== Search for import of python-arango or arango in db nodes ==="
rg -n 'arango|python-arango' nodes/src/nodes/db_arango -t py
echo
echo "=== All requirements.txt files in db nodes ==="
find nodes/src/nodes -name 'requirements.txt' -path '*/db_*' | sortRepository: rocketride-org/rocketride-server
Length of output: 4652
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Requirements content for all db nodes ==="
for req_file in nodes/src/nodes/db_*/requirements.txt; do
echo "---- $req_file ----"
cat "$req_file"
echo
doneRepository: rocketride-org/rocketride-server
Length of output: 530
Pin python-arango to a tested version for deterministic builds.
The depends() call in __init__.py installs this requirement at import-time. Without a version constraint, upstream releases can cause behavior changes across deployments. Other db nodes (clickhouse, mysql, postgres) pin their dependencies; align with this pattern.
🧰 Tools
🪛 OSV Scanner (2.3.8)
[HIGH] 1-1: requests 2.9.2: undefined
(PYSEC-2018-28)
[HIGH] 1-1: requests 2.9.2: undefined
(PYSEC-2023-74)
[HIGH] 1-1: requests 2.9.2: Requests vulnerable to .netrc credentials leak via malicious URLs
[HIGH] 1-1: requests 2.9.2: Requests Session object does not verify requests after making first request with verify=False
[HIGH] 1-1: requests 2.9.2: Requests has Insecure Temp File Reuse in its extract_zipped_paths() utility function
[HIGH] 1-1: requests 2.9.2: Unintended leak of Proxy-Authorization header in requests
[HIGH] 1-1: requests 2.9.2: Insufficiently Protected Credentials in Requests
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@nodes/src/nodes/db_arango/requirements.txt` at line 1, Add a version pin
constraint to the `python-arango` dependency in the requirements.txt file to
ensure deterministic builds. The `depends()` call in `__init__.py` installs this
requirement at import-time, and without a version constraint, upstream releases
can introduce behavior changes across deployments. Review the version pinning
patterns used in other db node requirements files (clickhouse, mysql, postgres)
and apply the same format to pin `python-arango` to a tested stable version.
|
Latest push addresses 2 of CodeRabbit's newest batch and declines the rest with reasons: Applied
Declined (intentional)
Green locally: 96 unit + 276 contract; live ArangoDB 3.12 harness 14/14. |
3939b24 to
798847f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nodes/src/nodes/db_arango/README.md`:
- Line 31: The word "markdown" on line 31 should be capitalized to "Markdown"
since it is a proper noun. Update the text that currently says "markdown table"
to "Markdown table" for consistency with UI and documentation style conventions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 64bae20f-0251-4482-8fbd-0373c66f4fee
⛔ Files ignored due to path filters (1)
nodes/src/nodes/db_arango/arango.svgis excluded by!**/*.svg
📒 Files selected for processing (9)
examples/db_arango.pipenodes/src/nodes/db_arango/IGlobal.pynodes/src/nodes/db_arango/IInstance.pynodes/src/nodes/db_arango/README.mdnodes/src/nodes/db_arango/__init__.pynodes/src/nodes/db_arango/requirements.txtnodes/src/nodes/db_arango/services.jsonnodes/src/nodes/db_arango/utils.pynodes/test/test_db_arango.py
Add `db_arango`, a dual database/tool node that answers natural-language questions against ArangoDB by translating them to read-only AQL via a connected LLM. Mirrors `db_neo4j` (the non-SQL NL->query pattern; ArangoDB uses python-arango + AQL over HTTP, with no SQLAlchemy dialect). - Tools get_data/get_schema/get_aql; lanes questions -> [table, text, answers] with DIALECT discovery and gated raw EXECUTE handling. get_data honors the requested row limit. - Multi-model schema reflection: document/edge collections with sampled fields and indexes, named graphs (edge definitions), and ArangoSearch views. - Read-only enforced by EXPLAIN-plan modification-node inspection (primary) plus a string-literal-aware keyword blocklist (defense-in-depth); per-query result/runtime/memory caps on both the read and EXECUTE paths; allow_execute off by default. Queries that fail EXPLAIN validation or generate unsafe AQL are marked invalid so callers report an error rather than treating them as valid. - Markdown output unions keys across all rows (ArangoDB is schemaless). - Ships nodes/test/test_db_arango.py (100 network-free unit tests) and examples/db_arango.pipe. Validated: ruff clean; 100 unit tests + full contract suite (276) pass; live ArangoDB 3.12 harness 14/14 (reflection, graph traversal, explain-plan gate); live IDE canvas (real LLM -> real DB): document query, graph traversal, aggregation, read-only refusal, and non-query fallback all verified. `./builder nodes:test` + `nodes:test-contracts` pass without the C++ toolchain; the full `./builder test` (engine build) runs in CI. No services.json test block (db-node carve-out: the test framework needs a live DB; db_neo4j/postgres/mysql/ clickhouse all omit it). Committed with --no-verify (lefthook gitleaks/ruff not on PATH); ruff + format + secret scan run manually. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Another round addressing CodeRabbit's remaining flags:
+4 unit tests (now 100 unit + 276 contract, ruff clean). Still declining one: pin |
798847f to
c71d37d
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nodes/src/nodes/db_arango/services.json`:
- Around line 164-177: The schema definitions in services.json specify bounds
for arangodb.max_attempts (1-20) and arangodb.max_execute_rows (1-1000000), but
these constraints are not enforced at runtime in IGlobal.py lines 78-105. Add
runtime validation to clamp max_attempts within the [1, 20] range and enforce
the upper bound of 1000000 for max_execute_rows (note that max_execute_rows
already has a lower bound check in place). This ensures that non-UI
configurations cannot bypass the defined limits and cause excessive retry
behavior or oversized result windows.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e22877d8-819a-411a-a156-b313f22f5933
⛔ Files ignored due to path filters (1)
nodes/src/nodes/db_arango/arango.svgis excluded by!**/*.svg
📒 Files selected for processing (9)
examples/db_arango.pipenodes/src/nodes/db_arango/IGlobal.pynodes/src/nodes/db_arango/IInstance.pynodes/src/nodes/db_arango/README.mdnodes/src/nodes/db_arango/__init__.pynodes/src/nodes/db_arango/requirements.txtnodes/src/nodes/db_arango/services.jsonnodes/src/nodes/db_arango/utils.pynodes/test/test_db_arango.py
| "arangodb.max_attempts": { | ||
| "type": "integer", | ||
| "title": "Max validation attempts", | ||
| "default": 5, | ||
| "minimum": 1, | ||
| "maximum": 20, | ||
| "description": "Maximum number of times to re-ask the LLM if EXPLAIN rejects the generated AQL query" | ||
| }, | ||
| "arangodb.max_execute_rows": { | ||
| "type": "integer", | ||
| "title": "Max result rows", | ||
| "default": 25000, | ||
| "minimum": 1, | ||
| "maximum": 1000000, |
There was a problem hiding this comment.
Enforce these limits at runtime, not only in schema metadata.
Lines 164-177 define strict bounds, but nodes/src/nodes/db_arango/IGlobal.py:78-105 currently does not clamp max_attempts to [1, 20] and only lower-bounds max_execute_rows. Non-UI configs can therefore bypass these limits and cause excessive retry/cost behavior or oversized result windows.
Suggested fix
--- a/nodes/src/nodes/db_arango/IGlobal.py
+++ b/nodes/src/nodes/db_arango/IGlobal.py
@@
- try:
- self.max_validation_attempts = int(config.get('max_attempts', 5))
- except (ValueError, TypeError):
- self.max_validation_attempts = 5
+ try:
+ self.max_validation_attempts = int(config.get('max_attempts', 5))
+ except (ValueError, TypeError):
+ self.max_validation_attempts = 5
+ self.max_validation_attempts = min(20, max(1, self.max_validation_attempts))
@@
- try:
- self.max_execute_rows = max(1, int(config.get('max_execute_rows', DEFAULT_MAX_EXECUTE_ROWS)))
- except (TypeError, ValueError):
- self.max_execute_rows = DEFAULT_MAX_EXECUTE_ROWS
+ try:
+ self.max_execute_rows = int(config.get('max_execute_rows', DEFAULT_MAX_EXECUTE_ROWS))
+ except (TypeError, ValueError):
+ self.max_execute_rows = DEFAULT_MAX_EXECUTE_ROWS
+ self.max_execute_rows = min(1_000_000, max(1, self.max_execute_rows))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@nodes/src/nodes/db_arango/services.json` around lines 164 - 177, The schema
definitions in services.json specify bounds for arangodb.max_attempts (1-20) and
arangodb.max_execute_rows (1-1000000), but these constraints are not enforced at
runtime in IGlobal.py lines 78-105. Add runtime validation to clamp max_attempts
within the [1, 20] range and enforce the upper bound of 1000000 for
max_execute_rows (note that max_execute_rows already has a lower bound check in
place). This ensures that non-UI configurations cannot bypass the defined limits
and cause excessive retry behavior or oversized result windows.
Summary
db_arango— a dual["database","tool"]node that answers natural-language questions against ArangoDB by translating them to read-only AQL via a connected LLM (mirrorsdb_neo4j).get_data/get_schema/get_aql; lanesquestions → [table, text, answers]with DIALECT discovery and gatedEXECUTE.allow_executeoff by default.Why this fits the codebase
ArangoDB, like Neo4j, has no SQLAlchemy dialect — it's
python-arango+ AQL over HTTP. So this is a near-verbatim clone ofdb_neo4j(which subclasses the genericIGlobalBase/IInstanceBase, not the SQLDatabaseGlobalBase), with AQL/HTTP swapped for Cypher/Bolt.What changed (10 files, purely additive — no
packages/changes)nodes/src/nodes/db_arango/:services.json,IGlobal.py(driver lifecycle, execute/explain, multi-model reflection, caps),IInstance.py(tools + lane handlers + NL→AQL validate/retry),utils.py(read-only gate + EXPLAIN-plan inspection),__init__.py,requirements.txt(python-arango),arango.svg,README.md; plusnodes/test/test_db_arango.py(91 unit tests) andexamples/db_arango.pipe.Type
feature (new pipeline + agent-tool node) + tests + example + docs
Testing
nodes/test/test_db_arango.py, 91 network-free unit tests (sys.modules stub pattern).examples/db_arango.pipe../builder test— ran./builder nodes:test(1101 pass/71 skip) andnodes:test-contracts(276) without the C++ toolchain; the full engine build runs in CI.services.jsontestblock — db-node carve-out: the test framework runs a live pipeline (real DB connect) and has no python-arango mock, sodb_neo4j/db_postgres/db_mysql/db_clickhouseall omit it; the 91 unit tests carry the network-free coverage.Checklist
${VAR}placeholders)README.mdincludedLinked Issue
Closes #1293
Summary by CodeRabbit
New Features
get_aql), query results (get_data), and reflected schema (get_schema).Documentation
Tests
Chores