feat: add local E2E sanity test harness #461
Conversation
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review.
Note: Reviews paused. This branch is under active development; to avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds a local sanity CI system: npm scripts to run sanity flows, Dockerfiles and a Docker Compose environment for multi-DB testing, many Bash helpers and phased test scripts (verify-install, smoke, resilience, verify-upgrade), PR-specific test generation/runner, fixtures, and orchestrating entrypoints.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CI as CI Script (ci-local.sh)
    participant Compose as Docker Compose
    participant DB as DB Services (Pg/MySQL/MSSQL/Redshift)
    participant Sanity as Sanity Container
    participant CLI as Altimate CLI
    participant Assert as Assertion Libs
    CI->>CI: choose mode (fast / full / pr / upgrade)
    CI->>Compose: docker compose up --build (full / upgrade)
    Compose->>DB: start DB services & run healthchecks
    DB-->>Compose: healthy
    Compose->>Sanity: start sanity service (depends_on healthy)
    Sanity->>CLI: run phase scripts (verify-install, smoke, resilience, verify-upgrade)
    CLI->>DB: execute queries / integrations
    DB-->>CLI: return results
    CLI-->>Sanity: emit outputs / exit codes
    Sanity->>Assert: validate outputs, write per-test results
    Sanity-->>CI: exit code / container result
    CI->>Compose: docker compose down --volumes
    CI->>CI: aggregate + report final result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Force-pushed from 467b5ab to 09ee72c
Actionable comments posted: 10
🧹 Nitpick comments (11)
test/sanity/lib/assert.sh (2)
19-28: Same `$?` issue applies here, though less impactful.

For `assert_exit_nonzero`, the exit code isn't printed so it's less visible, but for consistency you may want to capture it if you ever want to log what the actual exit code was.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/assert.sh` around lines 19 - 28, In assert_exit_nonzero, capture the command's exit code and include it in the FAIL message for consistency: run the command (still redirecting output to /dev/null 2>&1), store its exit status in a variable (e.g., rc = $?), then test rc to decide PASS/FAIL and use rc in the failure echo; refer to the assert_exit_nonzero function and the rc variable when making the change.
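A minimal sketch of that change, under the assumption that the real `assert_exit_nonzero` takes a description plus a command and maintains `PASS_COUNT`/`FAIL_COUNT` like the other helpers quoted in this review:

```shell
#!/usr/bin/env bash
PASS_COUNT=0
FAIL_COUNT=0

# Hypothetical reworked assert_exit_nonzero: run the command, capture its
# exit code into rc, and surface rc in the FAIL message instead of discarding it.
assert_exit_nonzero() {
  local desc="$1"; shift
  "$@" > /dev/null 2>&1
  local rc=$?   # must be captured before any other command runs
  if [ "$rc" -ne 0 ]; then
    echo "  PASS: $desc"
    PASS_COUNT=$((PASS_COUNT + 1))
  else
    echo "  FAIL: $desc (expected nonzero exit, got rc=$rc)"
    FAIL_COUNT=$((FAIL_COUNT + 1))
  fi
}

assert_exit_nonzero "false exits nonzero" false
assert_exit_nonzero "true exits nonzero" true
echo "pass=$PASS_COUNT fail=$FAIL_COUNT"
```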
80-91: Integer comparison may fail silently on non-numeric input.

The `2>/dev/null` suppresses errors when `$actual` or `$expected` are not valid integers, causing the assertion to fail silently with a misleading message. Consider validating inputs or using a more explicit error message.

Proposed improvement:

```diff
 assert_ge() {
   local actual="$1"
   local expected="$2"
   local desc="$3"
-  if [ "$actual" -ge "$expected" ] 2>/dev/null; then
+  if ! [[ "$actual" =~ ^[0-9]+$ ]] || ! [[ "$expected" =~ ^[0-9]+$ ]]; then
+    echo "  FAIL: $desc (non-numeric: actual='$actual', expected='$expected')"
+    ((FAIL_COUNT++))
+  elif [ "$actual" -ge "$expected" ]; then
     echo "  PASS: $desc"
     ((PASS_COUNT++))
   else
     echo "  FAIL: $desc (got $actual, expected >= $expected)"
     ((FAIL_COUNT++))
   fi
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/assert.sh` around lines 80 - 91, The assert_ge function currently silences numeric errors via "2>/dev/null", which hides non-numeric input problems; update assert_ge to validate that both "$actual" and "$expected" are valid integers (e.g., using a regex like '^[+-]?[0-9]+$' with [[ ... =~ ... ]]) before performing the numeric comparison, and if validation fails emit a clear error/FAIL message that includes the offending values and increment FAIL_COUNT instead of hiding the error; keep the numeric comparison logic (the -ge test) but remove the error suppression so failures are explicit and the function name assert_ge is still used.

test/sanity/lib/parallel.sh (2)
4-5: Declare and assign separately to avoid masking return values.

The `local` declaration masks the exit code of the command substitution. If `nproc` or `sysctl` fails silently, you won't detect it. While the fallback chain handles this functionally, separating declaration and assignment is cleaner.

Proposed fix:

```diff
 detect_parallelism() {
-  local cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo "2")
+  local cores
+  cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo "2")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/parallel.sh` around lines 4 - 5, In detect_parallelism(), don't declare and assign cores in one local statement because the command substitution masks its exit status; instead, declare local cores on its own, then perform the fallback command substitution into cores and check the exit status (or use a conditional fallback chain) so failures of nproc/sysctl are detectable and handled; update references to the variable name cores accordingly and ensure the function returns a non-zero status when all detection attempts fail.
46-65: Batch completion waits for all jobs before starting next batch.The current implementation waits for the entire batch to complete before starting new jobs. This means if one job finishes quickly, its slot stays idle until the slowest job in the batch completes. A more efficient approach would use a job slot pool, but for a test harness with ~10 smoke tests, this is likely acceptable.
Alternative: slot-based parallelism (optional enhancement)
For future consideration if test times vary significantly:
```shell
run_parallel() {
  local max_parallel="$1"; shift
  local results=()
  for cmd in "$@"; do
    # Wait if at capacity
    while [ "$(jobs -r | wc -l)" -ge "$max_parallel" ]; do
      wait -n 2>/dev/null || sleep 0.1
    done
    eval "$cmd" &
    results+=($!)
  done
  # Wait for all remaining
  local failed=0
  for pid in "${results[@]}"; do
    wait "$pid" || ((failed++))
  done
  [ "$failed" -eq 0 ]
}
```

Note: `wait -n` requires Bash 4.3+, which may not be available in all environments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/parallel.sh` around lines 46 - 65, The run_parallel function currently batches commands and waits for all pids in pids[] to finish before starting the next batch, which leaves slots idle; change run_parallel to use slot-based concurrency: before launching each command check running count against max_parallel and if at capacity wait for any single job to finish (use wait -n to reap one child and capture its exit status into results[], removing its pid from pids[]), falling back to a small sleep+jobs -r loop if wait -n isn’t available; after dispatching all commands, wait for remaining pids in pids[] and return success/failure based on collected results[]. Ensure you update handling of results[] and remove the batch counter variable if unused.

test/sanity/phases/resilience.sh (2)
86-96: API key restoration may not happen on unexpected exit.

If the script exits between `unset ANTHROPIC_API_KEY` (line 89) and the restoration (line 95), the key remains unset for subsequent processes. Consider using a subshell or trap.

♻️ Proposed fix using subshell:

```diff
 # 6. Graceful on missing provider key
 echo "  [6/7] Missing API key handling..."
-SAVED_KEY="${ANTHROPIC_API_KEY:-}"
-unset ANTHROPIC_API_KEY
-OUTPUT=$(timeout 10 altimate run --max-turns 1 --yolo "hello" 2>&1 || true)
+OUTPUT=$(
+  unset ANTHROPIC_API_KEY
+  timeout 10 altimate run --max-turns 1 --yolo "hello" 2>&1 || true
+)
 # Should get a clean error, not an unhandled exception / stack trace
 assert_not_contains "$OUTPUT" "TypeError" "no TypeError on missing key"
 assert_not_contains "$OUTPUT" "Cannot read properties" "no unhandled error on missing key"
-if [ -n "$SAVED_KEY" ]; then
-  export ANTHROPIC_API_KEY="$SAVED_KEY"
-fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` around lines 86 - 96, The test unsets ANTHROPIC_API_KEY and only restores it later, which can leave the environment modified if the script exits early; update the block around SAVED_KEY/ANTHROPIC_API_KEY so the original key is always restored on exit by either executing the unset/restore inside a subshell or installing a trap that captures the original value (SAVED_KEY) and resets ANTHROPIC_API_KEY on EXIT; locate and modify the section that declares SAVED_KEY, unsets ANTHROPIC_API_KEY, runs altimate, and restores the key to ensure restoration happens even on unexpected exits.
18-18: Add error handling for `cd` command.

Per shellcheck SC2164, if the `cd` fails (e.g., `mktemp` returned a path that was immediately deleted), subsequent commands would run in an unexpected directory.

♻️ Proposed fix:

```diff
 WORKDIR=$(mktemp -d /tmp/sanity-resilience-XXXXXX)
-cd "$WORKDIR"
+cd "$WORKDIR" || { echo "FAIL: could not cd to $WORKDIR"; exit 1; }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` at line 18, The lone cd "$WORKDIR" lacks error handling and can silently fail; update the script to check the exit status of cd (the cd "$WORKDIR" invocation) and abort the script on failure — for example, immediately follow cd "$WORKDIR" with a failing-exit branch that prints an explanatory error to stderr and exits non-zero (or use a helper die/err function if one exists) so subsequent commands never run in the wrong directory.

test/sanity/ci-local.sh (1)
52-72: Docker cleanup may not run on early failure.

If any step between lines 53-70 fails and causes an early exit (despite `set -e` being disabled by `run_step`), the cleanup on line 72 won't execute, leaving containers running. Consider using a trap for cleanup.

Also, the hardcoded `sleep 10` on line 56 is fragile—services may need more or less time depending on the host.

♻️ Proposed fix with trap-based cleanup:

```diff
 if [ "$MODE" = "--full" ] || [ "$MODE" = "full" ]; then
   echo ""
   echo "=== Full CI (Docker) ==="
+  cleanup_docker() {
+    docker compose -f "$REPO_ROOT/test/sanity/docker-compose.yml" down --volumes --remove-orphans 2>/dev/null || true
+  }
+  trap cleanup_docker EXIT
+
   # Driver E2E with Docker containers
   run_step "Docker Services Up" docker compose -f "$REPO_ROOT/test/sanity/docker-compose.yml" up -d postgres mysql mssql redshift
-  echo "  Waiting for services to be healthy..."
-  sleep 10
+  echo "  Waiting for services to be healthy (up to 30s)..."
+  for i in {1..30}; do
+    if docker compose -f "$REPO_ROOT/test/sanity/docker-compose.yml" ps --status running | grep -q postgres; then
+      break
+    fi
+    sleep 1
+  done
   # ... tests ...
-  docker compose -f "$REPO_ROOT/test/sanity/docker-compose.yml" down --volumes --remove-orphans 2>/dev/null
+  trap - EXIT
+  cleanup_docker
 fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/ci-local.sh` around lines 52 - 72, Wrap the docker bring-up and test steps in a guarded section that installs a trap to always run the existing cleanup command (docker compose ... down --volumes --remove-orphans) on EXIT, so containers are removed even if any run_step or tests fail; specifically, add a trap that calls the same cleanup command before running run_step "Docker Services Up" and the subsequent run_step test blocks (referencing the run_step invocation names and the docker compose command from the diff). Replace the fixed sleep 10 with a retrying health-check loop that polls container health or attempts TCP connects to TEST_*_PORTs with a timeout/attempt limit (used before running the local and docker E2E run_step tests) so the script waits deterministically for services instead of sleeping.

test/sanity/phases/smoke-tests.sh (2)
52-61: Separate declaration from assignment to avoid masking errors.

The pattern `local output=$(get_output ...)` masks the exit code of `get_output`. If the function fails, the error is silently ignored. This applies to multiple test functions (lines 55, 72, 83, 114, 125, 136, 147).

♻️ Example fix for test_discover_mcps:

```diff
 test_discover_mcps() {
-  cd "$WORKDIR"
+  cd "$WORKDIR" || return 1
   altimate_run "discover-mcps" --command discover-and-add-mcps "list"
-  local output=$(get_output "discover-mcps")
+  local output
+  output=$(get_output "discover-mcps")
   if echo "$output" | grep -qi "command not found\|Unknown command"; then
```

Apply similar changes to other test functions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 52 - 61, The test functions (e.g., test_discover_mcps) currently use combined declaration+assignment like `local output=$(get_output ...)` which masks get_output's exit status; change each to declare the variable first (e.g., `local output`) then run the assignment as a separate statement so you can capture and check the command's exit code (inspect `$?` or use `if ! output=$(get_output ...) ; then ... fi`) and fail the test early when get_output errors before grepping; apply the same pattern to the other affected functions listed in the comment.
212-233: Remove unused variable and improve robustness.

Line 214 calculates `num` but it's never used in the loop. This appears to be dead code.

Additionally, the static analysis warnings about `cd` without error handling (SC2164) and combined declare+assign (SC2155) are valid. While the risk is low in this test context, separating declaration from assignment prevents masking failures from `get_output`.

♻️ Proposed fix for result collection:

```diff
 # Collect results
 echo ""
 echo "  Results:"
 for name in "${TEST_NAMES[@]}"; do
   result=$(cat "$RESULTS_DIR/$name" 2>/dev/null || echo "MISSING")
-  num=$((${#TESTS[@]} - ${#TEST_NAMES[@]} + 1))
   case "$result" in
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 212 - 233, Remove the unused dead variable `num` computed in the loop (derived from `TESTS`/`TEST_NAMES`) and simplify the loop to only read results from `RESULTS_DIR` for each `name` in `TEST_NAMES`; also fix shellcheck warnings by not mixing declaration+assignment when capturing output from helper functions (separate declaration and assignment where `get_output` is used) and add explicit error handling for any `cd` calls used earlier (e.g., check `cd` return or use pushd/popd) so failures are not masked; locate changes around the loop that references TEST_NAMES/RESULTS_DIR and update related counters PASS_COUNT/FAIL_COUNT/SKIP_COUNT as-is.

test/sanity/Dockerfile (1)
4-6: Add `--no-install-recommends` to reduce image size.

The static analysis correctly identifies that `--no-install-recommends` is missing. This flag prevents installation of unnecessary recommended packages, reducing image size and attack surface.

♻️ Proposed fix:

```diff
 # System deps (what a real user would have)
-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
     git python3 python3-pip python3-venv curl sqlite3 \
     && rm -rf /var/lib/apt/lists/*
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile` around lines 4 - 6, The apt-get install command in the Dockerfile RUN line should include --no-install-recommends to avoid pulling recommended packages; update the RUN instruction that executes "apt-get update && apt-get install -y git python3 python3-pip python3-venv curl sqlite3" to add the --no-install-recommends flag (preserving the existing cleanup "&& rm -rf /var/lib/apt/lists/*") so the image size and attack surface are reduced.

test/sanity/pr-tests/generate.sh (1)
9-9: Use explicit no-op for file truncation.

The bare redirection `> "$MANIFEST"` works but is flagged by shellcheck (SC2188). Using an explicit command improves clarity.

♻️ Proposed fix:

```diff
-> "$MANIFEST"
+: > "$MANIFEST"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/pr-tests/generate.sh` at line 9, Replace the bare redirection that truncates the file (" > \"$MANIFEST\"") with an explicit no-op redirection to satisfy shellcheck SC2188; e.g., use the null command prefix (colon) to truncate the file (: > "$MANIFEST") so the truncation is explicit—update the occurrence that references the MANIFEST variable in generate.sh.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/sanity/Dockerfile`:
- Around line 32-35: The Dockerfile currently hardcodes the architecture and
binary name (copying from
dist/@altimateai/altimate-code-linux-arm64/bin/altimate), which breaks on x86_64
and doesn't match the postinstall symlink naming; update the RUN step to detect
platform/arch the same way as postinstall.mjs and copy the actual binary target
(follow the postinstall symlink name like bin/.altimate-code -> altimate-code)
into /home/testuser/.local/bin/altimate, ensuring you compute the correct dist
directory name (e.g., altimate-code-<platform>-<arch> or resolve the
.altimate-code symlink) instead of using the hardcoded altimate-code-linux-arm64
and adjust the source path to the real binary (altimate-code) before setting
executable permissions.
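A hedged sketch of the platform/arch detection step described above. The `altimate-code-<platform>-<arch>` directory naming comes from the comment itself; the uname-to-Node-arch mapping and the `dist/@altimateai` prefix are assumptions inferred from the hardcoded path quoted there:

```shell
#!/usr/bin/env bash
# Hypothetical arch-detection sketch: map uname output to the Node-style
# platform/arch names used in the dist directory naming described above.
set -euo pipefail

platform=$(uname -s | tr '[:upper:]' '[:lower:]')   # linux, darwin, ...
case "$(uname -m)" in
  x86_64)          arch=x64   ;;
  aarch64 | arm64) arch=arm64 ;;
  *) echo "unsupported arch: $(uname -m)" >&2; exit 1 ;;
esac

dist_dir="dist/@altimateai/altimate-code-${platform}-${arch}"
echo "$dist_dir"
```

The same two-variable computation could drive the Dockerfile's COPY/RUN step instead of the hardcoded `altimate-code-linux-arm64` path.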
In `@test/sanity/Dockerfile.upgrade`:
- Around line 17-18: The Dockerfile uses ARG PRIOR_VERSION=latest which makes
the upgrade baseline implicit and fragile; change the Dockerfile.upgrade to
require an explicit build arg (use ARG PRIOR_VERSION without a default) and keep
the RUN npm install -g `@altimateai/altimate-code`@${PRIOR_VERSION} line as-is,
then update your CI/compose build invocation to pass a concrete PRIOR_VERSION
build-arg (e.g., via --build-arg PRIOR_VERSION=<version>) so tests always run
against a reproducible prior release.
In `@test/sanity/fixtures/compaction-session.sql`:
- Around line 6-7: The seed INSERT into the session table is missing the
required NOT NULL slug column so the row is ignored; update the INSERT statement
for session (the INSERT OR IGNORE INTO session (...) VALUES (...)) to include
the slug column name in the column list and add an appropriate slug string in
the corresponding VALUES tuple (e.g. a URL-safe or derived slug like
"compaction-test" or a deterministic slug from the title) so the row inserts
successfully.
In `@test/sanity/lib/cleanup.sh`:
- Around line 4-10: Source the cleanup helper and register its functions as a
trap in the runner/entrypoint so the compose stack and temp outputs are removed
on early interrupt: add a one-line "source" of the cleanup library at the top of
the entrypoint script (so cleanup_sanity_outputs and cleanup_docker are
available), then install a trap (e.g., trap 'cleanup_docker;
cleanup_sanity_outputs' EXIT INT TERM) before invoking docker compose up; ensure
the trap runs before starting the compose stack so interrupts always call
cleanup_docker and cleanup_sanity_outputs.
In `@test/sanity/phases/resilience.sh`:
- Around line 76-78: The test uses a literal true in assert_exit_0 so it always
passes; instead capture the real exit status of the altimate_run_with_turns
invocation and pass that to assert_exit_0. Concretely, run
altimate_run_with_turns "compaction" 3 --continue "continue working", save its
exit code (e.g. into a variable like rc) and then call assert_exit_0 "compaction
did not crash" "$rc" (or otherwise assert that rc == 0) so the assertion
reflects the actual command result; update references to altimate_run_with_turns
and assert_exit_0 accordingly.
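Sketched out, with `altimate_run_with_turns` stubbed and `assert_exit_0` reduced to a minimal PASS/FAIL printer (both are stand-ins for the real helpers in the sanity lib):

```shell
#!/usr/bin/env bash
# Minimal stand-in for the assertion helper: takes a description and an rc.
assert_exit_0() {
  local desc="$1" rc="$2"
  if [ "$rc" -eq 0 ]; then echo "  PASS: $desc"; else echo "  FAIL: $desc (rc=$rc)"; fi
}

altimate_run_with_turns() { return 0; }   # stub for the real helper

# Capture the real exit status instead of asserting on a literal `true`.
rc=0
altimate_run_with_turns "compaction" 3 --continue "continue working" || rc=$?
assert_exit_0 "compaction did not crash" "$rc"
```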
In `@test/sanity/phases/smoke-tests.sh`:
- Around line 91-109: The tests unconditionally write "PASS" after calling
altimate_run; change test_postgres and test_snowflake to check altimate_run's
result (capture its exit code or stdout) and only write "PASS" to
"$RESULTS_DIR/postgres" or "$RESULTS_DIR/snowflake" on success, otherwise write
"FAIL" (and optionally include error output) and return a non-zero code;
specifically, after calling altimate_run in test_postgres and test_snowflake,
inspect $? (or use if altimate_run ...; then ...) to decide between echo "PASS"
> "$RESULTS_DIR/..." and echo "FAIL" > "$RESULTS_DIR/..." so failures aren’t
masked.
- Around line 63-67: The test_configure_claude function currently writes "PASS"
unconditionally; change it to capture and check the result of the altimate_run
invocation (use its exit code or captured output from altimate_run) and only
write "PASS" to RESULTS_DIR/configure-claude when altimate_run succeeds,
otherwise write "FAIL" (and optionally include the captured stderr/stdout or
exit code for diagnostics); update references in test_configure_claude to use
the captured status from altimate_run to decide the file contents.
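Both smoke-test comments reduce to the same pattern: gate the result-file write on the helper's exit status. A sketch with `altimate_run` stubbed (the `RESULTS_DIR` convention mirrors the variable named in the comments):

```shell
#!/usr/bin/env bash
RESULTS_DIR=$(mktemp -d)

altimate_run() { [ "$1" = "postgres" ]; }   # stub: succeeds only for "postgres"

# Write PASS only when the run actually succeeded; otherwise record FAIL
# and propagate a nonzero status so failures aren't masked.
test_db() {
  local name="$1"
  if altimate_run "$name"; then
    echo "PASS" > "$RESULTS_DIR/$name"
  else
    echo "FAIL" > "$RESULTS_DIR/$name"
    return 1
  fi
}

test_db postgres || true
test_db snowflake || true
cat "$RESULTS_DIR/postgres" "$RESULTS_DIR/snowflake"
```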
In `@test/sanity/phases/verify-install.sh`:
- Around line 18-19: The SKILL_COUNT assignment using ls | wc -l || echo "0" can
produce a multiline or duplicated result under pipefail; replace it with a
deterministic find-based count to reliably produce a single numeric value.
Update the SKILL_COUNT assignment (the variable set near the assert_ge call) to
use find on ~/.altimate/builtin to locate SKILL.md files (e.g. using
-mindepth/-maxdepth or appropriate path matching) and pipe that to wc -l
(redirecting stderr to /dev/null) so SKILL_COUNT is always a single integer
before calling assert_ge.
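A sketch of the find-based count against a throwaway directory tree. The `builtin/<skill>/SKILL.md` layout is inferred from the comment; adjust the `-mindepth`/`-maxdepth` flags to the real layout:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway tree standing in for ~/.altimate/builtin.
root=$(mktemp -d)
mkdir -p "$root/builtin/skill-a" "$root/builtin/skill-b"
touch "$root/builtin/skill-a/SKILL.md" "$root/builtin/skill-b/SKILL.md"

# find emits one path per line, so wc -l yields a single integer even under
# pipefail -- unlike the ls | wc -l || echo "0" chain flagged above.
SKILL_COUNT=$(find "$root/builtin" -mindepth 2 -maxdepth 2 -name SKILL.md 2>/dev/null | wc -l | tr -d ' ')
echo "SKILL_COUNT=$SKILL_COUNT"
```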
In `@test/sanity/pr-tests/generate.sh`:
- Around line 67-70: The current upgrade test emits a no-op command via
emit_test "upgrade-needed" "echo TRIGGER_UPGRADE_TEST" which always exits 0 and
creates a false positive; replace that emitted command so it either (a) emits a
clear skip marker that exits non-zero with an explanatory message (so
run-pr-tests.sh will mark it skipped/failed) or (b) invokes the real upgrade
verification script (e.g., call the project's upgrade-check/run-upgrade-test
entrypoint) instead of echo; update the emit_test invocation for
"upgrade-needed" accordingly so run-pr-tests.sh's eval runs a meaningful check
(refer to the emit_test invocation, the "upgrade-needed" test name, and the echo
TRIGGER_UPGRADE_TEST string to locate the change).
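A sketch of option (a) — emitting a command that exits nonzero with an explicit skip message, so the runner's `eval` cannot report a false pass. The tab-separated manifest format and the message text here are assumptions for illustration:

```shell
#!/usr/bin/env bash
MANIFEST=$(mktemp)
emit_test() { printf '%s\t%s\n' "$1" "$2" >> "$MANIFEST"; }   # hypothetical format

# Instead of `echo TRIGGER_UPGRADE_TEST` (which always exits 0), emit a
# command that explains itself and exits nonzero.
emit_test "upgrade-needed" \
  "echo 'SKIP: upgrade flow must be verified via the upgrade test entrypoint' >&2; exit 3"

# What the runner would see for this entry (subshell so `exit 3` is contained):
cmd=$(awk -F'\t' '$1 == "upgrade-needed" { print $2 }' "$MANIFEST")
rc=0
( eval "$cmd" ) || rc=$?
echo "runner saw rc=$rc"
```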
In `@test/sanity/pr-tests/run-pr-tests.sh`:
- Around line 19-24: The post-increment arithmetic ((PASS_COUNT++)) and
((FAIL_COUNT++)) can return non-zero when the counter is 0 and, under set -e,
cause the script to exit early; replace those C-style increments with
assignment-based increments (use PASS_COUNT=$((PASS_COUNT+1)) and
FAIL_COUNT=$((FAIL_COUNT+1))) in the test loop where cmd and name are handled so
the increments always return 0 and allow report_results to run.
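The failure mode is easy to demonstrate: under `set -e`, `((PASS_COUNT++))` returns nonzero when the counter is 0, because the arithmetic expression evaluates to the pre-increment value. A sketch of the safe form:

```shell
#!/usr/bin/env bash
set -e

PASS_COUNT=0
# ((PASS_COUNT++))              # would abort the script here: the post-increment
                                # expression evaluates to 0, so the command fails
PASS_COUNT=$((PASS_COUNT + 1))  # safe: a plain assignment always exits 0
PASS_COUNT=$((PASS_COUNT + 1))
echo "PASS_COUNT=$PASS_COUNT"
```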
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 117018ed-b237-4d66-8866-5b2d4252d58c
📒 Files selected for processing (22)
- package.json
- test/sanity/.env.example
- test/sanity/Dockerfile
- test/sanity/Dockerfile.upgrade
- test/sanity/ci-local.sh
- test/sanity/docker-compose.yml
- test/sanity/fixtures/broken-config.json
- test/sanity/fixtures/compaction-session.sql
- test/sanity/fixtures/old-config.json
- test/sanity/fixtures/test.sql
- test/sanity/lib/altimate-run.sh
- test/sanity/lib/assert.sh
- test/sanity/lib/cleanup.sh
- test/sanity/lib/parallel.sh
- test/sanity/phases/resilience.sh
- test/sanity/phases/smoke-tests.sh
- test/sanity/phases/verify-install.sh
- test/sanity/phases/verify-upgrade.sh
- test/sanity/pr-tests/generate.sh
- test/sanity/pr-tests/run-pr-tests.sh
- test/sanity/results/baseline.json
- test/sanity/run.sh
Actionable comments posted: 3
🧹 Nitpick comments (11)
test/sanity/lib/parallel.sh (2)
36-38: Redundant cap check.

The tiered logic above already caps `parallel` at 6 (line 26), so this condition can never be true. It's harmless defensive code, but could be removed for clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/parallel.sh` around lines 36 - 38, Remove the redundant defensive cap that reassigns the variable "parallel" to 6 (the if block checking [ "$parallel" -gt 6 ] and setting parallel=6); the tiered logic above already enforces this cap, so delete that conditional block to simplify the script while leaving all earlier logic that sets "parallel" unchanged.
5-5: Consider separating declaration and assignment to preserve exit codes.

The static analyzer flags that combining `local` declaration with command substitution masks the return value. While the fallback `|| echo "2"` mitigates this for `nproc`/`sysctl`, separating them is safer practice.

♻️ Proposed fix:

```diff
 detect_parallelism() {
-  local cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo "2")
+  local cores
+  cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo "2")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/parallel.sh` at line 5, Split the declaration and assignment for the variable cores to avoid masking exit codes: declare local cores first (use the existing variable name "cores"), then assign it with the command substitution that tries nproc, falls back to sysctl -n hw.ncpu, and finally echoes "2"; update the line containing the command substitution so it becomes an assignment to the already-declared local variable, preserving the original fallback behavior.

test/sanity/pr-tests/generate.sh (2)
9-9: Prefer explicit no-op for file truncation.

While `> "$MANIFEST"` works in Bash, it's flagged by shellcheck. Using `: > "$MANIFEST"` or `true > "$MANIFEST"` is more portable and explicit.

♻️ Proposed fix

```diff
-> "$MANIFEST"
+: > "$MANIFEST"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/pr-tests/generate.sh` at line 9, Replace the bare redirection that truncates the manifest file by using an explicit no-op command before the redirection: change the line that currently uses > "$MANIFEST" to use a no-op redirection like : > "$MANIFEST" (or true > "$MANIFEST") so the truncation is explicit and shellcheck-friendly; update the same occurrence that references the MANIFEST variable in generate.sh.
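The no-op truncation can be sanity-checked in isolation; `MANIFEST` below is a temp file standing in for the real manifest:

```shell
# `:` is the shell's no-op builtin, so `: > file` truncates explicitly
# and keeps shellcheck (SC2188) quiet.
MANIFEST=$(mktemp)
printf 'stale entry\n' > "$MANIFEST"
: > "$MANIFEST"
size=$(( $(wc -c < "$MANIFEST") ))
echo "manifest size after truncation: $size"
rm -f "$MANIFEST"
```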
57-65: Potential duplicate resilience test registration.

If a PR changes both `session/` and `config/`, two identical `resilience.sh` entries are added to the manifest, causing redundant execution. Consider deduplicating or combining conditions.

♻️ Combined condition

```diff
-# session/compaction/storage changed → resilience
-if echo "$changed" | grep -qE "session/|compaction|storage/"; then
-  emit_test "resilience" "$SCRIPT_DIR/phases/resilience.sh"
-fi
-
-# config changed → backwards compat
-if echo "$changed" | grep -q "config/"; then
-  emit_test "config-compat" "$SCRIPT_DIR/phases/resilience.sh"
-fi
+# session/compaction/storage/config changed → resilience
+if echo "$changed" | grep -qE "session/|compaction|storage/|config/"; then
+  emit_test "resilience" "$SCRIPT_DIR/phases/resilience.sh"
+fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/pr-tests/generate.sh` around lines 57 - 65, The script can register the same resilience test twice because both the first grep and the config check call emit_test with "$SCRIPT_DIR/phases/resilience.sh"; to fix, ensure emit_test for resilience.sh is only called once by adding a guard: introduce a local flag (e.g., resilience_emitted) that's set when emit_test "resilience" "$SCRIPT_DIR/phases/resilience.sh" is called in the session/compaction/storage branch and check that flag before calling emit_test again in the config/ branch (or alternatively merge the two grep conditions into one grep -qE that includes "config/"); update any references to the changed variable, emit_test invocation, and resilience.sh to implement the single-emission guard.

test/sanity/phases/smoke-tests.sh (2)
214-214: Unused variable `num`.

This line calculates `num` which is always 1 (since `TESTS` and `TEST_NAMES` have equal length), but it's never used in the subsequent `case` block.

🧹 Remove dead code

```diff
 result=$(cat "$RESULTS_DIR/$name" 2>/dev/null || echo "MISSING")
-num=$((${#TESTS[@]} - ${#TEST_NAMES[@]} + 1))
 case "$result" in
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` at line 214, The variable num computed on the line "num=$((${#TESTS[@]} - ${#TEST_NAMES[@]} + 1))" is unused and always 1; remove that dead code by deleting the num assignment and any related unused references, leaving the case block to operate without it (look for the TESTS and TEST_NAMES arrays and the case block in smoke-tests.sh to verify nothing else depends on num).
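Why the value is always 1 when the two arrays keep the same length, shown with stand-in arrays:

```shell
# Two equal-length arrays: len(TESTS) - len(TEST_NAMES) + 1 is always 1.
TESTS=(test_a test_b test_c)
TEST_NAMES=(a b c)
num=$(( ${#TESTS[@]} - ${#TEST_NAMES[@]} + 1 ))
echo "num=$num"
```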
186-207: Consider using `run_parallel` from parallel.sh.

This manual batch loop duplicates logic from `run_parallel`. You could simplify by generating command strings for each test function and using `run_parallel "$MAX_PARALLEL" "${cmds[@]}"`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 186 - 207, The manual batching loop using idx, PIDS and wait duplicates existing logic; replace it by building an array of command strings from the TESTS array (e.g., for each element of TESTS push "${TESTS[$i]}" into cmds) and call the existing helper run_parallel with the concurrency limit (run_parallel "$MAX_PARALLEL" "${cmds[@]}") instead of the idx/PIDS loop; ensure you remove the manual batch loop and rely on run_parallel to manage background jobs and waiting.

test/sanity/Dockerfile (1)
4-6: Add `--no-install-recommends` to reduce image size and attack surface.

Same as Dockerfile.upgrade—installing recommended packages is unnecessary for test containers.
🐳 Proposed fix
```diff
-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
     git python3 python3-pip python3-venv curl sqlite3 \
     && rm -rf /var/lib/apt/lists/*
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile` around lines 4 - 6, Update the Dockerfile RUN line that installs packages (the apt-get update && apt-get install -y \ git python3 python3-pip python3-venv curl sqlite3 \ && rm -rf /var/lib/apt/lists/* command) to include --no-install-recommends in the apt-get install invocation so apt-get install uses apt-get install -y --no-install-recommends ... thereby reducing image size and surface area.

test/sanity/Dockerfile.upgrade (1)
3-5: Add `--no-install-recommends` to reduce image size and attack surface.

Installing recommended packages increases the image footprint unnecessarily for a test container.
🐳 Proposed fix
```diff
-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
     git python3 python3-pip python3-venv curl sqlite3 \
     && rm -rf /var/lib/apt/lists/*
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile.upgrade` around lines 3 - 5, Update the Dockerfile RUN line to prevent installing recommended packages: modify the apt-get install invocation in the RUN command (the line starting with "apt-get update && apt-get install -y") to include the --no-install-recommends flag so only essential packages (git, python3, python3-pip, python3-venv, curl, sqlite3) are installed and the image size/attack surface is reduced.

test/sanity/phases/resilience.sh (2)
98-113: Config file cleanup may be skipped on early exit.

If an error occurs between copying `old-config.json` (line 103) and the cleanup (line 110), the config file remains. Consider using a `trap` for reliable cleanup.

♻️ Proposed fix using trap

```diff
 # 7. Config backwards compatibility
 echo "  [7/7] Config backwards compat..."
 CONFIG_DIR="${XDG_CONFIG_HOME:-$HOME/.config}/altimate-code"
 mkdir -p "$CONFIG_DIR"
+CONFIG_CLEANUP=""
 if [ -f "$SCRIPT_DIR/fixtures/old-config.json" ]; then
   cp "$SCRIPT_DIR/fixtures/old-config.json" "$CONFIG_DIR/opencode.json"
+  CONFIG_CLEANUP="$CONFIG_DIR/opencode.json"
+  trap 'rm -f "$CONFIG_CLEANUP"' EXIT
   if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
     altimate_run "old-config" "say hello" || true
     assert_not_contains "$(get_output old-config)" "parse error" "old config loads without parse error"
   else
     skip_test "Config backwards compat" "no ANTHROPIC_API_KEY"
   fi
-  rm -f "$CONFIG_DIR/opencode.json"
+  rm -f "$CONFIG_CLEANUP"
+  trap - EXIT
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` around lines 98 - 113, The test can leave CONFIG_DIR/opencode.json behind if the script exits early; after copying "$SCRIPT_DIR/fixtures/old-config.json" to "$CONFIG_DIR/opencode.json" (the cp call that creates the artifact used by altimate_run and get_output), install a trap that always removes "$CONFIG_DIR/opencode.json" (and/or unsets it) on EXIT or ERR so cleanup runs even on failures, set the trap immediately after the cp and before calling altimate_run/get_output, and optionally clear the trap after the explicit rm -f to avoid double actions.
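The guarantee the trap provides can be observed in isolation — a sketch run in a child shell so the early exit is visible; the temp file stands in for the copied config fixture:

```shell
# An EXIT trap fires even when the script bails out early, so cleanup
# always runs regardless of how the script terminates.
out=$(
  bash -c '
    tmp=$(mktemp)
    trap "rm -f \"$tmp\"; echo cleaned-up" EXIT
    echo working
    exit 1          # early exit; the trap still runs
  '
) || true
echo "$out"
```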
88-96: Re-export may differ from original state when `ANTHROPIC_API_KEY` was unset.

If `ANTHROPIC_API_KEY` was originally unset (not just empty), `SAVED_KEY` captures an empty string and line 95 exports an empty variable rather than leaving it unset. This is likely fine for this test context but worth noting.

♻️ More precise state restoration

```diff
 # 6. Graceful on missing provider key
 echo "  [6/7] Missing API key handling..."
-SAVED_KEY="${ANTHROPIC_API_KEY:-}"
+if [ -n "${ANTHROPIC_API_KEY+x}" ]; then
+  SAVED_KEY="$ANTHROPIC_API_KEY"
+  HAD_KEY=1
+else
+  HAD_KEY=0
+fi
 unset ANTHROPIC_API_KEY
 OUTPUT=$(timeout 10 altimate run --max-turns 1 --yolo "hello" 2>&1 || true)
 # Should get a clean error, not an unhandled exception / stack trace
 assert_not_contains "$OUTPUT" "TypeError" "no TypeError on missing key"
 assert_not_contains "$OUTPUT" "Cannot read properties" "no unhandled error on missing key"
-if [ -n "$SAVED_KEY" ]; then
+if [ "$HAD_KEY" = "1" ]; then
   export ANTHROPIC_API_KEY="$SAVED_KEY"
 fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` around lines 88 - 96, The test currently uses SAVED_KEY="${ANTHROPIC_API_KEY:-}" which loses the distinction between an originally unset and an empty ANTHROPIC_API_KEY; update the save/restore logic to track whether ANTHROPIC_API_KEY was set (e.g. save a presence flag like SAVED_KEY_SET or use parameter-expansion check ${ANTHROPIC_API_KEY+x}) before unsetting, then only re-export ANTHROPIC_API_KEY if the flag indicates it was originally set so you restore the previous state exactly; adjust references around SAVED_KEY, ANTHROPIC_API_KEY and the final export in the resilience.sh snippet accordingly.

test/sanity/run.sh (1)
41-43: Separate declaration and assignment to avoid masking `cat` failures.

If `OLD_VERSION_FILE` exists but `cat` fails (permissions, I/O error), the failure is masked and `OLD_VERSION` gets an empty string.

♻️ Proposed fix

```diff
 if [ -f "${OLD_VERSION_FILE:-}" ]; then
-  export OLD_VERSION=$(cat "$OLD_VERSION_FILE")
+  OLD_VERSION=$(cat "$OLD_VERSION_FILE")
+  export OLD_VERSION
 fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/run.sh` around lines 41 - 43, Separate declaration from the assignment and explicitly check the read/cat exit status so a failing cat doesn't get masked; instead of doing a blind export+command substitution, first export or initialize OLD_VERSION, then attempt to read the file (using read -r OLD_VERSION < "$OLD_VERSION_FILE" or run cat "$OLD_VERSION_FILE") and if that command fails, log or exit with an error; reference the OLD_VERSION_FILE/OLD_VERSION variables and ensure the script treats a failing read/cat as an error rather than quietly leaving OLD_VERSION empty.
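The masking this comment describes can be demonstrated directly; the missing path below is illustrative:

```shell
# `export VAR=$(cmd)` returns export's status (0), hiding cmd's failure;
# assigning first preserves the failing exit code.
export COMBINED=$(cat /nonexistent/old-version 2>/dev/null)
rc_combined=$?            # 0: the failure of cat was masked by export
rc_separate=0
SEPARATE=$(cat /nonexistent/old-version 2>/dev/null) || rc_separate=$?
echo "combined export rc=$rc_combined, separate assignment rc=$rc_separate"
```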
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/sanity/Dockerfile`:
- Around line 27-30: The Dockerfile currently overwrites the existing
package.json with only dependencies, which loses metadata used by
postinstall.mjs; change the RUN step that writes package.json to instead merge
the new dependency into the existing
/home/testuser/.altimate-install/package.json (e.g., use jq or a short node
script to read the existing package.json, set
dependencies["@altimateai/altimate-core"]="latest" while preserving other
fields, write back the merged JSON, then run bun install), ensuring
postinstall.mjs can still read the original metadata.
In `@test/sanity/phases/resilience.sh`:
- Line 18: The script currently runs cd "$WORKDIR" without checking success;
update the step that changes to WORKDIR so the failure is detected and handled
(e.g., test the exit status of the cd command and on failure print an error
mentioning WORKDIR and exit with non‑zero status). Locate the cd "$WORKDIR"
invocation in the resilience.sh flow and replace it with a guarded form that
aborts the script when cd fails to prevent running subsequent tests in the wrong
directory.
In `@test/sanity/phases/smoke-tests.sh`:
- Line 26: The cd "$WORKDIR" call (and every other cd in the test functions)
must check for failure and abort rather than continuing silently; change each cd
invocation to detect a non‑zero exit and immediately log an error and exit (for
example, test the return value and call exit 1 or return non‑zero from the
function) so the script stops if the directory change fails and does not
continue executing in the wrong location.
---
Nitpick comments:
In `@test/sanity/Dockerfile`:
- Around line 4-6: Update the Dockerfile RUN line that installs packages (the
apt-get update && apt-get install -y \ git python3 python3-pip python3-venv curl
sqlite3 \ && rm -rf /var/lib/apt/lists/* command) to include
--no-install-recommends in the apt-get install invocation so apt-get install
uses apt-get install -y --no-install-recommends ... thereby reducing image size
and surface area.
In `@test/sanity/Dockerfile.upgrade`:
- Around line 3-5: Update the Dockerfile RUN line to prevent installing
recommended packages: modify the apt-get install invocation in the RUN command
(the line starting with "apt-get update && apt-get install -y") to include the
--no-install-recommends flag so only essential packages (git, python3,
python3-pip, python3-venv, curl, sqlite3) are installed and the image
size/attack surface is reduced.
In `@test/sanity/lib/parallel.sh`:
- Around line 36-38: Remove the redundant defensive cap that reassigns the
variable "parallel" to 6 (the if block checking [ "$parallel" -gt 6 ] and
setting parallel=6); the tiered logic above already enforces this cap, so delete
that conditional block to simplify the script while leaving all earlier logic
that sets "parallel" unchanged.
- Line 5: Split the declaration and assignment for the variable cores to avoid
masking exit codes: declare local cores first (use the existing variable name
"cores"), then assign it with the command substitution that tries nproc, falls
back to sysctl -n hw.ncpu, and finally echoes "2"; update the line containing
the command substitution so it becomes an assignment to the already-declared
local variable, preserving the original fallback behavior.
In `@test/sanity/phases/resilience.sh`:
- Around line 98-113: The test can leave CONFIG_DIR/opencode.json behind if the
script exits early; after copying "$SCRIPT_DIR/fixtures/old-config.json" to
"$CONFIG_DIR/opencode.json" (the cp call that creates the artifact used by
altimate_run and get_output), install a trap that always removes
"$CONFIG_DIR/opencode.json" (and/or unsets it) on EXIT or ERR so cleanup runs
even on failures, set the trap immediately after the cp and before calling
altimate_run/get_output, and optionally clear the trap after the explicit rm -f
to avoid double actions.
- Around line 88-96: The test currently uses SAVED_KEY="${ANTHROPIC_API_KEY:-}"
which loses the distinction between an originally unset and an empty
ANTHROPIC_API_KEY; update the save/restore logic to track whether
ANTHROPIC_API_KEY was set (e.g. save a presence flag like SAVED_KEY_SET or use
parameter-expansion check ${ANTHROPIC_API_KEY+x}) before unsetting, then only
re-export ANTHROPIC_API_KEY if the flag indicates it was originally set so you
restore the previous state exactly; adjust references around SAVED_KEY,
ANTHROPIC_API_KEY and the final export in the resilience.sh snippet accordingly.
In `@test/sanity/phases/smoke-tests.sh`:
- Line 214: The variable num computed on the line "num=$((${#TESTS[@]} -
${#TEST_NAMES[@]} + 1))" is unused and always 1; remove that dead code by
deleting the num assignment and any related unused references, leaving the case
block to operate without it (look for the TESTS and TEST_NAMES arrays and the
case block in smoke-tests.sh to verify nothing else depends on num).
- Around line 186-207: The manual batching loop using idx, PIDS and wait
duplicates existing logic; replace it by building an array of command strings
from the TESTS array (e.g., for each element of TESTS push "${TESTS[$i]}" into
cmds) and call the existing helper run_parallel with the concurrency limit
(run_parallel "$MAX_PARALLEL" "${cmds[@]}") instead of the idx/PIDS loop; ensure
you remove the manual batch loop and rely on run_parallel to manage background
jobs and waiting.
In `@test/sanity/pr-tests/generate.sh`:
- Line 9: Replace the bare redirection that truncates the manifest file by using
an explicit no-op command before the redirection: change the line that currently
uses > "$MANIFEST" to use a no-op redirection like : > "$MANIFEST" (or true >
"$MANIFEST") so the truncation is explicit and shellcheck-friendly; update the
same occurrence that references the MANIFEST variable in generate.sh.
- Around line 57-65: The script can register the same resilience test twice
because both the first grep and the config check call emit_test with
"$SCRIPT_DIR/phases/resilience.sh"; to fix, ensure emit_test for resilience.sh
is only called once by adding a guard: introduce a local flag (e.g.,
resilience_emitted) that's set when emit_test "resilience"
"$SCRIPT_DIR/phases/resilience.sh" is called in the session/compaction/storage
branch and check that flag before calling emit_test again in the config/ branch
(or alternatively merge the two grep conditions into one grep -qE that includes
"config/"); update any references to the changed variable, emit_test invocation,
and resilience.sh to implement the single-emission guard.
In `@test/sanity/run.sh`:
- Around line 41-43: Separate declaration from the assignment and explicitly
check the read/cat exit status so a failing cat doesn't get masked; instead of
doing a blind export+command substitution, first export or initialize
OLD_VERSION, then attempt to read the file (using read -r OLD_VERSION <
"$OLD_VERSION_FILE" or run cat "$OLD_VERSION_FILE") and if that command fails,
log or exit with an error; reference the OLD_VERSION_FILE/OLD_VERSION variables
and ensure the script treats a failing read/cat as an error rather than quietly
leaving OLD_VERSION empty.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2f77a047-9947-47fc-a80a-082e0786e0d8
📒 Files selected for processing (22)
- package.json
- test/sanity/.env.example
- test/sanity/Dockerfile
- test/sanity/Dockerfile.upgrade
- test/sanity/ci-local.sh
- test/sanity/docker-compose.yml
- test/sanity/fixtures/broken-config.json
- test/sanity/fixtures/compaction-session.sql
- test/sanity/fixtures/old-config.json
- test/sanity/fixtures/test.sql
- test/sanity/lib/altimate-run.sh
- test/sanity/lib/assert.sh
- test/sanity/lib/cleanup.sh
- test/sanity/lib/parallel.sh
- test/sanity/phases/resilience.sh
- test/sanity/phases/smoke-tests.sh
- test/sanity/phases/verify-install.sh
- test/sanity/phases/verify-upgrade.sh
- test/sanity/pr-tests/generate.sh
- test/sanity/pr-tests/run-pr-tests.sh
- test/sanity/results/baseline.json
- test/sanity/run.sh
✅ Files skipped from review due to trivial changes (9)
- test/sanity/fixtures/old-config.json
- test/sanity/fixtures/test.sql
- test/sanity/.env.example
- test/sanity/fixtures/compaction-session.sql
- test/sanity/results/baseline.json
- test/sanity/lib/cleanup.sh
- test/sanity/phases/verify-install.sh
- test/sanity/lib/altimate-run.sh
- test/sanity/lib/assert.sh
🚧 Files skipped from review as they are similar to previous changes (6)
- test/sanity/fixtures/broken-config.json
- test/sanity/docker-compose.yml
- test/sanity/pr-tests/run-pr-tests.sh
- package.json
- test/sanity/phases/verify-upgrade.sh
- test/sanity/ci-local.sh
09ee72c to a1bcfff
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (4)
test/sanity/Dockerfile (2)
32-35: ⚠️ Potential issue | 🔴 Critical

Hardcoded architecture will fail on x86_64 hosts.

Line 34 hardcodes `altimate-code-linux-arm64`, which will fail when building or running on x86_64 machines. Use dynamic architecture detection to support both arm64 and x86_64.

Proposed fix using Docker's TARGETARCH

```diff
+ARG TARGETARCH
 # Link binary to PATH and copy skills to ~/.altimate/builtin/
 RUN mkdir -p /home/testuser/.local/bin && \
-    cp /home/testuser/.altimate-install/dist/@altimateai/altimate-code-linux-arm64/bin/altimate /home/testuser/.local/bin/altimate && \
+    ARCH=$([ "$TARGETARCH" = "amd64" ] && echo "x64" || echo "arm64") && \
+    cp /home/testuser/.altimate-install/dist/@altimateai/altimate-code-linux-${ARCH}/bin/altimate /home/testuser/.local/bin/altimate && \
     chmod +x /home/testuser/.local/bin/altimate && \
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile` around lines 32 - 35, The Dockerfile copies a hardcoded arch-specific binary path (/home/testuser/.altimate-install/dist/@altimateai/altimate-code-linux-arm64/bin/altimate), which will fail on x86_64; change the RUN step that copies into /home/testuser/.local/bin (and the referenced altimate binary name) to use Docker build-time architecture detection (e.g., TARGETARCH or similar) to select the correct dist folder dynamically so both arm64 and amd64 are supported; update the COPY/CP source path construction to reference the chosen architecture variable and keep the chmod +x and mkdir logic intact for /home/testuser/.local/bin.
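The arch mapping in the proposed fix can be exercised as plain shell; the amd64 -> x64 suffix is the review's assumption about the dist package naming:

```shell
# Map Docker's TARGETARCH values to the dist folder suffix used in the
# cp path above (amd64 -> x64 is an assumed naming convention).
arch_suffix() {
  if [ "$1" = "amd64" ]; then echo "x64"; else echo "arm64"; fi
}
echo "amd64 -> $(arch_suffix amd64)"
echo "arm64 -> $(arch_suffix arm64)"
```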
27-30: ⚠️ Potential issue | 🟡 Minor

Overwrites package.json, losing original metadata.

Line 29 replaces the package.json copied at line 24 with a minimal stub. If `postinstall.mjs` reads version info or other metadata from package.json, this could cause issues.

Alternative: merge dependencies with jq

```diff
 # Install altimate-core native binding (required at runtime)
 RUN cd /home/testuser/.altimate-install && \
-    echo '{"dependencies":{"@altimateai/altimate-core":"latest"}}' > package.json && \
+    jq '. + {"dependencies": (.dependencies // {} | . + {"@altimateai/altimate-core": "latest"})}' package.json > package.json.tmp && \
+    mv package.json.tmp package.json && \
     bun install 2>&1 | tail -5
```

This requires adding `jq` to the apt-get install list.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile` around lines 27 - 30, The Dockerfile RUN step overwrites the existing package.json at /home/testuser/.altimate-install (the echo '{"dependencies":...}' > package.json) which can strip metadata used by postinstall.mjs; instead modify that RUN to merge or update only the dependencies field—e.g., read the existing package.json and merge in "@altimateai/altimate-core":"latest" (using jq, a short Node one-liner, or npm/yarn package set) so you preserve version, name, and other metadata; target the RUN that performs cd /home/testuser/.altimate-install and replace the overwrite with a merge/update operation on package.json.

test/sanity/phases/smoke-tests.sh (1)
26-26: ⚠️ Potential issue | 🟡 Minor

Add error handling for the `cd` command.

If `mktemp` succeeds but `cd` fails (e.g., permission issues), the script continues in an unexpected directory, causing git init and subsequent operations to run in the wrong location.

🔧 Proposed fix

```diff
 WORKDIR=$(mktemp -d /tmp/sanity-workdir-XXXXXX)
-cd "$WORKDIR"
+cd "$WORKDIR" || { echo "FAIL: cannot cd to $WORKDIR"; exit 1; }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` at line 26, Add explicit error handling after the cd "$WORKDIR" command: verify that the chdir succeeded and abort the script (with an explanatory error message) if it failed, so that a successful mktemp won't be followed by running git init and other operations in the wrong directory; update the script where cd "$WORKDIR" appears to check its exit status and exit non‑zero on failure (or use an equivalent guard like enabling strict mode) so subsequent commands never run in the wrong location.

test/sanity/phases/resilience.sh (1)
18-18: ⚠️ Potential issue | 🟡 Minor

Add error handling for the `cd` command.

If `cd "$WORKDIR"` fails, subsequent git commands and tests run in an unexpected directory, causing misleading failures.

🔧 Proposed fix

```diff
 WORKDIR=$(mktemp -d /tmp/sanity-resilience-XXXXXX)
-cd "$WORKDIR"
+cd "$WORKDIR" || { echo "FAIL: cannot cd to $WORKDIR"; exit 1; }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` at line 18, Ensure the script checks and handles failure of the directory change: after the cd "$WORKDIR" command validate its exit status and abort with a clear error if it fails (e.g., replace or augment cd "$WORKDIR" with a guarded form that prints an error and exits non‑zero). Update the resilience.sh logic around the cd "$WORKDIR" call so subsequent git commands and tests only run when the working directory was successfully changed.
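The guarded-cd pattern from both fixes can be exercised against a path that cannot exist:

```shell
# Guarding cd stops execution instead of silently continuing in the
# wrong directory; the missing path here is illustrative.
enter_workdir() {
  cd "$1" 2>/dev/null || { echo "FAIL: cannot cd to $1"; return 1; }
  echo "now in $PWD"
}
status="entered"
if ! enter_workdir /nonexistent/sanity-workdir; then
  status="aborted"
fi
echo "$status"
```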
🧹 Nitpick comments (9)
test/sanity/lib/assert.sh (1)
8-17: The `$?` in the FAIL message may not reflect the command's exit code.

In `assert_exit_0`, the `$?` on line 14 captures the exit status of the `if` condition evaluation, not the original command's exit code. After `if "$@" >/dev/null 2>&1; then` evaluates to false, `$?` is set to the result of the if-test (which is the command's exit code), but this can be unreliable in some shells.

Suggested fix: capture exit code explicitly

```diff
 assert_exit_0() {
   local desc="$1"; shift
-  if "$@" >/dev/null 2>&1; then
+  local rc=0
+  "$@" >/dev/null 2>&1 || rc=$?
+  if [ "$rc" -eq 0 ]; then
     echo "  PASS: $desc"
     PASS_COUNT=$((PASS_COUNT + 1))
   else
-    echo "  FAIL: $desc (exit code $?)"
+    echo "  FAIL: $desc (exit code $rc)"
     FAIL_COUNT=$((FAIL_COUNT + 1))
   fi
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/assert.sh` around lines 8 - 17, The FAIL message in assert_exit_0 can show an incorrect exit code because `$?` is not captured immediately; modify assert_exit_0 to capture the command's exit status into a local variable (e.g., rc) right after running the command (run the command with "$@" >/dev/null 2>&1; rc=$?) and then use that rc in the FAIL echo and any logic that increments FAIL_COUNT, ensuring you reference the function name assert_exit_0 and the captured variable (rc) when updating output.

test/sanity/Dockerfile (1)
4-6: Consider adding `--no-install-recommends` to reduce image size.

Suggested fix

```diff
-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
     git python3 python3-pip python3-venv curl sqlite3 \
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile` around lines 4 - 6, Update the Dockerfile RUN instruction that calls apt-get install so it uses apt-get install --no-install-recommends -y (i.e., change the RUN line installing git python3 python3-pip python3-venv curl sqlite3 to include --no-install-recommends) while keeping the apt-get update and the trailing rm -rf /var/lib/apt/lists/* intact; locate the RUN command in the Dockerfile containing the apt-get update && apt-get install -y sequence and insert --no-install-recommends into the install command.

test/sanity/lib/parallel.sh (1)
4-20: Consider separating declaration and assignment for SC2155 compliance.

While the current code works due to the `|| fallback` pattern, separating declaration and assignment is a best practice to avoid masking return values in edge cases.

Suggested fix for lines 5 and 10

```diff
 detect_parallelism() {
-  local cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo "2")
+  local cores
+  cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo "2")
   # Try multiple methods for RAM detection
   local ram_gb=""
   # Linux: free
-  ram_gb=$(free -g 2>/dev/null | awk '/Mem:/{print $2}')
+  ram_gb=$(free -g 2>/dev/null | awk '/Mem:/{print $2}') || true
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/parallel.sh` around lines 4 - 20, The function detect_parallelism currently declares and assigns variables inline (e.g., local cores=$(nproc ... ) and local ram_gb="") which can trigger shellcheck SC2155; change to separate declaration and assignment: declare local cores and later assign cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo "2"), and declare local ram_gb then assign to ram_gb in the subsequent detection steps (free, /proc/meminfo awk, sysctl) while preserving the existing fallback logic and final ram_gb="${ram_gb:-8}"; update references inside detect_parallelism to use these separated declarations to avoid masking return values.

test/sanity/Dockerfile.upgrade (1)
3-5: Consider adding `--no-install-recommends` to reduce image size.

This is a minor optimization that avoids pulling in unnecessary recommended packages.

Suggested fix

```diff
-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
     git python3 python3-pip python3-venv curl sqlite3 \
     && rm -rf /var/lib/apt/lists/*
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile.upgrade` around lines 3 - 5, The apt-get install invocation in the RUN command (the line starting with "RUN apt-get update && apt-get install -y \ git python3 python3-pip python3-venv curl sqlite3 \") should include the --no-install-recommends flag to avoid installing unnecessary recommended packages and reduce image size; update that RUN line to add --no-install-recommends immediately after apt-get install -y while keeping the existing package list and the cleanup of /var/lib/apt/lists/*.

test/sanity/run.sh (1)
44-48: Separate declaration and assignment to avoid masking errors (SC2155).

If `cat "$OLD_VERSION_FILE"` fails, the error is masked by the combined `export`. This could lead to silent failures.

Suggested fix

```diff
 # Read old version from file set by Dockerfile.upgrade
 if [ -f "${OLD_VERSION_FILE:-}" ]; then
-  export OLD_VERSION=$(cat "$OLD_VERSION_FILE")
+  OLD_VERSION=$(cat "$OLD_VERSION_FILE")
+  export OLD_VERSION
 fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/run.sh` around lines 44 - 48, Separate declaration and assignment for OLD_VERSION to avoid masking errors: first assign OLD_VERSION with OLD_VERSION=$(cat "$OLD_VERSION_FILE") (checking for failures) then export OLD_VERSION in a separate statement; keep the conditional block that checks [ -f "${OLD_VERSION_FILE:-}" ] and leave the subsequent run_phase call unchanged (referencing OLD_VERSION_FILE, OLD_VERSION, and run_phase to locate the code).

test/sanity/pr-tests/generate.sh (1)
9-9: Add a no-op command for the redirection.

ShellCheck SC2188: A bare redirection `> "$MANIFEST"` without a command can be confusing. Use `:` (colon is a no-op builtin) for clarity.

Suggested fix

```diff
-> "$MANIFEST"
+: > "$MANIFEST"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/pr-tests/generate.sh` at line 9, Replace the bare redirection line containing "> \"$MANIFEST\"" with a no-op builtin redirection to make the intent explicit; specifically, use the colon no-op (:) before the redirection so the shell performs the truncate/create without relying on an empty command. Update the line that references "$MANIFEST" in generate.sh accordingly.

test/sanity/phases/smoke-tests.sh (3)
53-53: Multiplecdcommands lack error handling.Several test functions use
cd "$WORKDIR"without error handling. While tests run in subshells (limiting blast radius), a failingcdcould still produce misleading results.♻️ Proposed fix pattern
test_discover_mcps() { - cd "$WORKDIR" + cd "$WORKDIR" || return 1 altimate_run "discover-mcps" --command discover-and-add-mcps "list"Apply the same
`|| return 1` pattern to `test_sql_analyze`, `test_duckdb`, `test_builder`, `test_analyst`, `test_bad_command`, and `test_discover`.

Also applies to: 75-75, 86-86, 127-127, 138-138, 149-149, 160-160
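The effect of the guard can be exercised in isolation — a hypothetical helper that mirrors the shape of the test functions:

```bash
#!/usr/bin/env bash
# Guarded cd: the function bails out instead of running its body
# in the wrong directory.
run_in_dir() {
  cd "$1" 2>/dev/null || return 1
  pwd > /dev/null        # stands in for the real test body
}

run_in_dir "/definitely-missing-$$"
missing_status=$?        # 1 — the body never ran
run_in_dir /tmp
ok_status=$?             # 0
```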
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` at line 53, Several test functions call cd "$WORKDIR" without checking for failure; update each such invocation in test_sql_analyze, test_duckdb, test_builder, test_analyst, test_bad_command, and test_discover to append a failure guard (e.g., "|| return 1") so the function exits immediately if the cd fails, ensuring subsequent test steps don't run in the wrong directory.
227-248: Unused variable `num` in results loop.

Line 229 computes
`num` but it's never used. This appears to be leftover from an earlier iteration.

♻️ Remove unused variable
```diff
 for name in "${TEST_NAMES[@]}"; do
   result=$(cat "$RESULTS_DIR/$name" 2>/dev/null || echo "MISSING")
-  num=$((${#TESTS[@]} - ${#TEST_NAMES[@]} + 1))
   case "$result" in
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 227 - 248, Remove the unused variable assignment "num=$((${#TESTS[@]} - ${#TEST_NAMES[@]} + 1))" from the results loop in the smoke-tests.sh script; locate the for loop that iterates over TEST_NAMES and the case on result (references: TEST_NAMES, RESULTS_DIR, PASS_COUNT, FAIL_COUNT, SKIP_COUNT) and delete that num calculation line so there are no unused variables left in the loop.
36-49: Dead code: `run_smoke_test` function is never called.

This wrapper function is defined but the test execution loop at lines 211-214 directly invokes the test functions (
`${TESTS[$i]} &`) without using `run_smoke_test`.

Either remove the unused function or integrate it into the test execution:
♻️ Option 1: Remove unused function
```diff
-run_smoke_test() {
-  local num="$1"
-  local name="$2"
-  local result_file="$RESULTS_DIR/$name"
-  shift 2
-
-  echo "  [$num] $name..."
-
-  # Run the test
-  local exit_code=0
-  "$@" || exit_code=$?
-
-  echo "$exit_code" > "$result_file"
-}
-
 # Define all test functions
```
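If the wrapper is kept instead of removed, wiring it into the background-job loop might look like this self-contained sketch (the test names and `RESULTS_DIR` here are stand-ins, not the real harness):

```bash
#!/usr/bin/env bash
# Sketch: route each backgrounded test through run_smoke_test so its
# exit code lands in a per-test result file.
RESULTS_DIR=$(mktemp -d)

run_smoke_test() {
  local num="$1" name="$2"
  local result_file="$RESULTS_DIR/$name"
  shift 2
  local exit_code=0
  "$@" || exit_code=$?
  echo "$exit_code" > "$result_file"
}

test_passes() { true; }
test_fails()  { false; }

run_smoke_test 1 passes test_passes &
run_smoke_test 2 fails  test_fails &
wait
```

Each backgrounded call runs in its own subshell, so the exit-code capture stays isolated per test, matching the original wrapper's semantics.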
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 36 - 49, The run_smoke_test wrapper is dead code; either delete the unused function or update the test-run loop to call it. To integrate, change the loop that currently invokes tests via "${TESTS[$i]} &" to call run_smoke_test with the index, test name, and command: run_smoke_test "$i" "${TESTS[$i]}" "${TESTS_CMD[$i]}" & (or expand the test command/args so run_smoke_test receives the command and args), ensuring RESULTS_DIR is used for result_file and that run_smoke_test's shift behavior consumes the first two params (num and name) before executing the command; alternatively simply remove the run_smoke_test function if you prefer the direct background invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@package.json`:
- Line 22: The package.json script "sanity:upgrade" references a non-existent
override file test/sanity/docker-compose.upgrade.yml which will cause the script
to fail; fix by either adding the missing override file (create
test/sanity/docker-compose.upgrade.yml with the expected service overrides for
the upgrade scenario) or update the "sanity:upgrade" script entry to remove or
correct the reference to docker-compose.upgrade.yml so it only uses existing
files (adjust the script name "sanity:upgrade" in package.json accordingly).
In `@test/sanity/phases/resilience.sh`:
- Line 78: The line uses the shell-only keyword `local` outside a function,
causing a syntax error; replace the top-level declaration `local
comp_output=$(get_output "compaction")` with a plain assignment
`comp_output=$(get_output "compaction")` (or move that assignment inside a
function if you intended function-local scope), referencing the get_output call
and the comp_output variable so the brittle `local` usage is removed.
---
Duplicate comments:
In `@test/sanity/Dockerfile`:
- Around line 32-35: The Dockerfile copies a hardcoded arch-specific binary path
(/home/testuser/.altimate-install/dist/@altimateai/altimate-code-linux-arm64/bin/altimate),
which will fail on x86_64; change the RUN step that copies into
/home/testuser/.local/bin (and the referenced altimate binary name) to use
Docker build-time architecture detection (e.g., TARGETARCH or similar) to select
the correct dist folder dynamically so both arm64 and amd64 are supported;
update the COPY/CP source path construction to reference the chosen architecture
variable and keep the chmod +x and mkdir logic intact for
/home/testuser/.local/bin.
- Around line 27-30: The Dockerfile RUN step overwrites the existing
package.json at /home/testuser/.altimate-install (the echo
'{"dependencies":...}' > package.json) which can strip metadata used by
postinstall.mjs; instead modify that RUN to merge or update only the
dependencies field—e.g., read the existing package.json and merge in
"@altimateai/altimate-core":"latest" (using jq, a short Node one-liner, or
npm/yarn package set) so you preserve version, name, and other metadata; target
the RUN that performs cd /home/testuser/.altimate-install and replace the
overwrite with a merge/update operation on package.json.
In `@test/sanity/phases/resilience.sh`:
- Line 18: Ensure the script checks and handles failure of the directory change:
after the cd "$WORKDIR" command validate its exit status and abort with a clear
error if it fails (e.g., replace or augment cd "$WORKDIR" with a guarded form
that prints an error and exits non‑zero). Update the resilience.sh logic around
the cd "$WORKDIR" call so subsequent git commands and tests only run when the
working directory was successfully changed.
In `@test/sanity/phases/smoke-tests.sh`:
- Line 26: Add explicit error handling after the cd "$WORKDIR" command: verify
that the chdir succeeded and abort the script (with an explanatory error
message) if it failed, so that a successful mktemp won't be followed by running
git init and other operations in the wrong directory; update the script where cd
"$WORKDIR" appears to check its exit status and exit non‑zero on failure (or use
an equivalent guard like enabling strict mode) so subsequent commands never run
in the wrong location.
---
Nitpick comments:
In `@test/sanity/Dockerfile`:
- Around line 4-6: Update the Dockerfile RUN instruction that calls apt-get
install so it uses apt-get install --no-install-recommends -y (i.e., change the
RUN line installing git python3 python3-pip python3-venv curl sqlite3 to include
--no-install-recommends) while keeping the apt-get update and the trailing rm
-rf /var/lib/apt/lists/* intact; locate the RUN command in the Dockerfile
containing the apt-get update && apt-get install -y sequence and insert
--no-install-recommends into the install command.
In `@test/sanity/Dockerfile.upgrade`:
- Around line 3-5: The apt-get install invocation in the RUN command (the line
starting with "RUN apt-get update && apt-get install -y \ git python3
python3-pip python3-venv curl sqlite3 \") should include the
--no-install-recommends flag to avoid installing unnecessary recommended
packages and reduce image size; update that RUN line to add
--no-install-recommends immediately after apt-get install -y while keeping the
existing package list and the cleanup of /var/lib/apt/lists/*.
In `@test/sanity/lib/assert.sh`:
- Around line 8-17: The FAIL message in assert_exit_0 can show an incorrect exit
code because `$?` is not captured immediately; modify assert_exit_0 to capture
the command's exit status into a local variable (e.g., rc) right after running
the command (run the command with "$@" >/dev/null 2>&1; rc=$?) and then use that
rc in the FAIL echo and any logic that increments FAIL_COUNT, ensuring you
reference the function name assert_exit_0 and the captured variable (rc) when
updating output.
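A minimal sketch of the corrected helper described above (simplified — the real assert.sh presumably also updates FAIL_COUNT and its output format may differ):

```bash
#!/usr/bin/env bash
# Capture $? immediately after running the command so the FAIL message
# reports the command's real exit code.
assert_exit_0() {
  "$@" >/dev/null 2>&1
  local rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "FAIL: $* (exit $rc)"
    return 1
  fi
  echo "PASS: $*"
}

pass_msg=$(assert_exit_0 true)
fail_msg=$(assert_exit_0 false) || true
```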
In `@test/sanity/lib/parallel.sh`:
- Around line 4-20: The function detect_parallelism currently declares and
assigns variables inline (e.g., local cores=$(nproc ... ) and local ram_gb="")
which can trigger shellcheck SC2155; change to separate declaration and
assignment: declare local cores and later assign cores=$(nproc 2>/dev/null ||
sysctl -n hw.ncpu 2>/dev/null || echo "2"), and declare local ram_gb then assign
to ram_gb in the subsequent detection steps (free, /proc/meminfo awk, sysctl)
while preserving the existing fallback logic and final ram_gb="${ram_gb:-8}";
update references inside detect_parallelism to use these separated declarations
to avoid masking return values.
In `@test/sanity/phases/smoke-tests.sh`:
- Line 53: Several test functions call cd "$WORKDIR" without checking for
failure; update each such invocation in test_sql_analyze, test_duckdb,
test_builder, test_analyst, test_bad_command, and test_discover to append a
failure guard (e.g., "|| return 1") so the function exits immediately if the cd
fails, ensuring subsequent test steps don't run in the wrong directory.
- Around line 227-248: Remove the unused variable assignment
"num=$((${#TESTS[@]} - ${#TEST_NAMES[@]} + 1))" from the results loop in the
smoke-tests.sh script; locate the for loop that iterates over TEST_NAMES and the
case on result (references: TEST_NAMES, RESULTS_DIR, PASS_COUNT, FAIL_COUNT,
SKIP_COUNT) and delete that num calculation line so there are no unused
variables left in the loop.
- Around line 36-49: The run_smoke_test wrapper is dead code; either delete the
unused function or update the test-run loop to call it. To integrate, change the
loop that currently invokes tests via "${TESTS[$i]} &" to call run_smoke_test
with the index, test name, and command: run_smoke_test "$i" "${TESTS[$i]}"
"${TESTS_CMD[$i]}" & (or expand the test command/args so run_smoke_test receives
the command and args), ensuring RESULTS_DIR is used for result_file and that
run_smoke_test's shift behavior consumes the first two params (num and name)
before executing the command; alternatively simply remove the run_smoke_test
function if you prefer the direct background invocation.
In `@test/sanity/pr-tests/generate.sh`:
- Line 9: Replace the bare redirection line containing "> \"$MANIFEST\"" with a
no-op builtin redirection to make the intent explicit; specifically, use the
colon no-op (:) before the redirection so the shell performs the truncate/create
without relying on an empty command. Update the line that references "$MANIFEST"
in generate.sh accordingly.
In `@test/sanity/run.sh`:
- Around line 44-48: Separate declaration and assignment for OLD_VERSION to
avoid masking errors: first assign OLD_VERSION with OLD_VERSION=$(cat
"$OLD_VERSION_FILE") (checking for failures) then export OLD_VERSION in a
separate statement; keep the conditional block that checks [ -f
"${OLD_VERSION_FILE:-}" ] and leave the subsequent run_phase call unchanged
(referencing OLD_VERSION_FILE, OLD_VERSION, and run_phase to locate the code).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ea7f24e5-b1af-4282-ab4f-cf7311a2a766
📒 Files selected for processing (22)
- package.json
- test/sanity/.env.example
- test/sanity/Dockerfile
- test/sanity/Dockerfile.upgrade
- test/sanity/ci-local.sh
- test/sanity/docker-compose.yml
- test/sanity/fixtures/broken-config.json
- test/sanity/fixtures/compaction-session.sql
- test/sanity/fixtures/old-config.json
- test/sanity/fixtures/test.sql
- test/sanity/lib/altimate-run.sh
- test/sanity/lib/assert.sh
- test/sanity/lib/cleanup.sh
- test/sanity/lib/parallel.sh
- test/sanity/phases/resilience.sh
- test/sanity/phases/smoke-tests.sh
- test/sanity/phases/verify-install.sh
- test/sanity/phases/verify-upgrade.sh
- test/sanity/pr-tests/generate.sh
- test/sanity/pr-tests/run-pr-tests.sh
- test/sanity/results/baseline.json
- test/sanity/run.sh
✅ Files skipped from review due to trivial changes (9)
- test/sanity/fixtures/old-config.json
- test/sanity/.env.example
- test/sanity/results/baseline.json
- test/sanity/fixtures/broken-config.json
- test/sanity/lib/cleanup.sh
- test/sanity/fixtures/compaction-session.sql
- test/sanity/docker-compose.yml
- test/sanity/phases/verify-install.sh
- test/sanity/fixtures/test.sql
🚧 Files skipped from review as they are similar to previous changes (3)
- test/sanity/lib/altimate-run.sh
- test/sanity/ci-local.sh
- test/sanity/phases/verify-upgrade.sh
7700d48 to
c977474
Compare
Actionable comments posted: 4
♻️ Duplicate comments (2)
package.json (1)
22-22:⚠️ Potential issue | 🟡 Minor
`docker-compose.upgrade.yml` still missing.

This script references
`test/sanity/docker-compose.upgrade.yml` which does not exist. The script will fail at runtime. Previously acknowledged as "planned" — consider either adding the file or removing/commenting out this script until it's ready.

```bash
#!/bin/bash
# Verify if docker-compose.upgrade.yml exists
if [ -f "test/sanity/docker-compose.upgrade.yml" ]; then
  echo "✓ docker-compose.upgrade.yml exists"
else
  echo "✗ docker-compose.upgrade.yml NOT FOUND"
  ls -la test/sanity/docker-compose*.yml 2>/dev/null || echo "No docker-compose files found"
fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@package.json` at line 22, The package.json npm script "sanity:upgrade" references a missing docker-compose file (docker-compose.upgrade.yml) which will cause failures; fix by either adding the missing docker-compose.upgrade.yml under the test/sanity test fixtures (ensure it contains the intended upgrade orchestration) or remove/comment/disable the "sanity:upgrade" script from package.json until that file and its contents are implemented; update or run the provided verification shell snippet to confirm the file exists if you add it.

test/sanity/Dockerfile (1)
32-35: ⚠️ Potential issue | 🟠 Major

Hardcoded
`arm64` architecture will fail on x86_64 hosts.

Line 34 still hardcodes
`altimate-code-linux-arm64`. This was flagged in a previous review and marked as "Fixed," but the code remains unchanged. The path will not exist on x86_64 machines (most CI runners and developer workstations).

🔧 Proposed fix using Docker's TARGETARCH
```diff
 # Link binary to PATH and copy skills to ~/.altimate/builtin/
+ARG TARGETARCH
 RUN mkdir -p /home/testuser/.local/bin && \
-    cp /home/testuser/.altimate-install/dist/@altimateai/altimate-code-linux-arm64/bin/altimate /home/testuser/.local/bin/altimate && \
+    ARCH=$([ "$TARGETARCH" = "amd64" ] && echo "x64" || echo "arm64") && \
+    cp /home/testuser/.altimate-install/dist/@altimateai/altimate-code-linux-${ARCH}/bin/altimate /home/testuser/.local/bin/altimate && \
     chmod +x /home/testuser/.local/bin/altimate && \
     mkdir -p /home/testuser/.altimate/builtin && \
     cp -r /home/testuser/.altimate-install/skills/* /home/testuser/.altimate/builtin/
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile` around lines 32 - 35, The Dockerfile still hardcodes the arm64 build directory (dist/@altimateai/altimate-code-linux-arm64) when linking the altimate binary; update the RUN step to use Docker's build-time TARGETARCH instead: add an ARG TARGETARCH (with optional default) at the top of the Dockerfile and replace the hardcoded segment in the cp source path with the variable (e.g., dist/@altimateai/altimate-code-linux-${TARGETARCH}) so the cp in the RUN line copies the correct arch-specific binary into /home/testuser/.local/bin and preserves chmod +x and mkdir -p steps.
🧹 Nitpick comments (8)
test/sanity/Dockerfile.upgrade (1)
3-5: Consider adding `--no-install-recommends` for consistency with best practices.

Same suggestion as the main Dockerfile — reduces image size by excluding recommended packages.
🔧 Proposed fix
```diff
 RUN apt-get update && apt-get install -y \
+    --no-install-recommends \
     git python3 python3-pip python3-venv curl sqlite3 \
     && rm -rf /var/lib/apt/lists/*
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile.upgrade` around lines 3 - 5, Update the Dockerfile RUN installation command that currently uses "apt-get install -y git python3 python3-pip python3-venv curl sqlite3" to include the --no-install-recommends flag so recommended packages are not pulled in; modify the RUN line (the one starting with "RUN apt-get update && apt-get install -y") to add --no-install-recommends immediately after "apt-get install" to keep image size consistent with the main Dockerfile.

test/sanity/Dockerfile (1)
4-6: Consider adding `--no-install-recommends` to reduce image size.

The
`apt-get install` command pulls in recommended packages that aren't strictly required. Adding `--no-install-recommends` can reduce the final image size.

🔧 Proposed fix
```diff
 # System deps (what a real user would have)
 RUN apt-get update && apt-get install -y \
+    --no-install-recommends \
     git python3 python3-pip python3-venv curl sqlite3 \
     && rm -rf /var/lib/apt/lists/*
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile` around lines 4 - 6, Update the Dockerfile RUN apt-get install line to add the --no-install-recommends flag so apt doesn't pull recommended packages (modify the existing RUN line that starts with "apt-get update && apt-get install -y \ git python3 python3-pip python3-venv curl sqlite3 \ && rm -rf /var/lib/apt/lists/*"); keep the existing -y and the rm -rf cleanup so the install behavior and cache cleanup remain unchanged.

test/sanity/phases/smoke-tests.sh (3)
53-53: Several `cd` commands lack error handling.

Some test functions have
`cd "$WORKDIR" || return 1` (lines 64, 101, 116) while others have bare `cd "$WORKDIR"` without error handling. For consistency and safety, all should handle `cd` failures.

🔧 Proposed fix for test_discover_mcps (apply similar pattern to others)
```diff
 test_discover_mcps() {
-  cd "$WORKDIR"
+  cd "$WORKDIR" || return 1
   altimate_run "discover-mcps" --command discover-and-add-mcps "list"
```

Also applies to: 75-75, 86-86, 127-127, 138-138, 149-149, 160-160
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` at line 53, Several places call cd "$WORKDIR" without error handling; update every bare cd "$WORKDIR" (including occurrences around the test functions like test_discover_mcps and the other test functions noted) to bail on failure by appending || return 1 so the test function exits if the chdir fails; search for all occurrences of cd "$WORKDIR" in this script and change them to cd "$WORKDIR" || return 1 so the tests consistently handle cd failures.
36-49: `run_smoke_test` function is defined but never used.

This helper is defined but tests invoke themselves directly (e.g.,
`${TESTS[$i]} &`). Consider removing this dead code or refactoring the tests to use this wrapper.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 36 - 49, The helper function run_smoke_test is dead code; either remove it or refactor existing concurrent test invocations to use it. Locate run_smoke_test and either delete its definition, or replace places that spawn tests directly (e.g., usages like "${TESTS[$i]} &") with calls to run_smoke_test "$i" "$name" "${TESTS[$i]}" (pass an index, a name to create RESULT file, and the command plus args), ensuring you preserve the previous backgrounding/exit-code capture semantics implemented by run_smoke_test.
229-229: Unused variable `num` with incorrect calculation.

The variable
`num` is calculated but never used, and the formula `${#TESTS[@]} - ${#TEST_NAMES[@]} + 1` always evaluates to 1 since both arrays have the same length.

🔧 Proposed fix — remove unused variable
```diff
 for name in "${TEST_NAMES[@]}"; do
   result=$(cat "$RESULTS_DIR/$name" 2>/dev/null || echo "MISSING")
-  num=$((${#TESTS[@]} - ${#TEST_NAMES[@]} + 1))
   case "$result" in
```

🤖 Prompt for AI Agents
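The "always evaluates to 1" claim is easy to verify in isolation with equal-length arrays:

```bash
#!/usr/bin/env bash
# With equal-length arrays, len(TESTS) - len(TEST_NAMES) + 1 is always 1.
TESTS=(test_a test_b test_c)
TEST_NAMES=(a b c)
num=$(( ${#TESTS[@]} - ${#TEST_NAMES[@]} + 1 ))   # 3 - 3 + 1 = 1
```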
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` at line 229, Remove the unused variable assignment "num=$((${#TESTS[@]} - ${#TEST_NAMES[@]} + 1))" since it is never used and the calculation is incorrect (both arrays have same length); delete that line and any related references to "num" in the script (leave TESTS and TEST_NAMES untouched) so the script no longer defines an unused variable.
45-47: Separate declaration and assignment to catch file read errors.

If
`cat "$OLD_VERSION_FILE"` fails, the error is masked and `OLD_VERSION` becomes empty silently. Separating allows proper error handling.

🔧 Proposed fix
```diff
 # Read old version from file set by Dockerfile.upgrade
 if [ -f "${OLD_VERSION_FILE:-}" ]; then
-  export OLD_VERSION=$(cat "$OLD_VERSION_FILE")
+  OLD_VERSION=$(cat "$OLD_VERSION_FILE") || OLD_VERSION="unknown"
+  export OLD_VERSION
 fi
```
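The `|| OLD_VERSION="unknown"` fallback can be exercised standalone — the missing-file path below is hypothetical:

```bash
#!/usr/bin/env bash
# If the version file is unreadable, fall back to "unknown" instead of
# silently exporting an empty OLD_VERSION.
OLD_VERSION_FILE="/no/such/version-file-$$"
OLD_VERSION=$(cat "$OLD_VERSION_FILE" 2>/dev/null) || OLD_VERSION="unknown"
export OLD_VERSION
```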
Verify each finding against the current code and only fix it if needed. In `@test/sanity/run.sh` around lines 45 - 47, The current if-block sets OLD_VERSION by directly exporting the command substitution which can mask read errors; change it to first export/declare OLD_VERSION (or leave unset), then read the file into a variable using cat or read (referencing OLD_VERSION_FILE and OLD_VERSION) and check the command's exit status ($? or use set -e) so a failing read is detected and handled (e.g., log and exit non-zero) instead of silently leaving OLD_VERSION empty.test/sanity/lib/parallel.sh (2)
4-6: Consider separating declaration and assignment to avoid masking return values.

While the fallback
`|| echo "2"` handles failures gracefully, separating declaration and assignment is a shell best practice.

🔧 Proposed fix
```diff
 detect_parallelism() {
-  local cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo "2")
+  local cores
+  cores=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo "2")
```
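The same masking behavior inside functions can be demonstrated directly — `false` stands in for a failing `nproc`:

```bash
#!/usr/bin/env bash
# SC2155 inside a function: `local v=$(cmd)` returns local's status, not cmd's.
combined() { local v=$(false); echo "$?"; }     # prints 0 — failure masked
separate() { local v; v=$(false); echo "$?"; }  # prints 1 — failure visible
combined_rc=$(combined)
separate_rc=$(separate)
```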
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/parallel.sh` around lines 4 - 6, In detect_parallelism(), avoid combining declaration and assignment which can mask return values; instead declare the variable first (local cores) and then assign cores via command substitution using the same fallback chain (nproc || sysctl -n hw.ncpu || echo "2") so failures from the commands aren’t masked and the function still returns the correct status; update references in detect_parallelism to use the separated variable.
35-38: Redundant cap — parallelism already maxes at 6.

The
`if [ "$parallel" -gt 6 ]` check can never be true since the preceding conditionals only assign values 2, 3, 4, or 6. This block can be removed or kept as defensive code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/lib/parallel.sh` around lines 35 - 38, Remove the redundant defensive cap in test/sanity/lib/parallel.sh: delete the if block that checks `if [ "$parallel" -gt 6 ]; then parallel=6; fi` since `parallel` is already constrained to 2,3,4, or 6 earlier; if you prefer to keep a guard, replace the block with a brief comment explaining it's defensive and why it can never trigger. Ensure references to the `parallel` variable remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/sanity/Dockerfile.upgrade`:
- Around line 31-33: The Docker upgrade test fails because Dockerfile.upgrade's
COPY --chown=testuser dist/package.tgz expects dist/package.tgz in the build
context but CI doesn't produce it; fix by either adding a
docker-compose.upgrade.yml that builds the monorepo and produces
dist/package.tgz into the build context before docker build, or modify the
sanity:upgrade pipeline to run a pack step (e.g., run bun pm pack or npm pack in
the packages/opencode dist directory) to create dist/package.tgz prior to
invoking docker build, or change Dockerfile.upgrade to mirror the standard
Dockerfile approach (copying packages/opencode/dist/@altimateai/... instead of a
tarball) so npm run sanity:upgrade completes the COPY step successfully.
In `@test/sanity/phases/smoke-tests.sh`:
- Around line 230-247: The counters PASS_COUNT, FAIL_COUNT, and SKIP_COUNT are
never initialized before being incremented in the result-processing case block
(the case handling PASS/FAIL/SKIP/*), so initialize them to 0 before the loop
that processes results; add simple assignments PASS_COUNT=0 FAIL_COUNT=0
SKIP_COUNT=0 placed prior to the code that iterates over results so the
arithmetic increments succeed.
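The initialization fix above, sketched with dummy results (the real loop reads per-test result files; this uses an inline list):

```bash
#!/usr/bin/env bash
# Initialize counters to 0 before the loop so the totals are well-defined.
PASS_COUNT=0; FAIL_COUNT=0; SKIP_COUNT=0
for result in PASS FAIL PASS SKIP; do
  case "$result" in
    PASS) PASS_COUNT=$((PASS_COUNT + 1)) ;;
    FAIL) FAIL_COUNT=$((FAIL_COUNT + 1)) ;;
    SKIP) SKIP_COUNT=$((SKIP_COUNT + 1)) ;;
  esac
done
```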
In `@test/sanity/pr-tests/generate.sh`:
- Around line 31-40: The pipelines that feed the while-read loops in generate.sh
can cause the script to exit under set -euo pipefail because grep returns
non-zero when there are no matches; update the two grep pipelines that precede
the "while read -r f; do" blocks (the ones producing command-*.txt and
skills/.*/SKILL.md) so they never produce a non-zero exit (for example append
"|| true" to the grep command or use a grep -q check and guard the loop),
leaving the emit_test invocations (emit_test "command-${cmd}" ... and emit_test
"skill-${skill}" ...) unchanged.
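A minimal reproduction of that failure mode and the `|| true` guard, under the same `set -euo pipefail` the sanity scripts use (the file list is a stand-in for the real changed-files input):

```bash
#!/usr/bin/env bash
# grep exits 1 when nothing matches; under `set -euo pipefail` that would
# abort the script unless the pipeline is guarded with `|| true`.
set -euo pipefail

changed_files=$'command-a.txt\nREADME.md'
matches=$(printf '%s\n' "$changed_files" | grep 'skills/' || true)
survived="yes"   # reached only because the guard absorbed grep's exit 1
```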
- Around line 32-39: The emitted commands build shell strings from untrusted
filename-derived variables cmd and skill (used with emit_test) and can lead to
shell injection; fix by validating/sanitizing those values before use: when
parsing each changed file, constrain cmd and skill to a safe whitelist pattern
(e.g. reject unless they match /^[A-Za-z0-9._-]+$/) and skip/warn on mismatches,
or explicitly shell-escape them (e.g. using printf '%q' or equivalent) when
constructing the command string passed to emit_test; update the blocks that set
cmd and skill and all emit_test calls so they use the sanitized/escaped variable
names.
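A sketch of the whitelist approach (the helper name and pattern are illustrative, not the actual generate.sh code; a `case` glob avoids the quoting pitfalls of `[[ =~ ]]`):

```bash
#!/usr/bin/env bash
# Accept only [A-Za-z0-9._-] tokens derived from filenames; reject the rest.
sanitize_token() {
  case "$1" in
    *[!A-Za-z0-9._-]*|'') return 1 ;;   # any other character (or empty) fails
    *) printf '%s\n' "$1" ;;
  esac
}

good=$(sanitize_token "my-skill_v1.2") || good="rejected"
bad=$(sanitize_token 'evil; rm -rf /') || bad="rejected"
```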
---
Duplicate comments:
In `@package.json`:
- Line 22: The package.json npm script "sanity:upgrade" references a missing
docker-compose file (docker-compose.upgrade.yml) which will cause failures; fix
by either adding the missing docker-compose.upgrade.yml under the test/sanity
test fixtures (ensure it contains the intended upgrade orchestration) or
remove/comment/disable the "sanity:upgrade" script from package.json until that
file and its contents are implemented; update or run the provided verification
shell snippet to confirm the file exists if you add it.
In `@test/sanity/Dockerfile`:
- Around line 32-35: The Dockerfile still hardcodes the arm64 build directory
(dist/@altimateai/altimate-code-linux-arm64) when linking the altimate binary;
update the RUN step to use Docker's build-time TARGETARCH instead: add an ARG
TARGETARCH (with optional default) at the top of the Dockerfile and replace the
hardcoded segment in the cp source path with the variable (e.g.,
dist/@altimateai/altimate-code-linux-${TARGETARCH}) so the cp in the RUN line
copies the correct arch-specific binary into /home/testuser/.local/bin and
preserves chmod +x and mkdir -p steps.
---
Nitpick comments:
In `@test/sanity/Dockerfile`:
- Around line 4-6: Update the Dockerfile RUN apt-get install line to add the
--no-install-recommends flag so apt doesn't pull recommended packages (modify
the existing RUN line that starts with "apt-get update && apt-get install -y \
git python3 python3-pip python3-venv curl sqlite3 \ && rm -rf
/var/lib/apt/lists/*"); keep the existing -y and the rm -rf cleanup so the
install behavior and cache cleanup remain unchanged.
In `@test/sanity/Dockerfile.upgrade`:
- Around line 3-5: Update the Dockerfile RUN installation command that currently
uses "apt-get install -y git python3 python3-pip python3-venv curl sqlite3" to
include the --no-install-recommends flag so recommended packages are not pulled
in; modify the RUN line (the one starting with "RUN apt-get update && apt-get
install -y") to add --no-install-recommends immediately after "apt-get install"
to keep image size consistent with the main Dockerfile.
In `@test/sanity/lib/parallel.sh`:
- Around line 4-6: In detect_parallelism(), avoid combining declaration and
assignment which can mask return values; instead declare the variable first
(local cores) and then assign cores via command substitution using the same
fallback chain (nproc || sysctl -n hw.ncpu || echo "2") so failures from the
commands aren’t masked and the function still returns the correct status; update
references in detect_parallelism to use the separated variable.
- Around line 35-38: Remove the redundant defensive cap in
test/sanity/lib/parallel.sh: delete the if block that checks `if [ "$parallel"
-gt 6 ]; then parallel=6; fi` since `parallel` is already constrained to 2,3,4,
or 6 earlier; if you prefer to keep a guard, replace the block with a brief
comment explaining it's defensive and why it can never trigger. Ensure
references to the `parallel` variable remain unchanged.
In `@test/sanity/phases/smoke-tests.sh`:
- Line 53: Several places call cd "$WORKDIR" without error handling; update
every bare cd "$WORKDIR" (including occurrences around the test functions like
test_discover_mcps and the other test functions noted) to bail on failure by
appending || return 1 so the test function exits if the chdir fails; search for
all occurrences of cd "$WORKDIR" in this script and change them to cd "$WORKDIR"
|| return 1 so the tests consistently handle cd failures.
- Around line 36-49: The helper function run_smoke_test is dead code; either
remove it or refactor existing concurrent test invocations to use it. Locate
run_smoke_test and either delete its definition, or replace places that spawn
tests directly (e.g., usages like "${TESTS[$i]} &") with calls to run_smoke_test
"$i" "$name" "${TESTS[$i]}" (pass an index, a name to create RESULT file, and
the command plus args), ensuring you preserve the previous
backgrounding/exit-code capture semantics implemented by run_smoke_test.
- Line 229: Remove the unused variable assignment "num=$((${#TESTS[@]} -
${#TEST_NAMES[@]} + 1))" since it is never used and the calculation is incorrect
(both arrays have same length); delete that line and any related references to
"num" in the script (leave TESTS and TEST_NAMES untouched) so the script no
longer defines an unused variable.
In `@test/sanity/run.sh`:
- Around line 45-47: The current if-block sets OLD_VERSION by directly exporting
the command substitution which can mask read errors; change it to first
export/declare OLD_VERSION (or leave unset), then read the file into a variable
using cat or read (referencing OLD_VERSION_FILE and OLD_VERSION) and check the
command's exit status ($? or use set -e) so a failing read is detected and
handled (e.g., log and exit non-zero) instead of silently leaving OLD_VERSION
empty.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f84b0cb7-5843-4150-ae71-977a15dafd19
📒 Files selected for processing (22)
- package.json
- test/sanity/.env.example
- test/sanity/Dockerfile
- test/sanity/Dockerfile.upgrade
- test/sanity/ci-local.sh
- test/sanity/docker-compose.yml
- test/sanity/fixtures/broken-config.json
- test/sanity/fixtures/compaction-session.sql
- test/sanity/fixtures/old-config.json
- test/sanity/fixtures/test.sql
- test/sanity/lib/altimate-run.sh
- test/sanity/lib/assert.sh
- test/sanity/lib/cleanup.sh
- test/sanity/lib/parallel.sh
- test/sanity/phases/resilience.sh
- test/sanity/phases/smoke-tests.sh
- test/sanity/phases/verify-install.sh
- test/sanity/phases/verify-upgrade.sh
- test/sanity/pr-tests/generate.sh
- test/sanity/pr-tests/run-pr-tests.sh
- test/sanity/results/baseline.json
- test/sanity/run.sh
✅ Files skipped from review due to trivial changes (13)
- test/sanity/fixtures/old-config.json
- test/sanity/.env.example
- test/sanity/lib/cleanup.sh
- test/sanity/results/baseline.json
- test/sanity/fixtures/broken-config.json
- test/sanity/fixtures/test.sql
- test/sanity/fixtures/compaction-session.sql
- test/sanity/pr-tests/run-pr-tests.sh
- test/sanity/phases/verify-install.sh
- test/sanity/docker-compose.yml
- test/sanity/lib/altimate-run.sh
- test/sanity/lib/assert.sh
- test/sanity/phases/resilience.sh
🚧 Files skipped from review as they are similar to previous changes (2)
- test/sanity/phases/verify-upgrade.sh
- test/sanity/ci-local.sh
299dbf8 to
1335dbd
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (4)
test/sanity/Dockerfile (1)
34-37: ⚠️ Potential issue | 🔴 Critical

Map Docker's `amd64` to the published `x64` artifact name.

On BuildKit/Compose x86_64 builds, `TARGETARCH` is `amd64`, but the copied dist folder here is `altimate-code-linux-x64`. The current path resolves to `...-amd64/...`, so the image build breaks on the most common host architecture.

🐛 Proposed fix
```diff
 ARG TARGETARCH
-RUN ARCH="${TARGETARCH:-$(uname -m | sed 's/x86_64/x64/' | sed 's/aarch64/arm64/')}" && \
+RUN case "${TARGETARCH:-$(uname -m)}" in \
+      amd64|x86_64) ARCH=x64 ;; \
+      arm64|aarch64) ARCH=arm64 ;; \
+      *) echo "Unsupported arch: ${TARGETARCH:-$(uname -m)}" >&2; exit 1 ;; \
+    esac && \
     mkdir -p /home/testuser/.local/bin && \
     cp /home/testuser/.altimate-install/dist/@altimateai/altimate-code-linux-${ARCH}/bin/altimate /home/testuser/.local/bin/altimate && \
```

Run this to confirm the mismatch between the Docker arg and the actual artifact names:
```bash
#!/bin/bash
set -euo pipefail
echo "Relevant Dockerfile lines:"
sed -n '34,37p' test/sanity/Dockerfile
echo
echo "Available Linux artifacts:"
find packages/opencode/dist/@altimateai -maxdepth 1 -type d -name 'altimate-code-linux-*' -exec basename {} \; | sort
```

Expected result: you should see
`altimate-code-linux-x64` / `altimate-code-linux-arm64`, not `altimate-code-linux-amd64`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/Dockerfile` around lines 34-37: The build fails because TARGETARCH can be "amd64" but artifacts are named "x64"; update the ARCH calculation to map amd64->x64 (and keep aarch64->arm64) so the cp path points to the correct dist folder: change the ARCH assignment that currently uses sed substitutions to also replace "amd64" with "x64" (e.g., add a `sed 's/amd64/x64/'` step) so the cp in the RUN line copies from `@altimateai/altimate-code-linux-${ARCH}/bin/altimate` to `/home/testuser/.local/bin/altimate` successfully.

test/sanity/phases/smoke-tests.sh (1)
37-152: ⚠️ Potential issue | 🟠 Major

These helper verdicts can still record `PASS` for failed runs.

Most of these functions ignore the runner's exit status and only reject one or two substrings from the captured output. That means provider/auth/config failures — or even a missing capture file via `local output=$(get_output ...)` — can still fall through to `PASS`, so the suite does not reliably prove command resolution, DuckDB, agent routing, or graceful bad-command handling. Require the command to succeed (or require the expected handled error for `test_bad_command`) before writing `PASS`, and treat a missing output capture as `FAIL`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/smoke-tests.sh` around lines 37-152: Many tests (test_discover_mcps, test_configure_claude, test_sql_analyze, test_duckdb, test_postgres, test_snowflake, test_builder, test_analyst, test_bad_command, test_discover) currently ignore the runner exit status and treat missing/empty captures as PASS; ensure each test first checks the exit code from altimate_run or altimate_run_with_turns and that get_output returned non-empty content before writing PASS, treating missing output as FAIL, and for test_bad_command require the command to produce the expected handled error (not just absence of "unhandled") before writing PASS; update each test to fail early if the runner exit code is non-zero (except where a specific handled error is expected), and only write PASS when both the exit code and output validation pass (use the existing RESULTS_DIR paths and functions altimate_run, altimate_run_with_turns, and get_output to implement the checks).

test/sanity/phases/resilience.sh (1)
93-133: ⚠️ Potential issue | 🟠 Major

These resilience checks don't actually assert the advertised outcome.
The missing-key, old-config, and broken-config branches only verify that one crash string is absent. Empty output or an unrelated error will still count as success, so this does not really prove “clean missing-key error” or “backward-compatible config load.” Assert the expected user-facing message and/or exit path before marking these cases as passed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/phases/resilience.sh` around lines 93-133: The tests under "Missing API key handling", "Config backwards compat..." and "Broken config handling..." only assert absence of certain crash strings (using assert_not_contains) which allows empty or wrong outputs to pass; update the checks to assert the expected user-facing behavior instead: for the missing-key branch capture OUTPUT from the altimate run and assert that it contains the explicit help/error message (e.g., "API key not set" or the CLI's intended missing-key text) and that the command exited with a non-zero status; for the old-config branch after altimate_run use get_output and assert it contains the successful load message or expected prompt/response indicating backward compatibility instead of merely asserting no "parse error"; for the broken-config branch similarly assert the CLI prints a clear config parse failure message (or falls back to defaults) and check exit status; replace or augment the existing assert_not_contains calls with positive assertions (assert_contains and exit-code checks) referencing the test blocks by their echo markers and the helper calls altimate_run, get_output, OUTPUT, and assert_not_contains so the changes are easy to locate.

test/sanity/pr-tests/generate.sh (1)
31-33: ⚠️ Potential issue | 🔴 Critical

Sanitize path-derived values before emitting eval'd commands.

Line 32 and Line 38 derive `cmd`/`skill` from changed filenames, and Line 33/Line 39 inject them into command strings. Since these manifest commands are executed downstream, this is a shell-injection vector.

🔒 Suggested hardening
```diff
-echo "$changed" | grep "command/template/.*\.txt" 2>/dev/null | while read -r f; do
+echo "$changed" | grep -E "command/template/.*\.txt" 2>/dev/null | while read -r f; do
   cmd=$(basename "$f" .txt)
-  emit_test "command-${cmd}" "altimate run --max-turns 1 --yolo --command ${cmd} test"
+  [[ "$cmd" =~ ^[A-Za-z0-9._-]+$ ]] || { echo "  Skipping unsafe command name: $cmd"; continue; }
+  escaped_cmd=$(printf '%q' "$cmd")
+  emit_test "command-${cmd}" "altimate run --max-turns 1 --yolo --command ${escaped_cmd} test"
 done
-echo "$changed" | grep "skills/.*/SKILL.md" 2>/dev/null | while read -r f; do
+echo "$changed" | grep -E "skills/.*/SKILL.md" 2>/dev/null | while read -r f; do
   skill=$(basename "$(dirname "$f")")
-  emit_test "skill-${skill}" "ls ~/.altimate/builtin/${skill}/SKILL.md"
+  [[ "$skill" =~ ^[A-Za-z0-9._-]+$ ]] || { echo "  Skipping unsafe skill name: $skill"; continue; }
+  escaped_skill=$(printf '%q' "$skill")
+  emit_test "skill-${skill}" "ls ~/.altimate/builtin/${escaped_skill}/SKILL.md"
 done
```

Also applies to: 37-39
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/pr-tests/generate.sh` around lines 31 - 33, The script constructs command strings using untrusted basename-derived values (variables cmd and skill) and then emits/evaluates them, creating a shell-injection risk; update the emit path by validating/sanitizing those derived values before use: in the block that sets cmd=$(basename "$f" .txt) and skill=$(basename "$f" .txt) enforce a strict whitelist regex (e.g., only [a-z0-9_-]+) and skip or fail if the value doesn't match, or alternatively escape them safely (e.g., use shell-safe quoting like printf '%q' or pass them as separate arguments instead of interpolating into a single eval string) so emit_test is never given unsanitized input.
🧹 Nitpick comments (1)
test/sanity/pr-tests/generate.sh (1)
9-9: Use an explicit no-op for manifest truncation.

Line 9 works in Bash, but ShellCheck SC2188 flags it. Using `: > "$MANIFEST"` is clearer and avoids lint noise.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/sanity/pr-tests/generate.sh` at line 9, Replace the bare redirection used to truncate the manifest file (currently using > "$MANIFEST") with an explicit no-op redirection so ShellCheck SC2188 is satisfied; change the truncation invocation to use a no-op command redirect (i.e., use ':' as the command) targeting the MANIFEST variable to clearly indicate intended truncation in the generate.sh script.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/sanity/ci-local.sh`:
- Around line 59-80: The health-check loop sets HEALTHY but doesn't gate
downstream steps; add an explicit failure check after the loop to exit (or call
run_step failure behavior) if HEALTHY is less than 4 or if the prior docker
compose up failed. Concretely, after the loop that computes HEALTHY, test the
variable (e.g., if [ "$HEALTHY" -lt 4 ]; then echo "Services not healthy"; exit
1; fi) so that the subsequent run_step "Driver E2E (local)" / "Driver E2E
(docker)" and "Sanity Suite (Docker)" are not executed when docker compose up or
the health check failed; ensure the check references the HEALTHY variable and
the same docker compose invocation used earlier so failures short-circuit
immediately.
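The gate described above could be sketched as follows. The service names and the threshold of 4 follow the review comment; the real script would derive health from `docker compose ps`, which is simulated here so the sketch runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the health gate for ci-local.sh.
HEALTHY=0
for svc in postgres mysql mssql redshift; do
  status="healthy"   # stand-in for querying `docker compose ps` for $svc
  [ "$status" = "healthy" ] && HEALTHY=$((HEALTHY + 1))
done

# Fail fast so E2E steps never run against half-started databases.
if [ "$HEALTHY" -lt 4 ]; then
  echo "Services not healthy ($HEALTHY/4); aborting" >&2
  exit 1
fi
echo "All $HEALTHY services healthy; running E2E steps"
```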
In `@test/sanity/Dockerfile`:
- Around line 27-30: The Docker RUN currently pipes "bun install 2>&1 | tail -5"
which hides bun's exit status; change it so the install's exit code is checked
and you still print the last lines: run bun install with output redirected to a
logfile and use && to tail it (for example run bun install >
/tmp/bun-install.log 2>&1 && tail -5 /tmp/bun-install.log), keeping the working
directory /home/testuser/.altimate-install and the existing package.json
creation; update the RUN line that references
"/home/testuser/.altimate-install", "bun install" and "tail -5" accordingly.
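A shell equivalent of the suggested RUN line is sketched below. `bun install` is replaced by a stand-in function so the sketch is runnable without bun; the log path matches the one proposed in the comment.

```shell
#!/usr/bin/env bash
# Stand-in for `bun install` (illustrative only).
install_cmd() { echo "installed 42 packages"; }

if install_cmd > /tmp/bun-install.log 2>&1; then
  # Success: show only the tail, as before.
  tail -5 /tmp/bun-install.log
else
  rc=$?
  # Failure is no longer hidden behind a pipe to tail.
  echo "install failed (exit $rc); full log follows" >&2
  cat /tmp/bun-install.log >&2
  exit "$rc"
fi
```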
In `@test/sanity/pr-tests/generate.sh`:
- Around line 11-16: The current assignment to changed falls back to echo "" on
git errors, causing silent skip of PR tests; instead detect git command failures
and fail-open. Replace the single-line fallback with explicit checks: run the
primary git diff command, if it fails run the secondary, and if that also fails
print a clear error (including git's stderr) and exit non-zero; ensure the
variable changed is set only on success and the script exits 1 when neither git
diff command succeeds so tests are not silently skipped (refer to the changed
variable and the two git diff invocations).
---
Duplicate comments:
In `@test/sanity/Dockerfile`:
- Around line 34-37: The build fails because TARGETARCH can be "amd64" but
artifacts are named "x64"; update the ARCH calculation to map amd64->x64 (and
keep aarch64->arm64) so the cp path points to the correct dist folder: change
the ARCH assignment that currently uses sed substitutions to also replace
"amd64" with "x64" (e.g., add a sed 's/amd64/x64/' step) so the cp in the RUN
line copies from `@altimateai/altimate-code-linux-`${ARCH}/bin/altimate to
/home/testuser/.local/bin/altimate successfully.
In `@test/sanity/phases/resilience.sh`:
- Around line 93-133: The tests under "Missing API key handling", "Config
backwards compat..." and "Broken config handling..." only assert absence of
certain crash strings (using assert_not_contains) which allows empty or wrong
outputs to pass; update the checks to assert the expected user-facing behavior
instead: for the missing-key branch capture OUTPUT from the altimate run and
assert that it contains the explicit help/error message (e.g., "API key not set"
or the CLI's intended missing-key text) and that the command exited with a
non-zero status; for the old-config branch after altimate_run use get_output and
assert it contains the successful load message or expected prompt/response
indicating backward compatibility instead of merely asserting no "parse error";
for the broken-config branch similarly assert the CLI prints a clear config
parse failure message (or falls back to defaults) and check exit status; replace
or augment the existing assert_not_contains calls with positive assertions
(assert_contains and exit-code checks) referencing the test blocks by their echo
markers and the helper calls altimate_run, get_output, OUTPUT, and
assert_not_contains so the changes are easy to locate.
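The positive-assertion pattern suggested above can be sketched like this. `assert_contains` mirrors the helper name used by the suite (its body here is an assumption), and "API key not set" is the example message from the comment, not a verified CLI string.

```shell
#!/usr/bin/env bash
# Hypothetical substring assertion, mirroring the suite's helper name.
assert_contains() {
  case "$1" in
    *"$2"*) return 0 ;;
    *) echo "ASSERT FAIL: expected '$2' in output" >&2; return 1 ;;
  esac
}

check_missing_key() {
  local output rc=0
  output=$("$@" 2>&1) || rc=$?
  # Require BOTH a non-zero exit and the explicit user-facing message,
  # so empty output or an unrelated error can no longer pass.
  if [ "$rc" -eq 0 ]; then
    echo "ASSERT FAIL: expected non-zero exit, got 0" >&2
    return 1
  fi
  assert_contains "$output" "API key not set"
}
```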
In `@test/sanity/phases/smoke-tests.sh`:
- Around line 37-152: Many tests (test_discover_mcps, test_configure_claude,
test_sql_analyze, test_duckdb, test_postgres, test_snowflake, test_builder,
test_analyst, test_bad_command, test_discover) currently ignore the runner exit
status and treat missing/empty captures as PASS; ensure each test first checks
the exit code from altimate_run or altimate_run_with_turns and that get_output
returned non-empty content before writing PASS, treating missing output as FAIL,
and for test_bad_command require the command to produce the expected handled
error (not just absence of "unhandled") before writing PASS; update each test to
fail early if the runner exit code is non-zero (except where a specific handled
error is expected), and only write PASS when both the exit code and output
validation pass (use the existing RESULTS_DIR paths and functions altimate_run,
altimate_run_with_turns, and get_output to implement the checks).
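The validation pattern described above could look like the following sketch. `RESULTS_DIR` and the PASS/FAIL result files mirror the suite's conventions; the stand-in commands used in place of `altimate_run` are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical validated test runner: PASS only on exit 0 plus expected,
# non-empty output; never on a silent failure.
RESULTS_DIR="${RESULTS_DIR:-/tmp/sanity-results}"
mkdir -p "$RESULTS_DIR"

record() { echo "$2" > "$RESULTS_DIR/$1.result"; }

run_validated_test() {
  local name="$1" expect="$2"
  shift 2
  local out rc=0
  out=$("$@" 2>&1) || rc=$?
  if [ "$rc" -ne 0 ]; then record "$name" "FAIL (exit $rc)"; return 1; fi
  if [ -z "$out" ]; then record "$name" "FAIL (empty output)"; return 1; fi
  case "$out" in
    *"$expect"*) record "$name" "PASS" ;;
    *) record "$name" "FAIL (missing '$expect')"; return 1 ;;
  esac
}
```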
In `@test/sanity/pr-tests/generate.sh`:
- Around line 31-33: The script constructs command strings using untrusted
basename-derived values (variables cmd and skill) and then emits/evaluates them,
creating a shell-injection risk; update the emit path by validating/sanitizing
those derived values before use: in the block that sets cmd=$(basename "$f"
.txt) and skill=$(basename "$f" .txt) enforce a strict whitelist regex (e.g.,
only [a-z0-9_-]+) and skip or fail if the value doesn't match, or alternatively
escape them safely (e.g., use shell-safe quoting like printf '%q' or pass them
as separate arguments instead of interpolating into a single eval string) so
emit_test is never given unsanitized input.
---
Nitpick comments:
In `@test/sanity/pr-tests/generate.sh`:
- Line 9: Replace the bare redirection used to truncate the manifest file
(currently using > "$MANIFEST") with an explicit no-op redirection so ShellCheck
SC2188 is satisfied; change the truncation invocation to use a no-op command
redirect (i.e., use ':' as the command) targeting the MANIFEST variable to
clearly indicate intended truncation in the generate.sh script.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b587e827-6537-49cd-bfaf-82a71a842613
📒 Files selected for processing (23)
- package.json
- test/sanity/.env.example
- test/sanity/Dockerfile
- test/sanity/Dockerfile.upgrade
- test/sanity/ci-local.sh
- test/sanity/docker-compose.upgrade.yml
- test/sanity/docker-compose.yml
- test/sanity/fixtures/broken-config.json
- test/sanity/fixtures/compaction-session.sql
- test/sanity/fixtures/old-config.json
- test/sanity/fixtures/test.sql
- test/sanity/lib/altimate-run.sh
- test/sanity/lib/assert.sh
- test/sanity/lib/cleanup.sh
- test/sanity/lib/parallel.sh
- test/sanity/phases/resilience.sh
- test/sanity/phases/smoke-tests.sh
- test/sanity/phases/verify-install.sh
- test/sanity/phases/verify-upgrade.sh
- test/sanity/pr-tests/generate.sh
- test/sanity/pr-tests/run-pr-tests.sh
- test/sanity/results/baseline.json
- test/sanity/run.sh
✅ Files skipped from review due to trivial changes (10)
- test/sanity/docker-compose.upgrade.yml
- test/sanity/fixtures/compaction-session.sql
- test/sanity/.env.example
- test/sanity/fixtures/old-config.json
- test/sanity/pr-tests/run-pr-tests.sh
- test/sanity/lib/cleanup.sh
- test/sanity/fixtures/broken-config.json
- test/sanity/results/baseline.json
- test/sanity/docker-compose.yml
- test/sanity/lib/altimate-run.sh
🚧 Files skipped from review as they are similar to previous changes (4)
- test/sanity/fixtures/test.sql
- test/sanity/phases/verify-upgrade.sh
- test/sanity/phases/verify-install.sh
- test/sanity/lib/assert.sh
1335dbd to
ab5d0f9
Compare
Docker-based "new user simulator" that tests the shipped npm artifact:
- Phase 1: verify-install (9 checks: binary, skills, napi, dbt, git)
- Phase 2: smoke-tests (10 E2E tests via altimate run, parallelized)
- Phase 3: resilience (8 tests: SQLite, WAL, sessions, compaction, config)
- PR-aware test generation (git diff → targeted tests)
- Local CI pipeline (bun run ci → typecheck + tests + markers)
- Machine-aware parallelism (2-6 concurrent based on cores/RAM)

27 tests, all passing in ~2:48 on 20-core machine.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ab5d0f9 to
a0c4e72
Compare
Summary
Docker-based "new user simulator" that tests the shipped npm artifact end-to-end — catches bugs that existing unit tests miss.
- `altimate run --yolo --format json`: commands resolve, sql_analyze works, DuckDB driver, agent routing, graceful error handling
- `git diff` analysis generates targeted tests for what changed
- `bun run ci` runs typecheck + tests + markers in one command

27 tests, all passing in ~2:48 on a 20-core machine.
How to use
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit