Skip to content

Error passing#3

Open
SethPaul wants to merge 8 commits into
masterfrom
error_passing
Open

Error passing#3
SethPaul wants to merge 8 commits into
masterfrom
error_passing

Conversation

@SethPaul
Copy link
Copy Markdown
Contributor

@SethPaul SethPaul commented Nov 27, 2025

Summary by CodeRabbit

  • New Features

    • Added support for loading cluster definitions from CloudFormation stacks, with automatic fallback to YAML configuration.
  • Bug Fixes

    • Improved error messages to include the failed command and output for better debugging.

✏️ Tip: You can customize this high-level summary in your review settings.

- Add 60-second TTL caching for AWS API responses to reduce API calls
  during multi-task deployments
- Parallelize independent AWS API calls (stack outputs, instances,
  routing stack) using threads for faster loading
- Add SSM Parameter Store support for production RTP channel hosts
  with fallback to static defaults
- Add routing stack discovery for dev boxes to get frontend_host
- Add clear_cache! method for testing and forced refresh

Session: claude -r session
When running tasks remotely, pass the stack name via CLUSTER_DEF_STACK
environment variable so the remote node can load cluster definition
from CloudFormation even if the instance isn't tagged with the CF
stack name.

Uses 'rvmsudo env VAR=value' to pass the env var through sudo.

Session: claude -r session
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Nov 27, 2025

Walkthrough

The PR introduces CloudFormation-based cluster definition loading via a new Zapt::ClusterDefCF module with cached, parallelized AWS data retrieval and IMDSv2 support. Integrates into runtask with YAML fallback. Updates system utilities for output capture control, redirects logger to STDERR, and enhances error messaging. Bumps version to 1.1.0.

Changes

Cohort / File(s) Summary
CloudFormation Cluster Definition Loader
lib/zapt/cluster_def_cf.rb
New module providing cached, parallelized loading of cluster definitions from CloudFormation stacks, EC2/ASG instances, SSM parameters, and routing stacks. Includes IMDSv2-backed metadata access, stack/output/parameter retrieval, node construction, and RTP host parsing. Exposes load_from_stack(stack_name, region) and load_for_current_instance entry points.
Cluster Configuration Integration
lib/zapt/commands/runtask.rb
Adds load_cluster_config(cluster) method supporting CloudFormation-first loading with YAML fallback, controlled by CLUSTER_DEF_SOURCE environment variable. Propagates CLUSTER_DEF_STACK to remote task execution context.
Logging and Core Utilities
lib/zapt/logger.rb, lib/zapt/system.rb, lib/zapt/tasks/shell_task.rb
Logger redirects output to STDERR. system method adds capture parameter and combines stdout/stderr in output. shell_task enhances error messages with command details and truncated output (>2000 chars).
Version Bump
lib/zapt/version.rb
Updates VERSION constant from '1.0.1' to '1.1.0'.

Sequence Diagram

sequenceDiagram
    participant App as Application
    participant CF as ClusterDefCF
    participant IMDS as IMDSv2
    participant AWS as AWS APIs
    participant Cache as Local Cache
    
    App->>CF: load_from_stack(stack_name, region)
    CF->>Cache: cached_fetch(key)
    alt Cache Hit
        Cache-->>CF: cluster_def
    else Cache Miss
        CF->>IMDS: get_instance_metadata
        IMDS-->>CF: instance metadata
        par Parallel Fetches
            CF->>AWS: get_stack_outputs
            CF->>AWS: get_stack_instances
            CF->>AWS: find_routing_stack
        and
            AWS-->>CF: outputs
            AWS-->>CF: instances
            AWS-->>CF: routing stack
        end
        CF->>AWS: get_asg_instances
        AWS-->>CF: asg instances
        CF->>AWS: get_prod_rtp_channel_hosts
        AWS-->>CF: rtp hosts
        CF->>CF: build_node_from_instance
        CF->>Cache: store(key, cluster_def, ttl)
        Cache-->>CF: cached
    end
    CF-->>App: cluster_def hash
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Attention areas:
    • CloudFormation cluster loader (cluster_def_cf.rb): AWS API interactions, caching mechanism with TTL, parallelized fetching logic, IMDSv2 implementation, error handling for missing stacks/instances
    • Integration in runtask: logic for determining cluster source (CloudFormation vs YAML), environment variable propagation, fallback handling
    • System utility changes: capture parameter interaction with command execution, stderr redirection in logging
    • Error handling changes in shell_task: output truncation logic and message construction

Possibly related PRs

  • ruby 3 change set #2: Updates Zapt::VERSION constant; both PRs modify the same version constant but to different values.

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Error passing' is vague and generic, not clearly describing the main changes in the changeset. Use a more descriptive title that captures the primary change, such as 'Add CloudFormation-based cluster definition loader' or 'Introduce error handling improvements and cluster config loading'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch error_passing

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/zapt/system.rb (1)

8-8: Unused parameter capture.

The capture parameter is added to the method signature but is never used in the method body. If this parameter is intended to control output capture behavior, the implementation appears incomplete.

-  def system cmd, user=nil, host=nil, pem=nil, quiet=false, ignore_failure=false, capture=true
+  def system cmd, user=nil, host=nil, pem=nil, quiet=false, ignore_failure=false

Or implement the intended behavior based on the capture flag.

🧹 Nitpick comments (2)
lib/zapt/commands/runtask.rb (1)

121-130: Consider adding trailing comma for consistency.

Per RuboCop style convention, adding a trailing comma to multiline arrays makes diffs cleaner when adding new items.

         possible_paths = [
           cluster_name,
           "#{cluster_name}.yaml",
           File.join(Dir.pwd, 'cluster_defs', "#{cluster_name}.yaml"),
-          File.join(Dir.pwd, '..', 'common', 'cluster_defs', "#{cluster_name}.yaml")
+          File.join(Dir.pwd, '..', 'common', 'cluster_defs', "#{cluster_name}.yaml"),
         ]
lib/zapt/cluster_def_cf.rb (1)

457-468: JSON detection heuristic is acceptable but fragile.

Using start_with?('{') to detect JSON is a reasonable heuristic, and the rescue JSON::ParserError provides a safe fallback. However, this could be slightly more robust.

Consider a more defensive approach:

       # Try to parse as JSON first (handles complex A/B structures)
-      if rtp_input.start_with?('{')
+      if rtp_input.strip.start_with?('{')
         begin
           parsed = JSON.parse(rtp_input)
+          if parsed.is_a?(Hash) && (parsed.key?('A') || parsed.key?('B'))
             return {
               A: Array(parsed['A']),
               B: Array(parsed['B'])
             }
+          end
         rescue JSON::ParserError
           # Fall through to simple URL handling
         end
       end
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9d7b40b and d4abef3.

📒 Files selected for processing (6)
  • lib/zapt/cluster_def_cf.rb (1 hunks)
  • lib/zapt/commands/runtask.rb (3 hunks)
  • lib/zapt/logger.rb (1 hunks)
  • lib/zapt/system.rb (1 hunks)
  • lib/zapt/tasks/shell_task.rb (1 hunks)
  • lib/zapt/version.rb (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
lib/zapt/commands/runtask.rb (1)
lib/zapt/cluster_def_cf.rb (1)
  • load_from_stack (303-401)
🪛 Brakeman (7.1.1)
lib/zapt/system.rb

[medium] 26-26: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)

lib/zapt/cluster_def_cf.rb

[medium] 230-230: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)


[medium] 264-264: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)


[medium] 66-66: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)


[medium] 87-87: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)


[medium] 126-126: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)


[medium] 147-147: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)


[medium] 251-251: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)


[medium] 105-105: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)

🪛 RuboCop (1.81.7)
lib/zapt/commands/runtask.rb

[convention] 7-8: Extra empty line detected at module body beginning.

(Layout/EmptyLinesAroundModuleBody)


[convention] 94-138: Assignment Branch Condition size for load_cluster_config is too high. [<10, 22, 16> 28.98/23]

(Metrics/AbcSize)


[convention] 94-138: Cyclomatic complexity for load_cluster_config is too high. [13/7]

(Metrics/CyclomaticComplexity)


[convention] 94-138: Perceived complexity for load_cluster_config is too high. [14/8]

(Metrics/PerceivedComplexity)


[convention] 127-127: Put a comma after the last item of a multiline array.

(Style/TrailingCommaInArrayLiteral)


[convention] 137-137: Prefer using YAML.safe_load over YAML.load.

(Security/YAMLLoad)

lib/zapt/cluster_def_cf.rb

[convention] 17-482: Module has too many lines. [328/100]

(Metrics/ModuleLength)


[convention] 27-27: Put a comma after the last item of a multiline array.

(Style/TrailingCommaInArrayLiteral)


[convention] 32-32: Put a comma after the last item of a multiline array.

(Style/TrailingCommaInArrayLiteral)


[convention] 101-119: Cyclomatic complexity for get_stack_outputs is too high. [9/7]

(Metrics/CyclomaticComplexity)


[convention] 101-119: Perceived complexity for get_stack_outputs is too high. [9/8]

(Metrics/PerceivedComplexity)


[convention] 122-140: Cyclomatic complexity for get_stack_parameters is too high. [9/7]

(Metrics/CyclomaticComplexity)


[convention] 122-140: Perceived complexity for get_stack_parameters is too high. [9/8]

(Metrics/PerceivedComplexity)


[convention] 184-184: Put a comma after the last item of a multiline array.

(Style/TrailingCommaInArrayLiteral)


[convention] 246-277: Assignment Branch Condition size for get_asg_instances is too high. [<14, 23, 18> 32.39/23]

(Metrics/AbcSize)


[convention] 246-277: Cyclomatic complexity for get_asg_instances is too high. [17/7]

(Metrics/CyclomaticComplexity)


[convention] 246-277: Perceived complexity for get_asg_instances is too high. [17/8]

(Metrics/PerceivedComplexity)


[convention] 303-401: Assignment Branch Condition size for load_from_stack is too high. [<24, 63, 36> 76.43/23]

(Metrics/AbcSize)


[convention] 303-401: Cyclomatic complexity for load_from_stack is too high. [35/7]

(Metrics/CyclomaticComplexity)


[convention] 303-401: Perceived complexity for load_from_stack is too high. [36/8]

(Metrics/PerceivedComplexity)


[convention] 406-442: Assignment Branch Condition size for load_for_current_instance is too high. [<9, 22, 9> 25.42/23]

(Metrics/AbcSize)


[convention] 406-442: Cyclomatic complexity for load_for_current_instance is too high. [9/7]

(Metrics/CyclomaticComplexity)


[convention] 406-442: Perceived complexity for load_for_current_instance is too high. [9/8]

(Metrics/PerceivedComplexity)


[convention] 445-480: Cyclomatic complexity for parse_rtp_hosts is too high. [10/7]

(Metrics/CyclomaticComplexity)


[convention] 445-480: Perceived complexity for parse_rtp_hosts is too high. [10/8]

(Metrics/PerceivedComplexity)

🔇 Additional comments (9)
lib/zapt/version.rb (1)

4-4: LGTM!

Minor version bump to 1.1.0 is appropriate for the new CloudFormation-based cluster loading feature and enhanced error handling.

lib/zapt/logger.rb (1)

6-6: LGTM!

Redirecting logs to STDERR is the correct approach for CLI tools. This cleanly separates log messages from program output, enabling proper output capture without log pollution.

lib/zapt/tasks/shell_task.rb (1)

27-39: LGTM!

Enhanced error handling that includes truncated command output is valuable for debugging failed commands. The truncation strategy (preserving the trailing 2000 characters) is appropriate since errors typically appear at the end of output.

lib/zapt/system.rb (1)

25-26: LGTM!

Capturing stderr alongside stdout (2>&1) ensures error messages are included in the output, which supports the enhanced error reporting in shell_task.rb.

Regarding the Brakeman command injection warning: this is expected behavior for a task runner that needs to execute arbitrary shell commands. The risk is mitigated by the tool's design purpose.

lib/zapt/commands/runtask.rb (3)

31-44: LGTM!

The cluster config loading flow and stack name extraction logic is well-structured. Extracting the base name for both YAML files and stack names ensures consistent behavior.


52-62: LGTM!

The environment variable propagation using rvmsudo env is the correct approach for passing CLUSTER_DEF_STACK through sudo to remote tasks.


106-118: LGTM!

The CloudFormation-first with YAML fallback approach is cleanly implemented. The error handling appropriately re-raises in strict cloudformation mode while gracefully falling back in auto mode.

lib/zapt/cluster_def_cf.rb (2)

258-259: Potential nil dereference with safe navigation chain.

The chained safe navigation &.select { ... }&.map { ... } could still result in nil if asg['Instances'] is nil. However, line 259 correctly handles this with instance_ids.nil? || instance_ids.empty?.

The nil handling is correct. Good defensive coding.


406-442: LGTM!

The auto-detection logic is well-structured with clear error messages for each failure case. The method correctly identifies the instance, retrieves its stack, loads the cluster definition, and sets this_node.

Comment on lines +1 to +14
# CloudFormation-based cluster definition loader for zapt
# This module provides functions to load cluster definitions from CloudFormation
# stack outputs and EC2 instance tags instead of YAML files.
#
# Usage:
# Set CLUSTER_DEF_SOURCE=cloudformation to use CF-based loading
# Or call Zapt::ClusterDefCF.load_from_stack(stack_name) directly
#
# Features:
# - Caches AWS API responses for 60 seconds to reduce API calls
# - Parallelizes independent AWS API calls for faster loading
# - Loads production RTP hosts from SSM Parameter Store (with fallback)
#
require 'json'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Consider adding require 'shellwords' for safe command construction.

Given the multiple shell command constructions throughout this module, adding Shellwords at the top would enable proper escaping of all interpolated values.

 require 'json'
+require 'shellwords'

 module Zapt
   module ClusterDefCF

This should be combined with escaping all interpolated values in shell commands as noted in other comments.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# CloudFormation-based cluster definition loader for zapt
# This module provides functions to load cluster definitions from CloudFormation
# stack outputs and EC2 instance tags instead of YAML files.
#
# Usage:
# Set CLUSTER_DEF_SOURCE=cloudformation to use CF-based loading
# Or call Zapt::ClusterDefCF.load_from_stack(stack_name) directly
#
# Features:
# - Caches AWS API responses for 60 seconds to reduce API calls
# - Parallelizes independent AWS API calls for faster loading
# - Loads production RTP hosts from SSM Parameter Store (with fallback)
#
require 'json'
# CloudFormation-based cluster definition loader for zapt
# This module provides functions to load cluster definitions from CloudFormation
# stack outputs and EC2 instance tags instead of YAML files.
#
# Usage:
# Set CLUSTER_DEF_SOURCE=cloudformation to use CF-based loading
# Or call Zapt::ClusterDefCF.load_from_stack(stack_name) directly
#
# Features:
# - Caches AWS API responses for 60 seconds to reduce API calls
# - Parallelizes independent AWS API calls for faster loading
# - Loads production RTP hosts from SSM Parameter Store (with fallback)
#
require 'json'
require 'shellwords'
🤖 Prompt for AI Agents
In lib/zapt/cluster_def_cf.rb around lines 1 to 14, the file builds shell
commands with interpolated values but does not require or use Shellwords for
safe escaping; add require 'shellwords' near the top of the file and update all
places that interpolate external values into shell commands to wrap those values
with Shellwords.escape (or use Shellwords.join for arrays) so every dynamic
token is properly escaped before being passed to the shell.

Comment on lines +18 to +20
# Cache for AWS API responses to reduce API calls during multi-task deployments
@cache = {}
@cache_ttl = 60 # seconds
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Thread safety concern with cache initialization.

The @cache and @cache_ttl class instance variables are initialized at module load time, but cached_fetch also does @cache ||= {} on line 44. With the parallel threads in load_from_stack, concurrent access to @cache without synchronization could cause race conditions.

Consider using a Mutex to synchronize cache access:

     @cache = {}
     @cache_ttl = 60 # seconds
+    @cache_mutex = Mutex.new

+    # ... in cached_fetch method:
+    def cached_fetch(cache_key, &block)
+      @cache_mutex.synchronize do
+        @cache ||= {}
+        if @cache[cache_key] && (Time.now - @cache[cache_key][:time]) < (@cache_ttl || 60)
+          return @cache[cache_key][:data]
+        end
+        data = block.call
+        @cache[cache_key] = { data: data, time: Time.now }
+        data
+      end
+    end

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In lib/zapt/cluster_def_cf.rb around lines 18 to 20 and referencing cached_fetch
at line 44, the class-level @cache and @cache_ttl are not protected against
concurrent access leading to race conditions; add a Mutex (e.g., @cache_mutex)
initialized at load time alongside @cache and @cache_ttl, and then wrap all
reads/writes in cached_fetch (and any other methods touching @cache) inside
mutex.synchronize blocks so initialization, TTL checks, and cache updates are
atomic; remove or avoid relying on `@cache ||= {}` in cached_fetch and use the
mutex-protected instance to ensure consistent thread-safe behavior.

Comment on lines +59 to +68
def get_instance_metadata(path)
cached_fetch("metadata:#{path}") do
# Get token for IMDSv2
token = `curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600" 2>/dev/null`.strip
next nil if token.empty?

# Use token to get metadata
result = `curl -s -H "X-aws-ec2-metadata-token: #{token}" "http://169.254.169.254/latest/meta-data/#{path}" 2>/dev/null`.strip
result.empty? ? nil : result
end
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential command injection via metadata token interpolation.

While the token comes from the EC2 metadata service (trusted), and path is internally controlled, interpolating the token directly into a shell command could be problematic if the metadata service ever returns unexpected characters.

Consider using Shellwords.escape for defense in depth:

+require 'shellwords'

 def get_instance_metadata(path)
   cached_fetch("metadata:#{path}") do
     # Get token for IMDSv2
     token = `curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600" 2>/dev/null`.strip
     next nil if token.empty?

     # Use token to get metadata
-    result = `curl -s -H "X-aws-ec2-metadata-token: #{token}" "http://169.254.169.254/latest/meta-data/#{path}" 2>/dev/null`.strip
+    result = `curl -s -H "X-aws-ec2-metadata-token: #{Shellwords.escape(token)}" "http://169.254.169.254/latest/meta-data/#{Shellwords.escape(path)}" 2>/dev/null`.strip
     result.empty? ? nil : result
   end
 end
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_instance_metadata(path)
cached_fetch("metadata:#{path}") do
# Get token for IMDSv2
token = `curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600" 2>/dev/null`.strip
next nil if token.empty?
# Use token to get metadata
result = `curl -s -H "X-aws-ec2-metadata-token: #{token}" "http://169.254.169.254/latest/meta-data/#{path}" 2>/dev/null`.strip
result.empty? ? nil : result
end
require 'shellwords'
def get_instance_metadata(path)
cached_fetch("metadata:#{path}") do
# Get token for IMDSv2
token = `curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600" 2>/dev/null`.strip
next nil if token.empty?
# Use token to get metadata
result = `curl -s -H "X-aws-ec2-metadata-token: #{Shellwords.escape(token)}" "http://169.254.169.254/latest/meta-data/#{Shellwords.escape(path)}" 2>/dev/null`.strip
result.empty? ? nil : result
end
end
🧰 Tools
🪛 Brakeman (7.1.1)

[medium] 66-66: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)

🤖 Prompt for AI Agents
In lib/zapt/cluster_def_cf.rb around lines 59–68, the code interpolates the
IMDSv2 token and path directly into backticked shell commands, risking command
injection; replace the current backtick shell invocations with a safe approach:
either escape user-supplied values using Shellwords.escape for both token and
path before inserting into the command, or—preferably—avoid the shell entirely
by calling an API that executes the HTTP requests without a shell (e.g.,
Open3.capture3 or a Net::HTTP/URI client) and use the returned strings,
preserving the existing logic for empty-token and empty-result handling.

Comment on lines +83 to +98
def get_instance_tags(instance_id, region = nil)
region ||= get_current_region
cached_fetch("tags:#{instance_id}:#{region}") do
cmd = "aws ec2 describe-tags --filters \"Name=resource-id,Values=#{instance_id}\" --region #{region} --output json 2>/dev/null"
result = `#{cmd}`
next {} if result.empty?

tags = {}
JSON.parse(result)['Tags'].each do |tag|
tags[tag['Key']] = tag['Value']
end
tags
end
rescue JSON::ParserError
{}
end
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Command injection vulnerability in AWS CLI calls.

The instance_id and region parameters are interpolated directly into shell commands without escaping. If these values come from untrusted sources (e.g., malformed EC2 tags), this could allow command injection.

Use Shellwords.escape to sanitize all shell arguments:

+require 'shellwords'

 def get_instance_tags(instance_id, region = nil)
   region ||= get_current_region
   cached_fetch("tags:#{instance_id}:#{region}") do
-    cmd = "aws ec2 describe-tags --filters \"Name=resource-id,Values=#{instance_id}\" --region #{region} --output json 2>/dev/null"
+    cmd = "aws ec2 describe-tags --filters \"Name=resource-id,Values=#{Shellwords.escape(instance_id)}\" --region #{Shellwords.escape(region)} --output json 2>/dev/null"
     result = `#{cmd}`
     # ...
   end
 end

Apply the same pattern to get_stack_outputs (line 104), get_stack_parameters (line 125), get_ssm_parameter (line 146), get_stack_instances (line 229), and get_asg_instances (lines 250, 263).

Alternatively, consider using the AWS SDK for Ruby (aws-sdk-ec2, aws-sdk-cloudformation, etc.) instead of shelling out, which would eliminate injection risks entirely.

🧰 Tools
🪛 Brakeman (7.1.1)

[medium] 87-87: Possible command injection
Type: Command Injection
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/command_injection/

(Command Injection)

🤖 Prompt for AI Agents
lib/zapt/cluster_def_cf.rb lines 83-98: the AWS CLI invocation interpolates
instance_id and region directly into a shell command; require 'shellwords' at
the top of the file and wrap all shell-inserted values with Shellwords.escape
(e.g., escape instance_id and region) before building the command, and apply the
same escaping pattern to get_stack_outputs (≈line 104), get_stack_parameters
(≈line 125), get_ssm_parameter (≈line 146), get_stack_instances (≈line 229), and
get_asg_instances (≈lines 250, 263); alternatively replace shelling out with the
appropriate aws-sdk-* clients to eliminate shell invocation entirely.

Comment on lines +212 to +216
shared_comms = "https://comms-dev1.wootmath.com/"

if params['CreateComms'] == 'true' && params['Identifier'] && params['PrivateHostedZoneName']
vm_comms = "https://comm-#{params['Identifier']}.#{params['PrivateHostedZoneName']}"
info[:rtp_channel_hosts] = { A: [shared_comms], B: [vm_comms] }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Inconsistent URL formatting (trailing slash).

shared_comms has a trailing slash while vm_comms does not. This inconsistency could cause issues if downstream code expects a specific format.

-      shared_comms = "https://comms-dev1.wootmath.com/"
+      shared_comms = "https://comms-dev1.wootmath.com"

       if params['CreateComms'] == 'true' && params['Identifier'] && params['PrivateHostedZoneName']
         vm_comms = "https://comm-#{params['Identifier']}.#{params['PrivateHostedZoneName']}"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
shared_comms = "https://comms-dev1.wootmath.com/"
if params['CreateComms'] == 'true' && params['Identifier'] && params['PrivateHostedZoneName']
vm_comms = "https://comm-#{params['Identifier']}.#{params['PrivateHostedZoneName']}"
info[:rtp_channel_hosts] = { A: [shared_comms], B: [vm_comms] }
shared_comms = "https://comms-dev1.wootmath.com"
if params['CreateComms'] == 'true' && params['Identifier'] && params['PrivateHostedZoneName']
vm_comms = "https://comm-#{params['Identifier']}.#{params['PrivateHostedZoneName']}"
info[:rtp_channel_hosts] = { A: [shared_comms], B: [vm_comms] }
🤖 Prompt for AI Agents
In lib/zapt/cluster_def_cf.rb around lines 212 to 216, the two comms URLs are
formatted inconsistently (shared_comms ends with a trailing slash while vm_comms
does not); normalize them so downstream code gets a consistent format — either
remove the trailing slash from shared_comms or ensure both variables are
stripped of trailing slashes (e.g., rstrip('/') or equivalent) when assigning,
or consistently append a trailing slash to vm_comms; apply the same
normalization wherever these values are set or consumed.

Comment on lines +313 to +321
threads = []
threads << Thread.new { outputs = get_stack_outputs(stack_name, region) }
threads << Thread.new { instances = get_stack_instances(stack_name, region) }
threads << Thread.new { routing_stack = find_routing_stack(stack_name, region) }
threads.each(&:join)

if outputs.empty?
raise Zapt::Error.new("CloudFormation stack '#{stack_name}' not found or has no outputs")
end
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Unhandled exceptions in parallel threads.

If any thread raises an exception, it's silently swallowed and the corresponding variable remains at its initial value. For example, if get_stack_outputs raises, outputs stays nil, and outputs.empty? on line 319 will raise NoMethodError.

Propagate exceptions from threads:

       threads = []
       threads << Thread.new { outputs = get_stack_outputs(stack_name, region) }
       threads << Thread.new { instances = get_stack_instances(stack_name, region) }
       threads << Thread.new { routing_stack = find_routing_stack(stack_name, region) }
-      threads.each(&:join)
+      threads.each do |t|
+        t.join
+        raise t.value if t.status.nil? && t.value.is_a?(Exception)
+      end

-      if outputs.empty?
+      if outputs.nil? || outputs.empty?
         raise Zapt::Error.new("CloudFormation stack '#{stack_name}' not found or has no outputs")
       end

Alternatively, use Thread.abort_on_exception = true within the block or use Concurrent::Future from the concurrent-ruby gem for better thread management.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In lib/zapt/cluster_def_cf.rb around lines 313 to 321, exceptions raised inside
the three Thread.new blocks are being swallowed so variables (outputs,
instances, routing_stack) can remain nil and later cause unrelated
NoMethodError; change the pattern so exceptions inside threads are propagated to
the main thread — either set Thread.abort_on_exception = true temporarily, or
better: capture each thread in an array and after spawning call thread.join then
thread.value (or use threads.map(&:value)) to re-raise any thread exceptions in
the main thread, and assign variables from those thread results rather than
relying on mutated outer-scope variables.

end

$logger.info "Loading cluster from YAML: #{yaml_path}" if $logger
YAML::load(IO.read(yaml_path))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use YAML.safe_load to prevent arbitrary code execution.

YAML.load can execute arbitrary Ruby code via deserialization attacks if the YAML file contains malicious !ruby/object tags. Use YAML.safe_load with permitted classes and symbols.

-      YAML::load(IO.read(yaml_path))
+      YAML.safe_load(IO.read(yaml_path), permitted_classes: [], permitted_symbols: [], aliases: true, symbolize_names: true)

Note: If the YAML files use symbol keys (:name: syntax) rather than string keys converted via symbolize_names, you may need to adjust the parsing or the YAML file format.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
YAML::load(IO.read(yaml_path))
YAML.safe_load(IO.read(yaml_path), permitted_classes: [], permitted_symbols: [], aliases: true, symbolize_names: true)
🧰 Tools
🪛 RuboCop (1.81.7)

[convention] 137-137: Prefer using YAML.safe_load over YAML.load.

(Security/YAMLLoad)

🤖 Prompt for AI Agents
In lib/zapt/commands/runtask.rb around line 137, replace the unsafe
YAML.load(IO.read(yaml_path)) with YAML.safe_load, passing an explicit list of
permitted classes and symbols (or an empty list if none are needed) and
disabling unsafe aliases if not required; also ensure keys are normalized (e.g.,
convert symbol keys to strings or use transform to symbolize keys explicitly) so
downstream code expecting symbol/string keys still works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant