
feat(deps-scan): add dependency scanning, callgraph support and deps Docker images #23

Open
matiasdaloia wants to merge 5 commits into main from feature/mdaloia/dependency-scanning

Conversation

@matiasdaloia
Contributor

@matiasdaloia matiasdaloia commented Mar 3, 2026

  • New Features

    • Dependency scanning for Go, Java, Python, Rust with call‑chain tracing and attribution
    • CLI flags: --scan-dependencies, --export-callgraph (JSON/DOT/Text) and related options
    • New multi‑arch "Deps" Docker images (amd64/arm64) providing full language toolchains
    • Parallel/cached dependency scanning and exportable call‑graph outputs; interim report format bumped to 1.3
  • Documentation

    • New Dependency Scanning guide, updated Docker usage (Deps image) and OUTPUT_FORMATS examples

How to test

Prerequisites

  • Pull this branch and build locally:
    git pull
    make build
    ./target/crypto-finder version
  • Ensure API access is configured:
    export SCANOSS_API_KEY="<key>"
    # or: ./target/crypto-finder configure --api-key "<key>"
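
A quick pre-flight check can save a confusing API error later. This is a sketch; the check_api_key helper is illustrative and not part of crypto-finder, it only confirms the environment variable from the step above is visible to the shell:

```shell
# check_api_key reports whether the SCANOSS API key is exported in this shell.
# (Illustrative helper, not a crypto-finder command.)
check_api_key() {
  if [ -n "${SCANOSS_API_KEY:-}" ]; then
    echo "configured"
  else
    echo "missing"
  fi
}
check_api_key
```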

1) Local binary validation (primary)

Run against each test repository provided for validation.

REPO=/absolute/path/to/test-repo
OUT=/tmp/crypto-finder-po
mkdir -p "$OUT"

# Baseline
./target/crypto-finder scan --output "$OUT/baseline.json" "$REPO"

# Dependency scan (reachable only)
./target/crypto-finder scan --scan-dependencies --output "$OUT/deps-reachable.json" "$REPO"

# Dependency scan (include unreachable)
./target/crypto-finder scan --scan-dependencies --dep-include-unreachable --output "$OUT/deps-all.json" "$REPO"

2) Docker validation (build + execution)

make docker-build
make docker-build-deps
make docker-test

Run dependency scan with deps image:

mkdir -p "$OUT/docker"
docker run --rm \
  -v "$REPO":/workspace/code:ro \
  -v "$OUT/docker":/workspace/output \
  -e SCANOSS_API_KEY="$SCANOSS_API_KEY" \
  ghcr.io/scanoss/crypto-finder:latest-deps \
  scan --scan-dependencies --output /workspace/output/deps.json /workspace/code

Optional unreachable run:

docker run --rm \
  -v "$REPO":/workspace/code:ro \
  -v "$OUT/docker":/workspace/output \
  -e SCANOSS_API_KEY="$SCANOSS_API_KEY" \
  ghcr.io/scanoss/crypto-finder:latest-deps \
  scan --scan-dependencies --dep-include-unreachable --output /workspace/output/deps-all.json /workspace/code

What to expect

  • Dependency-scan outputs are JSON version: "1.3".
  • Findings may contain:
    • source: "direct" for user code.
    • source: "dependency" for dependency-attributed findings.
  • Dependency findings include dependency_info (module, version, function).
  • Reachable findings include call_chains.
  • deps-all.json should contain at least as many findings as deps-reachable.json (never fewer).

Quick checks:

jq '.version' "$OUT/deps-reachable.json"
jq '[.findings[].cryptographic_assets[]?.source] | group_by(.) | map({source: .[0], count: length})' "$OUT/deps-reachable.json"
jq '[.findings[].cryptographic_assets[]?] | length' "$OUT/deps-reachable.json"
jq '[.findings[].cryptographic_assets[]?] | length' "$OUT/deps-all.json"
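
The "never fewer" expectation can also be checked automatically. A minimal sketch: the two sample files below are illustrative stand-ins built from the field names in "What to expect"; in a real run, point count_assets at "$OUT/deps-reachable.json" and "$OUT/deps-all.json" instead:

```shell
# Cheap jq-free proxy: count occurrences of the per-asset "source" field.
count_assets() {
  grep -o '"source"' "$1" | wc -l | tr -d ' '
}

# Illustrative sample reports (stand-ins for the real scan outputs):
printf '%s\n' '{"findings":[{"cryptographic_assets":[{"source":"direct"}]}]}' > /tmp/deps-reachable.sample.json
printf '%s\n' '{"findings":[{"cryptographic_assets":[{"source":"direct"},{"source":"dependency"}]}]}' > /tmp/deps-all.sample.json

reachable=$(count_assets /tmp/deps-reachable.sample.json)
all=$(count_assets /tmp/deps-all.sample.json)
if [ "$all" -ge "$reachable" ]; then
  echo "reachability invariant holds ($all >= $reachable)"
else
  echo "FAIL: unreachable-inclusive report has fewer findings" >&2
fi
```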

Acceptance criteria

  • Local binary scans complete successfully on all provided test repos.
  • Docker images build and run successfully.
  • Dependency attribution fields (source, dependency_info, call_chains) are present where applicable.
  • Reachability behavior is correct (--dep-include-unreachable increases or preserves findings count).

Summary by CodeRabbit

  • New Features

    • Dependency scanning across Go, Java, Python, and Rust
    • New CLI flags to enable dependency scans and export call graphs (JSON)
    • New "deps" Docker image tier for unified dependency scanning tooling
  • Documentation

    • Added comprehensive Dependency Scanning docs with examples and CI usage
  • Output Formats

    • JSON/report format bumped to v1.3 with dependency attribution and call-chain fields
  • Bug Fixes

    • Docker build compatibility and CGO/static linking fixes for container builds

@matiasdaloia matiasdaloia self-assigned this Mar 3, 2026
@matiasdaloia matiasdaloia added the enhancement New feature or request label Mar 3, 2026
@coderabbitai

coderabbitai bot commented Mar 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


Adds end-to-end dependency scanning: new language parsers (Go/Java/Python/Rust), ecosystem resolvers (Go/Maven/Pip/Cargo), a DependencyScanner with parallel per-dependency scans, call‑graph builder & tracer, disk findings cache, CLI flags/export, Deps Docker images, schema bump to interim v1.3, docs, and extensive tests.

Changes

Cohort / File(s) Summary
Docker images & release
\.goreleaser.yml, Dockerfile, Dockerfile.slim, Dockerfile.deps, Dockerfile.goreleaser.deps, Dockerfile.goreleaser, Dockerfile.test, Makefile
Add multi-arch -deps image builds/manifests and goreleaser Dockerfile for deps; introduce multi-stage Deps Dockerfiles bundling language toolchains and scanners; Makefile targets to build/push/run dependency-scan images; update OpenGrep installer URL.
Call graph core & formats
internal/callgraph/types.go, internal/callgraph/builder.go, internal/callgraph/tracer.go, internal/callgraph/format.go, internal/callgraph/text_helpers.go, internal/callgraph/format_test.go, internal/callgraph/core_test.go
Introduce in-memory CallGraph model, FunctionDecl/ID, builder to aggregate functions from packages, tracer for backward BFS, multi-format exporters (JSON/DOT/Text) and helpers plus tests.
Language parsers & registry
internal/callgraph/go_parser.go, internal/callgraph/java_parser.go, internal/callgraph/python_parser.go, internal/callgraph/rust_parser.go, internal/callgraph/parser_registry.go, internal/callgraph/*_test.go
Add tree-sitter based parsers for Go/Java/Python/Rust with ParseDirectory/SkipDirs/SubPackagePath/PackageSeparator; registry factory NewParserForEcosystem; extensive parser tests.
Dependency resolvers & source cache
internal/dependency/resolver.go, internal/dependency/go_resolver.go, internal/dependency/maven_resolver.go, internal/dependency/pip_resolver.go, internal/dependency/cargo_resolver.go, internal/dependency/source_cache.go, internal/dependency/registry.go, internal/dependency/*_test.go
Define Resolver interface and ResolveResult; implement Go/Maven/Pip/Cargo resolvers; SourceCache for extracting sources/jars with Zip-safety and size limits; registry to register/get resolvers; unit/integration tests.
Dependency scanning engine & cache
internal/engine/dependency_scanner.go, internal/engine/findings_cache.go, internal/engine/findings_cache_test.go, internal/engine/*_test.go, internal/engine/rule_filter.go
Add DependencyScanner orchestration (resolve → per-dep scans → call‑graph build → trace/attribute → merge), disk-backed FindingsCache (Get/Put, atomic writes), rule-language filtering and rule-hash computation, and tests.
CLI & scan utilities
internal/cli/scan.go, internal/cli/scan_test.go, internal/scan/*.go
Add CLI flags (--scan-dependencies, dep-* flags, --export-callgraph, --export-callgraph-format), integrate dependency scanning into scan flow, centralize validation/duration/count/summary/export helpers in internal/scan.
Interim report schema & entities
internal/entities/interim.go, schemas/interim-report-schema.json, internal/deduplicator/deduplicator.go
Bump interim format to v1.3; add CryptographicAsset fields source, dependency_info (module/version/function), call_chains; update JSON schema and add start_line/end_line; small deduplicator iteration fix.
Docs & changelog
README.md, docs/DEPENDENCY_SCANNING.md, docs/DOCKER_USAGE.md, docs/OUTPUT_FORMATS.md, CHANGELOG.md, CONTRIBUTING.md
Add comprehensive dependency-scanning docs, Deps image usage, output format v1.3 docs, CLI flag docs, changelog entry for 0.3.0, and contributing/lint guidance.
Tests & testdata
testdata/projects/*, many internal/*_test.go
Add multi-language test projects and extensive unit/integration tests covering parsers, resolvers, source cache, engine, formatting, CLI validation, and end-to-end dependency scenarios.
Utilities & infra
internal/scan/*, internal/utils/fd.go, internal/utils/fd_test.go, internal/scanner/*, internal/version/version.go, .github/workflows/lint.yml, .golangci-lint-version, .gitignore
Add scan helper package (duration parsing, ecosystem detection, validation, count/summary/export), safe FD conversion util and tests, spinner/terminal safety updates, lint directives, CI linter pin file, and minor infra updates.
Go dependencies
go.mod
Add github.com/smacker/go-tree-sitter and yaml v3 entries for parsing and rule-language extraction.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI (crypto-finder)
    participant Orch as Orchestrator
    participant Resolver as Resolver (go/maven/pip/cargo)
    participant Scanner as DependencyScanner
    participant Builder as CallGraph Builder
    participant Parser as Parser (Go/Java/Python/Rust)
    participant Tracer as Tracer

    CLI->>Orch: run scan (with --scan-dependencies)
    Orch->>Resolver: resolve dependencies for target
    Resolver-->>Orch: ResolveResult (deps, dirs, root)
    Orch->>Scanner: ScanWithDependencies(userReport, opts)
    Scanner->>Scanner: load/filter rules, compute rules hash
    par per-dep
        Scanner->>Orch: request per-dep scan (worker)
        Orch-->>Scanner: per-dep InterimReport
    end
    Scanner->>Builder: BuildFromDirectories(user + deps)
    Builder->>Parser: ParseDirectory(...) (per package)
    Parser-->>Builder: FileAnalysis (functions & calls)
    Builder-->>Scanner: CallGraph
    Scanner->>Tracer: TraceBack(findings)
    Tracer-->>Scanner: Call chains to user code
    Scanner-->>CLI: Merged report + exported call graph
sequenceDiagram
    participant File as Source file
    participant Parser as Language Parser
    participant Analysis as FileAnalysis
    participant CallGraph as CallGraph Builder

    File->>Parser: ParseDirectory(file)
    Parser->>Analysis: extract imports, functions, calls
    Analysis-->>Parser: FileAnalysis
    Parser->>CallGraph: emit FunctionDecls & Calls
    CallGraph-->>Parser: aggregated CallGraph with callers index

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • feat: add deduplication logic #16: Modifies internal/deduplicator/deduplicator.go—related to deduplication iteration changes present here.
  • feat: add OID mapping #18: Modifies internal/entities/interim.go and asset metadata—strongly related to the interim format and new asset attribution fields.

Poem

🐰 I hopped through code and traced each call,

Across Go, Java, Python, Rust I sprawl.
I cached and scanned each dependency tree,
Then stitched call‑chains back to user code for thee.
A crunchy carrot for your CI — hooray! 🥕


@github-actions

github-actions bot commented Mar 3, 2026

🔍 SCANOSS Code Similarity Detected

📄 2 snippet matches found
📋 3 full file matches found

🔗 View detailed findings on commit 8f2b5cd

Files with similarities:

  • testdata/projects/go_with_crypto_dep/mypkg/crypto.go
  • testdata/projects/java_with_crypto_dep/src/main/java/com/testapp/service/CryptoService.java
  • testdata/projects/cryptowrapper_dep/go.sum
  • testdata/projects/go_with_crypto_dep/go.sum
  • testdata/projects/go_with_dep_chain/go.sum

💡 Click the commit link above to see detailed annotations for each match.

@github-actions

github-actions bot commented Mar 3, 2026

SCANOSS SCAN Completed 🚀

  • Detected components: 8
  • Undeclared components: 4
  • Declared components: 4
  • Detected files: 105
  • Detected files undeclared: 5
  • Detected files declared: 100
  • Licenses detected: 7
  • Licenses detected with copyleft: 2
  • Policies: ❌ 1 fail (1 total)

View more details on SCANOSS Action Summary

@github-actions

github-actions bot commented Mar 3, 2026

SCANOSS SCAN Completed 🚀

  • Detected components: 4
  • Undeclared components: 0
  • Declared components: 4
  • Detected files: 98
  • Detected files undeclared: 0
  • Detected files declared: 98
  • Licenses detected: 4
  • Licenses detected with copyleft: 2
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

🟠 Major comments (35)
internal/scan/duration.go-23-40 (1)

23-40: ⚠️ Potential issue | 🟠 Major

Fix permissive parsing and fractional truncation in custom d/w duration formats.

The current implementation has two correctness issues:

  1. fmt.Sscanf("%f") accepts a leading float and silently ignores trailing characters (e.g., "1.5abc" parses as 1.5). This should reject malformed input.
  2. The cast-before-multiply pattern truncates fractional values: time.Duration(value*24) casts the float to int64, losing fractional hours. For example, 0.1d becomes 2h instead of 2h24m.

Use strict float parsing with strconv.ParseFloat and multiply the full float value by the duration unit before casting:

Proposed fix
 import (
 	"fmt"
+	"strconv"
 	"strings"
 	"time"
 )
@@
 	if strings.HasSuffix(s, "d") {
 		days := strings.TrimSuffix(s, "d")
-		var value float64
-		n, parseErr := fmt.Sscanf(days, "%f", &value)
-		if parseErr != nil || n != 1 {
+		value, parseErr := strconv.ParseFloat(days, 64)
+		if parseErr != nil {
 			return 0, fmt.Errorf("invalid duration format: %s", s)
 		}
-		return time.Duration(value*24) * time.Hour, nil
+		return time.Duration(value * float64(24*time.Hour)), nil
 	}
@@
 	if strings.HasSuffix(s, "w") {
 		weeks := strings.TrimSuffix(s, "w")
-		var value float64
-		n, parseErr := fmt.Sscanf(weeks, "%f", &value)
-		if parseErr != nil || n != 1 {
+		value, parseErr := strconv.ParseFloat(weeks, 64)
+		if parseErr != nil {
 			return 0, fmt.Errorf("invalid duration format: %s", s)
 		}
-		return time.Duration(value*24*7) * time.Hour, nil
+		return time.Duration(value * float64(7*24*time.Hour)), nil
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/scan/duration.go` around lines 23 - 40, The d/week parsing blocks
incorrectly use fmt.Sscanf which permissively accepts trailing garbage and they
cast the float to time.Duration too early, truncating fractional days/weeks;
update both suffix handlers (the "d" and "w" branches) to use
strconv.ParseFloat(valueStr, 64) to strictly parse the whole string (return an
error on parse failure), then compute the duration by multiplying the parsed
float by the appropriate unit as floats (e.g., value * 24 * float64(time.Hour)
for days, value * 7 * 24 * float64(time.Hour) for weeks) and only cast to
time.Duration after the multiplication so fractional hours/minutes are
preserved.
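
The fractional-truncation half of this finding is easy to see numerically. A shell sketch of the arithmetic (not the Go code itself): for an input of 0.1 days, casting before multiplying by the hour unit drops the fractional part:

```shell
# "0.1d" should be 2h24m (144 minutes). Casting 0.1*24 to an integer before
# applying the hour unit yields 2h (120 minutes), silently losing 24 minutes.
correct_minutes=$(awk 'BEGIN { printf "%.0f", 0.1 * 24 * 60 }')
truncated_minutes=$(awk 'BEGIN { printf "%d", int(0.1 * 24) * 60 }')
echo "0.1d -> correct: ${correct_minutes}m, cast-before-multiply: ${truncated_minutes}m"
```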
internal/engine/orchestrator.go-71-73 (1)

71-73: ⚠️ Potential issue | 🟠 Major

RulePaths bypass contract is currently broken for empty preloaded slices.

The comment says non-nil should bypass rule loading, but the implementation uses len(opts.RulePaths) > 0. An intentionally empty preloaded slice falls back to loading all rules. Also copy the slice to avoid downstream mutation of caller-owned data.

🔧 Proposed fix
- // RulePaths, when non-nil, bypasses the rules manager and uses these paths directly.
+ // RulePaths, when non-nil, bypasses the rules manager and uses these paths directly.
  // This is used by the dependency scanner to pass pre-loaded, language-filtered rules.
  RulePaths []string
@@
- if len(opts.RulePaths) > 0 {
-   rulePaths = opts.RulePaths
+ if opts.RulePaths != nil {
+   rulePaths = append([]string(nil), opts.RulePaths...)
    log.Debug().Int("count", len(rulePaths)).Msg("Using pre-loaded rule paths")
  } else {

Also applies to: 106-108

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/engine/orchestrator.go` around lines 71 - 73, The bypass logic for
preloaded rules should treat a non-nil RulePaths slice as authoritative even if
empty and must copy the slice to avoid caller mutation: change checks that use
len(opts.RulePaths) > 0 to nil-checks (opts.RulePaths != nil) wherever RulePaths
is inspected (e.g., the fields/initialization that references RulePaths in
orchestrator.go around the struct and the code paths at lines ~71 and ~106-108),
and when assigning opts.RulePaths to internal state or passing it downstream,
allocate and copy the slice contents into a new slice so the orchestrator owns
its copy.
schemas/interim-report-schema.json-118-126 (1)

118-126: ⚠️ Potential issue | 🟠 Major

Enforce source ↔ dependency_info consistency.

The schema currently allows "source": "dependency" without dependency_info (and vice versa), which makes dependency attribution ambiguous.

🔧 Proposed schema constraint
     "CryptographicAsset": {
       "type": "object",
@@
       "properties": {
@@
         "call_chains": {
@@
         }
-      },
+      },
+      "allOf": [
+        {
+          "if": {
+            "properties": { "source": { "const": "dependency" } }
+          },
+          "then": { "required": ["dependency_info"] }
+        },
+        {
+          "if": { "required": ["dependency_info"] },
+          "then": {
+            "properties": { "source": { "const": "dependency" } }
+          }
+        }
+      ],
       "oneOf": [
         {"required": ["rules"]},
         {"required": ["rule"]}
       ]
     },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@schemas/interim-report-schema.json` around lines 118 - 126, The schema allows
mismatched combinations of "source" and "dependency_info"; add a conditional
constraint at the same object level as the "source" and "dependency_info"
properties: use JSON Schema "if"/"then"/"else" where if {"properties":
{"source": {"const": "dependency"}}} then {"required": ["dependency_info"]} else
{"not": {"required": ["dependency_info"]}} so that dependency_info is required
exactly when source === "dependency" (and prohibited when source !==
"dependency"); refer to the "source" property and the "$ref":
"#/$defs/DependencyInfo" dependency_info entry when inserting this conditional.
internal/engine/findings_cache.go-115-117 (1)

115-117: ⚠️ Potential issue | 🟠 Major

Harden cache filename derivation for cross-platform safety.

Replacing only / is not enough for Windows path separators/reserved filename characters. A hash-based filename avoids traversal/invalid-name edge cases.

🔧 Proposed safe key-to-filename mapping
 func cacheKeyToFilename(key string) string {
-  safe := strings.ReplaceAll(key, "/", "_")
-  return safe + ".json"
+  sum := sha256.Sum256([]byte(key))
+  return hex.EncodeToString(sum[:]) + ".json"
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/engine/findings_cache.go` around lines 115 - 117, The current
cacheKeyToFilename function only replaces "/" and can produce invalid or
traversable filenames on other platforms; change cacheKeyToFilename to derive
filenames by computing a stable hash (e.g., SHA-256 or SHA-1) of the input key
and use the hex digest as the filename (optionally with a short readable prefix
or truncated original-safe token) plus the ".json" extension, ensuring fixed
length, no path separators, and deterministic mapping for lookup/eviction in
functions that call cacheKeyToFilename.
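
The hash-based mapping in the proposed fix can be sanity-checked from the shell. key_to_filename below is an illustrative stand-in for the Go function, and the example keys are hypothetical:

```shell
# Any key, including ones containing "/" or "..", maps to a fixed-length hex
# name with no path separators, so the cache directory cannot be escaped.
key_to_filename() {
  printf '%s' "$1" | sha256sum | awk '{print $1 ".json"}'
}
key_to_filename 'github.com/foo/bar@v1.2.3'
key_to_filename '../../etc/passwd'
```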
internal/engine/findings_cache.go-129-135 (1)

129-135: ⚠️ Potential issue | 🟠 Major

Include boundaries in ComputeRulesHash input.

Hashing only concatenated content can produce ambiguous inputs across different file splits; include path/length delimiters per file.

🔧 Proposed deterministic boundary-safe hashing
 import (
@@
+  "strconv"
@@
   for _, p := range sorted {
     content, err := os.ReadFile(p)
     if err != nil {
       return "", fmt.Errorf("failed to read rule file %s: %w", p, err)
     }
-    h.Write(content)
+    h.Write([]byte(p))
+    h.Write([]byte{0})
+    h.Write([]byte(strconv.Itoa(len(content))))
+    h.Write([]byte{0})
+    h.Write(content)
+    h.Write([]byte{0})
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/engine/findings_cache.go` around lines 129 - 135, ComputeRulesHash
currently only writes concatenated file contents into the hash (loop over sorted
files), which can yield ambiguous collisions across different file splits;
modify the loop in ComputeRulesHash to include deterministic boundaries by
writing the file path and a length or separator before each file's content
(e.g., h.Write([]byte(path)), h.Write([]byte{0}) or
h.Write([]byte(fmt.Sprintf("%d:", len(content)))) then h.Write(content)) so each
file’s name and size are part of the hashed input and ordering remains stable.
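
The ambiguity is easy to demonstrate: without boundaries, two different file splits of the same bytes hash identically. A shell sketch of the framing idea (concat_hash mimics the boundary-free behavior; framed_hash length-prefixes every part the way the proposed fix frames each rule file):

```shell
# Boundary-free concatenation: different splits of the same bytes collide.
concat_hash() { printf '%s' "$@" | sha256sum | awk '{print $1}'; }

# Length-prefixed framing: each part contributes its size and a delimiter,
# so "ab"+"c" and "a"+"bc" hash differently.
framed_hash() {
  for part in "$@"; do
    printf '%d:%s;' "${#part}" "$part"
  done | sha256sum | awk '{print $1}'
}

concat_hash ab c
concat_hash a bc
framed_hash ab c
framed_hash a bc
```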
internal/engine/findings_cache.go-96-103 (1)

96-103: ⚠️ Potential issue | 🟠 Major

Use unique temp files in Put to avoid concurrent write collisions.

tmp := path + ".tmp" makes concurrent writes to the same key contend on one temp file, causing rename/cleanup races.

🔧 Proposed concurrency-safe write flow
- tmp := path + ".tmp"
- if err := os.WriteFile(tmp, data, 0o600); err != nil {
-   return fmt.Errorf("failed to write cache file: %w", err)
- }
+ tmpFile, err := os.CreateTemp(c.dir, cacheKeyToFilename(key)+".*.tmp")
+ if err != nil {
+   return fmt.Errorf("failed to create temp cache file: %w", err)
+ }
+ tmp := tmpFile.Name()
+ if _, err := tmpFile.Write(data); err != nil {
+   _ = tmpFile.Close()
+   _ = os.Remove(tmp)
+   return fmt.Errorf("failed to write cache file: %w", err)
+ }
+ if err := tmpFile.Close(); err != nil {
+   _ = os.Remove(tmp)
+   return fmt.Errorf("failed to close temp cache file: %w", err)
+ }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/engine/findings_cache.go` around lines 96 - 103, The current Put
implementation uses a fixed tmp filename (tmp := path + ".tmp") causing
concurrent writers to collide; change Put to create a unique temp file per write
(e.g., use os.CreateTemp or generate a unique suffix with process id +
random/UUID) in the same directory, write to that unique tmp file, fsync if
needed, close it, then call os.Rename(uniqueTmp, path); on rename error ensure
you attempt to remove that specific uniqueTmp and treat os.ErrNotExist specially
as before; update references to tmp and the error-handling block around
os.Rename to operate on the generated unique temp file name and keep file mode
0o600.
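
The same unique-temp-then-rename flow can be sketched in shell (atomic_put is illustrative; the Go fix uses os.CreateTemp and os.Rename):

```shell
# Each writer gets its own temp file via mktemp, so concurrent Puts to the
# same key never contend on one shared ".tmp" path; mv renames atomically
# when source and destination are on the same filesystem.
atomic_put() {
  value=$1; dest=$2
  dir=$(dirname "$dest")
  tmp=$(mktemp "$dir/.put.XXXXXX") || return 1
  printf '%s' "$value" > "$tmp" && mv "$tmp" "$dest"
}
atomic_put '{"findings":[]}' /tmp/findings-cache-demo.json
cat /tmp/findings-cache-demo.json
```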
Makefile-129-136 (1)

129-136: ⚠️ Potential issue | 🟠 Major

Validate PROJ before running docker-scan-deps.

When PROJ is unset, the volume mount becomes invalid and the command fails with a confusing Docker error.

Suggested fix
 docker-scan-deps: ## Scan with dependency resolution (PROJ=path/to/project)
 	@echo "Scanning with dependency resolution..."
+	@test -n "$(PROJ)" || (echo "❌ PROJ is required. Usage: make docker-scan-deps PROJ=/path/to/project"; exit 1)
 	@docker run --rm \
 		-v $(PROJ):/workspace/code:ro \
 		--entrypoint sh $(DOCKER_IMAGE):latest-deps -c \
 		"if [ -f /workspace/code/requirements.txt ]; then pip install -r /workspace/code/requirements.txt 2>/dev/null; fi; \
 		 crypto-finder scan /workspace/code --scan-dependencies $(ARGS)"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 129 - 136, The docker-scan-deps Makefile target should
validate the PROJ variable before attempting to run docker; update the
docker-scan-deps recipe to check that PROJ is non-empty (and optionally that the
path exists) and print a clear error and exit if not set, instead of invoking
docker with an invalid -v mount. Modify the docker-scan-deps target (referencing
PROJ, DOCKER_IMAGE, and ARGS) to perform the guard check at the start of the
recipe and abort early with a helpful message when PROJ is missing or invalid.
Dockerfile.deps-60-109 (1)

60-109: ⚠️ Potential issue | 🟠 Major

Run the final image as a non-root user.

The runtime stage has no USER directive, so all scanning executes as root.

Suggested hardening
 # Create workspace directory
 WORKDIR /workspace
+
+# Drop root privileges for runtime
+RUN useradd --create-home --uid 10001 scanner && \
+    chown -R scanner:scanner /workspace /opt/target-venv
+USER scanner
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile.deps` around lines 60 - 109, The image currently runs as root (no
USER directive) so add a non-root user and switch to it before ENTRYPOINT/CMD:
create a user/group (e.g., adduser --disabled-password --gecos "" scanner or
addgroup/adduser with UID 1000), chown relevant paths (WORKDIR /workspace,
/opt/target-venv, /root/.cargo and any other runtime directories) and then add a
USER scanner line after those chown steps and before ENTRYPOINT
["crypto-finder"]; ensure the verification RUN (crypto-finder version ...) runs
in build as root but runtime ENTRYPOINT/CMD execute as the new non-root user.
Dockerfile.deps-55-77 (1)

55-77: ⚠️ Potential issue | 🟠 Major

Pin OpenGrep and Rust installer versions with checksum verification.

The OpenGrep installer (line 55) fetches and executes the latest script from the main branch without version pinning or integrity checks. Similarly, the Rust installer (line 76) downloads the latest rustup convenience script from sh.rustup.rs without verification.

For OpenGrep: Download a versioned release binary from GitHub and verify its SHA256 checksum, or use a pinned Docker image digest.

For Rust: Use the versioned rustup-init archive URL pattern (https://static.rust-lang.org/rustup/archive/{version}/{target}/rustup-init) with SHA256 checksum verification, or replace with the official Rust Docker image pinned by digest.

Both approaches eliminate reproducibility gaps and strengthen supply-chain integrity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile.deps` around lines 55 - 77, The Dockerfile uses unpinned curl|bash
installs for OpenGrep and Rust which breaks reproducibility and supply-chain
safety; replace the OpenGrep pipe with a download of a specific GitHub release
artifact (use the release tag/URL for the installer or binary), verify its
SHA256 checksum (echo "<expected_sum>  <file>" | sha256sum -c -) before
executing or copying it into the image, then run the verified installer or move
the binary into /usr/local/bin; for Rust, stop using the generic sh.rustup.rs
pipe and either download the versioned rustup-init archive from
static.rust-lang.org using the archive URL pattern with the correct
{version}/{target}/rustup-init, verify its SHA256 checksum the same way, chmod
+x and run it, or alternatively pull a pinned official Rust Docker image digest
and copy the toolchain (or use that image as a build stage) to ensure
deterministic versions; update the two RUN commands that call the curl piped
installers to perform these steps and fail the build if checksum verification
does not match.
Dockerfile.goreleaser.deps-61-61 (1)

61-61: ⚠️ Potential issue | 🟠 Major

Use SPDX-accurate OCI license metadata.

Line 61 should match the project’s declared license variant (GPL-2.0-only) to avoid compliance ambiguity.

✅ One-line fix
-LABEL org.opencontainers.image.licenses="GPL-2.0"
+LABEL org.opencontainers.image.licenses="GPL-2.0-only"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile.goreleaser.deps` at line 61, The OCI label value for license is
using the ambiguous variant; update the LABEL with key
org.opencontainers.image.licenses (in the Dockerfile.goreleaser.deps) to use the
SPDX-accurate identifier "GPL-2.0-only" instead of "GPL-2.0" so the metadata
exactly matches the project's declared license variant.
internal/dependency/go_resolver.go-35-38 (1)

35-38: ⚠️ Potential issue | 🟠 Major

maxDepth contract is documented but not implemented.

The resolver advertises depth limiting, but the parameter is ignored. This can silently scan full transitive dependencies when bounded traversal was requested.

🛠️ Minimal safety fix (until depth pruning is implemented)
-func (r *GoResolver) Resolve(ctx context.Context, targetDir string, _ int) (*ResolveResult, error) {
+func (r *GoResolver) Resolve(ctx context.Context, targetDir string, maxDepth int) (*ResolveResult, error) {
+	if maxDepth >= 0 {
+		return nil, fmt.Errorf("go resolver: maxDepth is not implemented yet")
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/go_resolver.go` around lines 35 - 38, The Resolve method
on GoResolver currently ignores the maxDepth parameter; add a minimal safety
check at the start of GoResolver.Resolve that honors maxDepth by returning
immediately when maxDepth == 0 (i.e., do not traverse any dependencies) — return
an empty/zero-value *ResolveResult (or a clearly named sentinel) and nil error —
and add a TODO comment referencing full depth-limited traversal to implement
later; keep the change localized to the Resolve function and ensure the maxDepth
parameter name is used so future callers/tests detect the contract is enforced.
Dockerfile.goreleaser.deps-25-25 (1)

25-25: ⚠️ Potential issue | 🟠 Major

Pin OpenGrep and Rust toolchain versions for reproducible builds.

Line 25 uses curl | bash from the main branch and lines 44-45 use rustup stable, both non-deterministic. This is inconsistent with Semgrep (pinned to 1.145.0) and Go (pinned to 1.25-alpine) elsewhere in the file. Pin both to specific versions for reproducibility and supply-chain security.

Suggested fix
 # Install OpenGrep (minimum version 1.12.1)
+ARG OPENGREP_INSTALL_REF=v1.12.1
-RUN curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/main/install.sh | bash
+RUN curl -fsSLo /tmp/opengrep-install.sh "https://raw.githubusercontent.com/opengrep/opengrep/${OPENGREP_INSTALL_REF}/install.sh" \
+    && bash /tmp/opengrep-install.sh \
+    && rm -f /tmp/opengrep-install.sh

 # ── Rust/Cargo (for dependency resolution only, not compilation) ──
+ARG RUST_TOOLCHAIN=1.85.0
 RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | \
-    sh -s -- -y --default-toolchain stable --profile minimal
+    sh -s -- -y --default-toolchain ${RUST_TOOLCHAIN} --profile minimal
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile.goreleaser.deps` at line 25, Replace the non-deterministic
installer and rust channel with pinned versions: change the RUN curl -fsSL
https://raw.githubusercontent.com/opengrep/opengrep/main/install.sh | bash
invocation to fetch a specific OpenGrep tag or commit (e.g. use the raw URL for
a git tag/commit instead of "main") and update the rustup usage that currently
invokes the stable channel (the "rustup stable" or "rustup default stable"
invocation) to install or set a concrete toolchain version (e.g. rustup install
1.xx.x && rustup default 1.xx.x) so both the OpenGrep installer and Rust
toolchain are version-pinned for reproducible builds.
internal/dependency/cargo_resolver.go-97-115 (1)

97-115: ⚠️ Potential issue | 🟠 Major

Graph keys collapse distinct crate versions.

Reducing node IDs to bare crate names merges parallel versions and can corrupt dependency edges.

Suggested fix
-	// Build ID→name mapping for resolve graph
+	// Build ID→moduleKey mapping for resolve graph (module@version)
 	for _, node := range meta.Resolve.Nodes {
-		name := cargoPackageNameFromID(node.ID)
-		pkgByID[node.ID] = name
+		pkgByID[node.ID] = cargoPackageKeyFromID(node.ID) // stable "name@version" key
 	}
@@
-		parentName := pkgByID[node.ID]
-		if parentName == "" {
+		parentKey := pkgByID[node.ID]
+		if parentKey == "" {
 			continue
 		}
 		for _, dep := range node.Deps {
-			childName := pkgByID[dep.Pkg]
-			if childName == "" {
-				childName = cargoPackageNameFromID(dep.Pkg)
+			childKey := pkgByID[dep.Pkg]
+			if childKey == "" {
+				childKey = cargoPackageKeyFromID(dep.Pkg)
 			}
-			result.Graph[parentName] = append(result.Graph[parentName], childName)
+			result.Graph[parentKey] = append(result.Graph[parentKey], childKey)
 		}
 	}
Here cargoPackageKeyFromID is a new helper (to be added) that, unlike cargoPackageNameFromID, preserves the version in the key, e.g. "crate@1.2.3".

Also applies to: 150-159

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/cargo_resolver.go` around lines 97 - 115, The code
currently collapses distinct crate versions by using only
cargoPackageNameFromID(node.ID) as graph keys; change the pkgByID mapping and
all uses (including the other block around the later loop) to generate and use a
stable key that preserves version (e.g. "crate@1.2.3") instead of bare name:
derive the version for each node (or lookup version from meta.Packages) and
build pkgByID[node.ID] = crateName + "@" + version (or add a helper like
cargoPackageKeyFromID that returns "name@version"), then use that key when
assigning parentName and childName and when appending to result.Graph so
parallel versions do not collapse.
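A minimal standalone sketch of the stable-key idea (the helper name cargoPackageKeyFromID is hypothetical, and the two input shapes shown are the legacy space-separated `cargo metadata` package ID and the newer URL-fragment form; verify against the actual IDs your cargo version emits):

```go
package main

import (
	"fmt"
	"strings"
)

// cargoPackageKeyFromID turns a `cargo metadata` package ID into a
// stable "name@version" key so parallel crate versions stay distinct.
// It handles both the legacy "name version (source)" form and the
// newer "source#name@version" form.
func cargoPackageKeyFromID(id string) string {
	// Newer format: "registry+https://...#serde@1.0.104"
	if i := strings.LastIndex(id, "#"); i >= 0 {
		frag := id[i+1:]
		if strings.Contains(frag, "@") {
			return frag
		}
	}
	// Legacy format: "serde 1.0.104 (registry+https://...)"
	fields := strings.Fields(id)
	if len(fields) >= 2 {
		return fields[0] + "@" + fields[1]
	}
	return id
}

func main() {
	fmt.Println(cargoPackageKeyFromID("serde 1.0.104 (registry+https://github.com/rust-lang/crates.io-index)"))
	fmt.Println(cargoPackageKeyFromID("registry+https://github.com/rust-lang/crates.io-index#serde@1.0.104"))
}
```

Both calls yield "serde@1.0.104", so two resolved versions of the same crate produce distinct graph keys instead of merging.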
internal/scan/validation.go-31-33 (1)

31-33: ⚠️ Potential issue | 🟠 Major

Handle inaccessible targets, not only missing targets.

os.Stat errors other than non-existence (e.g., permission denied) currently pass validation and fail later with less context.

Suggested fix
-	if _, err := os.Stat(target); os.IsNotExist(err) {
-		return nil, fmt.Errorf("target path does not exist: %s", target)
-	}
+	if _, err := os.Stat(target); err != nil {
+		if os.IsNotExist(err) {
+			return nil, fmt.Errorf("target path does not exist: %s", target)
+		}
+		return nil, fmt.Errorf("cannot access target path %s: %w", target, err)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/scan/validation.go` around lines 31 - 33, The os.Stat check only
treats os.IsNotExist errors and lets other errors (e.g., permission denied)
pass; update the validation to handle all os.Stat errors for the given target by
checking if err != nil and returning a descriptive error including the
underlying err (e.g., "target path inaccessible" or "target path does not exist:
%w") so callers get immediate, specific feedback; refer to the os.Stat(target)
call and the target variable in this validation function and propagate the
original error in the returned fmt.Errorf.
internal/callgraph/tracer.go-79-84 (1)

79-84: ⚠️ Potential issue | 🟠 Major

Stop tracing once user code is reached.

Current logic only finalizes when there are no callers, so chains may continue past user code into external callers.

Suggested fix
 		// Get the current head of the chain (the function we're tracing from)
 		head := current.chain[0]
 		headKey := head.Function.String()
+
+		// Terminate at first user-code boundary
+		if isUserPackage(head.Function.Package, userPackages, t.pkgSep) && len(current.chain) > 1 {
+			results = append(results, CallChain{Steps: current.chain})
+			continue
+		}
 
 		// Find all callers of the head function
 		callers := t.graph.Callers[headKey]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/tracer.go` around lines 79 - 84, The tracer currently only
finalizes a CallChain when there are no callers, allowing chains to continue
past user code; modify the caller-processing logic (the loop handling "callers"
and "current.chain") to detect when chainReachesUserCode(current.chain,
userPackages, t.pkgSep) is true (and len(current.chain) > 1) and in that case
append a CallChain{Steps: current.chain} to results and skip enqueuing any
further callers (i.e., continue to next item), so tracing stops once user code
is reached even if callers exist.
internal/dependency/cargo_resolver.go-61-61 (1)

61-61: ⚠️ Potential issue | 🟠 Major

maxDepth is ignored in Rust dependency resolution.

This makes --dep-max-depth ineffective for Cargo projects.

Suggested direction
-func (r *CargoResolver) Resolve(ctx context.Context, targetDir string, _ int) (*ResolveResult, error) {
+func (r *CargoResolver) Resolve(ctx context.Context, targetDir string, maxDepth int) (*ResolveResult, error) {
@@
+	// Apply depth pruning from workspace/root modules before returning.
+	// result.Graph = pruneGraphByDepth(result.Graph, roots, maxDepth)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/cargo_resolver.go` at line 61, The Resolve method on
CargoResolver currently ignores the incoming maxDepth parameter (the third param
in func (r *CargoResolver) Resolve(ctx context.Context, targetDir string, _ int)
(*ResolveResult, error)), making --dep-max-depth ineffective; update the
signature to name the parameter (e.g., maxDepth int) and thread that value into
the cargo dependency traversal logic used by CargoResolver (limit
recursion/breadth of any recursive calls or graph walks inside Resolve and any
helper methods it calls), enforcing the depth check and returning a truncated
ResolveResult when the maxDepth limit is reached. Ensure any helper functions
(e.g., the recursive resolver methods) accept and respect the maxDepth/countdown
parameter so the CLI flag actually constrains Rust dependency resolution.
internal/callgraph/tracer.go-69-70 (1)

69-70: ⚠️ Potential issue | 🟠 Major

Depth limiting is off by one.

The condition uses node count, so direct-caller chains can be dropped when maxDepth=1.

Suggested fix
-		if maxDepth > 0 && len(current.chain) > maxDepth {
+		// depth in edges = nodes - 1
+		if maxDepth > 0 && (len(current.chain)-1) > maxDepth {
 			continue
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/tracer.go` around lines 69 - 70, The depth check in
tracer.go uses node count directly and thus drops direct-caller chains when
maxDepth is meant to be an edge count; update the condition that currently
checks if maxDepth > 0 && len(current.chain) > maxDepth to account for edges by
comparing len(current.chain) against maxDepth+1 (e.g., use maxDepth > 0 &&
len(current.chain) > maxDepth+1 or equivalently len(current.chain)-1 >
maxDepth), so that direct caller chains are preserved when maxDepth=1; adjust
the check around current.chain accordingly.
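The edges-vs-nodes distinction is easy to get wrong; a minimal sketch of the corrected predicate (chainExceedsDepth is a hypothetical helper, not code from this PR):

```go
package main

import "fmt"

// chainExceedsDepth treats maxDepth as a limit on call *edges*:
// a chain of N nodes has N-1 edges, so a crypto call plus its
// direct caller (2 nodes, 1 edge) fits within maxDepth=1.
func chainExceedsDepth(chain []string, maxDepth int) bool {
	return maxDepth > 0 && len(chain)-1 > maxDepth
}

func main() {
	direct := []string{"aes.Encrypt", "app.main"}              // 1 edge
	longer := []string{"aes.Encrypt", "lib.wrap", "app.main"} // 2 edges
	fmt.Println(chainExceedsDepth(direct, 1)) // false: kept
	fmt.Println(chainExceedsDepth(longer, 1)) // true: pruned
}
```

With the original node-count check, the direct-caller chain would already have been dropped at maxDepth=1.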
internal/dependency/source_cache.go-75-95 (1)

75-95: ⚠️ Potential issue | 🟠 Major

Do not report extraction success when file extraction fails.

Current loop ignores per-file errors and still returns a cache directory, which can persist incomplete/corrupt cached sources.

Suggested fix
+	extractedCount := 0
+	var firstExtractErr error
 	for _, f := range r.File {
 		// Filter by extension if specified
 		if len(extensions) > 0 && !hasMatchingExtension(f.Name, extensions) {
 			continue
 		}
@@
-		if err := extractZipFile(f, destPath); err != nil {
-			// Non-fatal — skip individual files that fail
-			continue
-		}
+		if err := extractZipFile(f, destPath); err != nil {
+			if firstExtractErr == nil {
+				firstExtractErr = err
+			}
+			continue
+		}
+		extractedCount++
 	}
 
+	if extractedCount == 0 && firstExtractErr != nil {
+		_ = os.RemoveAll(destDir)
+		return "", fmt.Errorf("failed to extract archive %s: %w", archivePath, firstExtractErr)
+	}
+
 	return destDir, nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/source_cache.go` around lines 75 - 95, The loop over
r.File currently ignores per-file extraction failures and still returns destDir;
modify the code in the extraction routine (the loop using r.File and calling
hasMatchingExtension and extractZipFile) to detect when any extractZipFile call
(or any skipped-for-safety path that you consider a failure) fails, track that
failure (e.g., boolean flag or collect the error), and if any file failed then
remove the partially-extracted destDir and return a non-nil error instead of
returning destDir, nil; ensure you still preserve the zip-slip checks
(filepath.Clean + os.PathSeparator) and only treat true extraction errors as
fatal while keeping per-file extension filtering behavior.
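The zip-slip guard the prompt asks to preserve reduces to one cleaned-path prefix check; a standalone sketch (withinDir is a hypothetical helper name illustrating the check, not the file's actual code):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// withinDir reports whether an archive entry name, joined onto the
// destination directory, still resolves inside that directory after
// cleaning. Entries like "../../etc/passwd" escape and must be rejected.
func withinDir(destDir, entryName string) bool {
	dest := filepath.Join(destDir, entryName)
	return strings.HasPrefix(dest, filepath.Clean(destDir)+string(filepath.Separator))
}

func main() {
	fmt.Println(withinDir("/cache/src", "lib/aes.rs"))       // true: safe entry
	fmt.Println(withinDir("/cache/src", "../../etc/passwd")) // false: zip-slip attempt
}
```

Only genuine extraction errors on entries passing this check should count toward the failure that aborts caching.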
internal/cli/scan.go-313-316 (1)

313-316: ⚠️ Potential issue | 🟠 Major

Fail fast when dependency scanning is requested but resolver is unavailable.

With --scan-dependencies, warning-and-continue can return success without any dependency scan output.

Suggested fix
-			resolver, resolverErr := depRegistry.Get(ecosystem)
-			if resolverErr != nil {
-				log.Warn().Err(resolverErr).Str("ecosystem", ecosystem).Msg("No resolver for ecosystem, skipping dependency scan")
-			} else {
+			resolver, resolverErr := depRegistry.Get(ecosystem)
+			if resolverErr != nil {
+				return fmt.Errorf("dependency scanning requested but ecosystem %q is unsupported: %w", ecosystem, resolverErr)
+			} else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/cli/scan.go` around lines 313 - 316, The current code warns and
continues when depRegistry.Get(ecosystem) returns an error (resolverErr), which
allows a successful run despite requested dependency scanning; change this to
fail-fast: if the scan-dependencies flag is set (the CLI option variable used in
this file, e.g., opts.ScanDependencies / scanDependencies), then when
depRegistry.Get(ecosystem) returns resolverErr you should log the error (use
log.Error().Err(resolverErr).Str("ecosystem", ecosystem).Msg(...)) and return a
non-nil error (or exit non‑zero) from the surrounding function instead of
executing the else branch; update the block around depRegistry.Get, resolverErr
and the caller handling to propagate failure when scanning was requested.
internal/callgraph/python_parser.go-416-423 (1)

416-423: ⚠️ Potential issue | 🟠 Major

self.method() calls lose class context and won’t link to declared methods.

Line 418 builds FunctionID without Type, but class methods are declared with Type=<ClassName>. This breaks intra-class call edges.

🔧 Proposed direction
-			Callee:   FunctionID{Package: analysis.PackagePath, Name: method},
+			Callee:   FunctionID{Package: analysis.PackagePath, Type: currentClass, Name: method},

You’ll need to thread currentClass from parseFunctionDef → extractCalls → parseCallExpr → parseAttributeCall.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/python_parser.go` around lines 416 - 423, The code that
builds a FunctionCall for self.method() ignores class context (when object ==
pythonSelfObjectName) by creating a FunctionID without its Type; update the
creation of FunctionID to set Type to the current class name (e.g., Type:
currentClass) so intra-class calls link to declared methods, and thread
currentClass through parseFunctionDef → extractCalls → parseCallExpr →
parseAttributeCall so parseAttributeCall has access to currentClass when
constructing the FunctionCall (FunctionCall, FunctionID, object,
pythonSelfObjectName, method, raw, filePath, line, analysis.PackagePath).
internal/callgraph/builder.go-95-98 (1)

95-98: ⚠️ Potential issue | 🟠 Major

Fail fast on unreadable directories.

Line 95 currently logs and continues; that silently drops whole subtrees from analysis. Return the error so partial graphs are not treated as successful scans.

🔧 Proposed fix
 	entries, readErr := os.ReadDir(dir)
 	if readErr != nil {
-		log.Debug().Err(readErr).Str("dir", dir).Msg("Failed to read directory during call graph traversal")
+		return readErr
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/builder.go` around lines 95 - 98, The code currently
swallows os.ReadDir errors (entries, readErr := os.ReadDir(dir)) by logging and
continuing, which drops subtrees; instead fail fast: inside the enclosing call
graph traversal function that contains the os.ReadDir(dir) call, replace the
log-only branch (log.Debug().Err(readErr)... ) with returning the error (or a
wrapped error with context) so the traversal returns the readErr to callers;
update any callers/signatures as needed to propagate the error and avoid
treating partial scans as successful.
internal/callgraph/go_parser.go-87-90 (1)

87-90: ⚠️ Potential issue | 🟠 Major

Do not silently drop files that fail parsing.

Line 87 swallows parse errors with no signal, which can make call-graph gaps hard to detect and debug.

🔧 Proposed fix
 		analysis, err := p.ParseFile(fullPath, packagePath)
 		if err != nil {
-			// Log and skip files that can't be parsed
-			continue
+			return nil, fmt.Errorf("parsing Go file %s: %w", fullPath, err)
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/go_parser.go` around lines 87 - 90, The parsing loop in
go_parser.go currently swallows parse errors (the if err != nil { ... continue }
block); update it to surface or log the error instead of silently skipping:
capture the error from the parse call (err), and either log it with context
(e.g. file path and error) using the package logger or return/accumulate the
error from the surrounding parse function (e.g. ParseDirectory/ParseFiles) so
callers can detect failures; ensure you reference the existing err variable and
the parse call in the loop (the block that currently does "if err != nil {
continue }") when adding the logging/propagation.
internal/dependency/maven_resolver.go-117-125 (1)

117-125: ⚠️ Potential issue | 🟠 Major

Root module format should match dependency module keys.

Line 125 returns only groupId, while dependency nodes are groupId:artifactId; this makes root/module correlation inconsistent.

🔧 Proposed fix
 	groupID := pom.GroupID
 	if groupID == "" {
 		groupID = pom.Parent.GroupID
 	}
-	if groupID == "" {
+	if groupID == "" || pom.ArtifactID == "" {
 		return "", fmt.Errorf("cannot determine groupId from pom.xml")
 	}
 
-	return groupID, nil
+	return groupID + ":" + pom.ArtifactID, nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/maven_resolver.go` around lines 117 - 125, The function
currently returns only the groupId (variable groupID) which breaks correlation
because dependency keys use "groupId:artifactId"; update the return to construct
and return the module key in the same format by grabbing artifactId from
pom.ArtifactID (falling back to pom.Parent.ArtifactID if empty) and returning
fmt.Sprintf("%s:%s", groupID, artifactID) (adjusting error handling if
artifactId is missing). Ensure you update the variable names and the final
return so the root module format matches dependency module keys.
internal/callgraph/java_parser.go-528-537 (1)

528-537: ⚠️ Potential issue | 🟠 Major

The wildcard import loop returns on the first prefix only, causing systematic misattribution with multiple imports.

The for loop at lines 528–537 contains a return statement inside the loop body, so it never checks beyond the first wildcard import. The comment "try each prefix" does not match the actual behavior. In Java files with multiple wildcard imports (e.g., import java.security.* and import javax.crypto.*), any unresolved class will be incorrectly attributed to the first prefix, creating wrong call graph edges.

The proposed fix is sound: restrict the wildcard attribution to the single-import case only, and fall back to same-package attribution when ambiguous.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/java_parser.go` around lines 528 - 537, The loop over
analysis.WildcardImports currently returns a FunctionID on the first prefix,
misattributing classes when multiple wildcard imports exist; change the logic so
you only attribute via a wildcard prefix when there is exactly one wildcard
import: check len(analysis.WildcardImports) == 1 and then return
FunctionID{Package: analysis.WildcardImports[0], Type: simpleClass, Name:
method}; otherwise do not return from the loop (remove the early return) and
allow the existing fallback (same-package attribution) to run. Ensure references
to analysis.WildcardImports, FunctionID, simpleClass, and method are used as
shown so the correct code path is invoked.
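The "single wildcard only, otherwise same-package fallback" rule can be sketched on its own (resolveWildcard and its parameters are hypothetical names, assuming wildcard imports are stored as package prefixes):

```go
package main

import "fmt"

// resolveWildcard attributes an unresolved class to a wildcard
// import only when exactly one wildcard import exists; with several
// (e.g. java.security.* and javax.crypto.*) the owner is ambiguous,
// so we fall back to the file's own package instead of guessing.
func resolveWildcard(wildcardImports []string, filePackage string) string {
	if len(wildcardImports) == 1 {
		return wildcardImports[0]
	}
	return filePackage
}

func main() {
	fmt.Println(resolveWildcard([]string{"javax.crypto"}, "com.example"))
	fmt.Println(resolveWildcard([]string{"java.security", "javax.crypto"}, "com.example"))
}
```

This trades a few missed cross-package edges for never fabricating a wrong one, which matters when the edges feed crypto attribution.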
internal/dependency/pip_resolver.go-195-195 (1)

195-195: ⚠️ Potential issue | 🟠 Major

Ensure consistent Python interpreter across all package resolution methods.

Line 195 (pipList) and line 231 (pipShow) use bare pip command, while line 348 (pythonPackagesDistributions) uses python3 -c. If pip in PATH resolves to a different Python interpreter, package metadata and import mapping will diverge.

🔧 Proposed changes
-cmd := exec.CommandContext(ctx, "pip", "list", "--format=json")
+cmd := exec.CommandContext(ctx, "python3", "-m", "pip", "list", "--format=json")

Apply the same python3 -m pip approach in pipShow to ensure all metadata comes from the same Python interpreter.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/pip_resolver.go` at line 195, The pip invocation is
inconsistent: pipList and pipShow call the bare "pip" while
pythonPackagesDistributions uses "python3 -c", which can mix interpreters;
update pipList and pipShow to invoke pip via the same Python interpreter (use
exec.CommandContext(ctx, "python3", "-m", "pip", ...)) so both functions use the
same interpreter as pythonPackagesDistributions; locate the exec.CommandContext
calls in functions pipList and pipShow and replace the command/name and args
accordingly while preserving existing flags like "--format=json" and "show".
internal/callgraph/java_parser.go-318-336 (1)

318-336: ⚠️ Potential issue | 🟠 Major

Method parameters are not collected in varTypes.

In parseMethodDecl and parseConstructorDecl, the formal_parameters node is skipped during extraction. As a result, when collectVarTypes scans the method body, it only captures field and local variable declarations. Calls on method parameters, such as service.encrypt() where service is a parameter, cannot be type-resolved by resolveCallee since parameter types never enter the varTypes map. This leaves a gap in call-resolution coverage for parameter-based object usage, which is common in Java for dependency injection and crypto operations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/java_parser.go` around lines 318 - 336, collectVarTypes
currently only adds field and local variable declarations, so method/constructor
parameters never populate varTypes; update the logic to also walk
formal_parameters/formal_parameter nodes (from
parseMethodDecl/parseConstructorDecl) and extract each parameter's type and
identifier using the same helper flow (e.g., call extractDeclTypeName or similar
on the parameter node and use javaNodeIdentifier for the name), handling
varargs/annotations/modifiers and skipping empty type names, then add entries
into varTypes so resolveCallee can resolve parameter-based calls like
service.encrypt().
internal/dependency/maven_resolver.go-190-194 (1)

190-194: ⚠️ Potential issue | 🟠 Major

Version parsing fails when Maven coordinates include a classifier.

The function comment on line 173 documents the format as groupId:artifactId:type:version:scope, but mvn dependency:list can output classifiers between type and version: groupId:artifactId:type:classifier:version:scope. Line 193's hardcoded assumption version := parts[3] will extract the classifier (or wrong field) when classifiers are present. Use parts[len(parts)-2] to correctly extract version regardless of classifier presence.

🔧 Proposed fix
 		groupID := parts[0]
 		artifactID := parts[1]
-		// parts[2] is type (jar, pom, etc.)
-		version := parts[3]
+		// support both:
+		// group:artifact:type:version:scope
+		// group:artifact:type:classifier:version:scope
+		version := parts[len(parts)-2]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/maven_resolver.go` around lines 190 - 194, The version
extraction assumes the 4th token (version := parts[3]) but breaks when a
classifier is present; update the code that assigns version (the variable
"version" near where groupID and artifactID are set) to use the penultimate
field (e.g., parts[len(parts)-2]) so version is taken regardless of optional
classifier tokens, and add a brief comment noting that version is taken from
parts[len(parts)-2] to handle both formats from mvn dependency:list.
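The penultimate-field rule handles both coordinate shapes; a minimal sketch (mavenVersion is a hypothetical helper, and the exact trailing fields mvn emits should still be verified against real dependency:list output):

```go
package main

import (
	"fmt"
	"strings"
)

// mavenVersion extracts the version from a `mvn dependency:list`
// coordinate, which is either
//   groupId:artifactId:type:version:scope
// or, with a classifier,
//   groupId:artifactId:type:classifier:version:scope.
// The version is always the penultimate colon-separated field.
func mavenVersion(coord string) (string, bool) {
	parts := strings.Split(coord, ":")
	if len(parts) < 5 {
		return "", false
	}
	return parts[len(parts)-2], true
}

func main() {
	v, _ := mavenVersion("org.bouncycastle:bcprov-jdk18on:jar:1.78:compile")
	fmt.Println(v) // 1.78
	v, _ = mavenVersion("io.netty:netty-transport-native-epoll:jar:linux-x86_64:4.1.107.Final:runtime")
	fmt.Println(v) // 4.1.107.Final
}
```

With the hardcoded parts[3], the second coordinate would have yielded "linux-x86_64" as the version.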
internal/callgraph/types.go-12-20 (1)

12-20: ⚠️ Potential issue | 🟠 Major

Function identifiers are not unique enough for overloaded methods and duplicate declarations.

With only Package, Type, and Name, distinct declarations with identical signatures collapse to the same key and overwrite each other in the graph map (builder.go:90). In Java and other languages supporting method overloading, this causes data loss and incomplete call graphs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/types.go` around lines 12 - 20, The FunctionID struct
(fields Package, Type, Name) is insufficiently unique for overloaded methods;
add a new field (e.g., Params string or Sig string) that captures the function
signature (parameter types and optionally return types) and populate it where
FunctionID instances are constructed, then update all usages that index or key
functions (e.g., the call graph map in builder.go and any helper methods that
build FunctionID) to use the augmented struct (or its canonical key method like
FuncKey()) so each overloaded/duplicate declaration gets a unique key; also
add/update any equality/hash/String() helpers to produce the new unique key.
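A sketch of the augmented identifier (the Sig field, its encoding, and the Key method are all assumptions; the real signature encoding is up to each language parser):

```go
package main

import "fmt"

// FunctionID extended with a Sig field so overloaded declarations
// no longer collapse to one key. Sig here is a canonical
// comma-separated parameter-type list.
type FunctionID struct {
	Package, Type, Name, Sig string
}

// Key returns a graph-map key that is unique per overload.
func (f FunctionID) Key() string {
	return f.Package + "." + f.Type + "." + f.Name + "(" + f.Sig + ")"
}

func main() {
	a := FunctionID{"com.example", "Cipher", "init", "int"}
	b := FunctionID{"com.example", "Cipher", "init", "int,Key"}

	// With Package/Type/Name only these would share one map slot
	// and the second declaration would overwrite the first.
	graph := map[string]FunctionID{a.Key(): a, b.Key(): b}
	fmt.Println(len(graph)) // 2
}
```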
internal/callgraph/go_parser.go-243-246 (1)

243-246: ⚠️ Potential issue | 🟠 Major

Receiver type is overwritten by the regular parameter list.

Tree-sitter-go parses method_declaration with two separate parameter_list nodes: one for the receiver and one for parameters. The current code processes both without distinguishing them, so the second parameter_list overwrites the receiver, resulting in incorrect FunctionID.Type and broken method identity for call resolution.

🔧 Proposed fix
 		case "parameter_list":
-			// This is the receiver parameter list
-			receiver = p.extractReceiverType(child, src)
+			// method_declaration has receiver + parameters; keep the first receiver only
+			if receiver == "" {
+				receiver = p.extractReceiverType(child, src)
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/go_parser.go` around lines 243 - 246, The parameter_list
case in go_parser.go currently calls p.extractReceiverType for every
parameter_list and thus overwrites the receiver when the regular parameter list
is processed; change the code so you only extract the receiver once: in the
"parameter_list" case, check whether the local receiver variable is empty/nil
(or whether a receiver has already been set) before calling
p.extractReceiverType, and skip extraction if a receiver is already present;
reference the receiver local variable, the switch case "parameter_list" and the
p.extractReceiverType function to locate and update the logic so the second
parameter_list does not overwrite the receiver.
internal/callgraph/rust_parser.go-73-76 (1)

73-76: ⚠️ Potential issue | 🟠 Major

Return parse failures instead of silently dropping files.

Continuing on parse errors makes the call graph incomplete without any signal to callers.

🔧 Suggested fix
 func (p *RustParser) ParseDirectory(dir, packagePath string) ([]*FileAnalysis, error) {
 	entries, err := os.ReadDir(dir)
 	if err != nil {
 		return nil, fmt.Errorf("reading directory %s: %w", dir, err)
 	}
 
 	analyses := make([]*FileAnalysis, 0, len(entries))
+	var firstErr error
 	for _, entry := range entries {
 		if entry.IsDir() {
 			continue
 		}
@@
 		fullPath := filepath.Join(dir, name)
 		analysis, err := p.parseFile(fullPath, packagePath)
 		if err != nil {
+			if firstErr == nil {
+				firstErr = err
+			}
 			continue
 		}
 		analyses = append(analyses, analysis)
 	}
 
-	return analyses, nil
+	return analyses, firstErr
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/rust_parser.go` around lines 73 - 76, The loop currently
swallows parse errors by continuing on error from p.parseFile(fullPath,
packagePath); change this to return the error to callers (e.g., return nil,
fmt.Errorf(...) or wrap the error) so parse failures surface instead of silently
dropping files; update the surrounding function signature and any callers of the
function containing the loop to propagate or handle the returned error
(reference the p.parseFile call and the enclosing function that iterates files)
so the call graph generation fails fast on parse errors rather than producing
incomplete results.
internal/engine/dependency_scanner.go-548-563 (1)

548-563: ⚠️ Potential issue | 🟠 Major

Normalize dependency paths before user-target paths when dep != nil.

Current order can classify dependency files as user files when dependency directories are nested under the scan target, producing misleading call-chain attribution.

🔧 Suggested fix
-		// Try user target first
-		if rel, err := filepath.Rel(absUserTarget, absPath); err == nil && !strings.HasPrefix(rel, "..") {
-			result[i].FilePath = rel
-			continue
-		}
-
-		// Try dependency dir
+		// Try dependency dir first when dep context is available
 		if dep != nil {
 			absDepDir := dep.Dir
 			if absDepDirResolved, err := filepath.Abs(dep.Dir); err == nil {
 				absDepDir = absDepDirResolved
 			}
 
 			if rel, err := filepath.Rel(absDepDir, absPath); err == nil && !strings.HasPrefix(rel, "..") {
 				result[i].FilePath = dep.Module + "@" + dep.Version + "/" + rel
 				continue
 			}
 		}
+
+		// Then try user target
+		if rel, err := filepath.Rel(absUserTarget, absPath); err == nil && !strings.HasPrefix(rel, "..") {
+			result[i].FilePath = rel
+			continue
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/engine/dependency_scanner.go` around lines 548 - 563, The code
currently checks the user-target path (absUserTarget) before the dependency
directory, causing files under nested dependency dirs to be misattributed;
update the logic so that when dep != nil you normalize/resolve absDepDir and
attempt the filepath.Rel check against absDepDir first (setting
result[i].FilePath = dep.Module+"@"+dep.Version+"/"+rel on success) and only if
that fails fall back to checking against absUserTarget (setting
result[i].FilePath = rel); use the existing variables absDepDir,
absDepDirResolved, absUserTarget, absPath, dep.Module, dep.Version and preserve
the filepath.Abs and filepath.Rel error handling and the same continue
semantics.
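Why ordering matters can be shown with filepath.Rel alone; in this sketch (classify and the paths are hypothetical), a dependency dir nested under the scan target is "inside" both roots, so whichever root is checked first claims the file:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// classify mirrors the path-normalization decision: the dependency
// directory is tried before the user target, so vendored dependency
// files nested under the target are not mislabeled as user code.
func classify(path, depDir, userTarget string) string {
	if rel, err := filepath.Rel(depDir, path); err == nil && !strings.HasPrefix(rel, "..") {
		return "dep:" + rel
	}
	if rel, err := filepath.Rel(userTarget, path); err == nil && !strings.HasPrefix(rel, "..") {
		return "user:" + rel
	}
	return path
}

func main() {
	// The vendored file matches both roots; dep-first ordering wins.
	fmt.Println(classify("/repo/vendor/lib/aes.go", "/repo/vendor/lib", "/repo"))
	fmt.Println(classify("/repo/main.go", "/repo/vendor/lib", "/repo"))
}
```

Reversing the two checks would label /repo/vendor/lib/aes.go as user code, which is exactly the misattribution the finding describes.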
internal/callgraph/rust_parser.go-397-403 (1)

397-403: ⚠️ Potential issue | 🟠 Major

Fix local Type::method() calls to use current package + type instead of treating prefix as package path.

The fallback at lines 397–403 incorrectly sets Package=prefix and leaves Type empty. When resolving local associated method calls like SomeType::method(), declarations are keyed with Package=current_package and Type=type_name, so the fallback creates mismatched FunctionIDs that fail to match their declarations.

The fix should distinguish between unqualified type names (no ::) and fully qualified module paths. Unqualified names like SomeType::method are local type methods and should use Package=analysis.PackagePath and Type=prefix, matching how declarations are stored (and matching the pattern used in parseFieldCall at line 449).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/rust_parser.go` around lines 397 - 403, The fallback in
the FunctionCall construction incorrectly treats prefix as a package path;
change it to detect whether prefix is an unqualified type name (no "::") and,
for local associated method calls, set FunctionID.Package = analysis.PackagePath
and FunctionID.Type = prefix (matching how declarations are keyed and how
parseFieldCall handles it), otherwise keep the fully-qualified behavior
(Package=prefix, Type=""). Update the code that returns &FunctionCall{Callee:
FunctionID{...}} to branch on whether prefix contains module qualifiers and
assign Package and Type accordingly so SomeType::method() resolves to current
package + type name.
internal/engine/dependency_scanner.go-165-168 (1)

165-168: ⚠️ Potential issue | 🟠 Major

Deep-copy reports from cache before mutating.

attributeFindings() directly mutates the report object—rewriting file paths with target-specific prefixes (line 444) and populating call-chain metadata (line 434). When FindingsCache.Get() returns a cached report pointer and subsequent scans with the same rulesHash reuse it, these target-specific mutations persist across scans, causing call chains and file paths from one target to leak into reports for different targets.

Suggested fix
 	for key, report := range depReports {
 		dep := depMap[key]
+		reportCopy := cloneInterimReport(report) // deep copy
+		ds.attributeFindings(reportCopy, dep, opts.ScanOptions.Target, tracer, userPackages)
-		ds.attributeFindings(report, dep, opts.ScanOptions.Target, tracer, userPackages)
 	}

Add a deep-copy helper for entities.InterimReport.

Affects lines 165–168 and 417–445.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/engine/dependency_scanner.go` around lines 165 - 168, The depReports
pulled from FindingsCache.Get are mutated by attributeFindings (which rewrites
paths and call-chain metadata), so before iterating over depReports in the loop
that uses depMap and calls attributeFindings you must deep-copy each
entities.InterimReport to avoid leaking target-specific mutations across scans;
add a deep-copy helper for entities.InterimReport (used right after retrieving
depReports from the cache) and pass the cloned report into attributeFindings
instead of the cached pointer, ensuring all mutations happen on the clone rather
than the cached object.
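The aliasing hazard behind this finding is generic; a minimal stand-in (report and clone are hypothetical, far simpler than entities.InterimReport) shows why mutating a cache-returned pointer leaks into later cache hits while a deep copy does not:

```go
package main

import "fmt"

// report stands in for a cached interim report; its slice is the
// kind of nested state a shallow copy would still share.
type report struct{ paths []string }

// clone deep-copies the report so mutations stay local.
func clone(r *report) *report {
	c := &report{paths: make([]string, len(r.paths))}
	copy(c.paths, r.paths)
	return c
}

func main() {
	cached := &report{paths: []string{"src/crypto.go"}}

	// Attribution-style mutation on a clone: the cached original,
	// which later scans will receive again, is untouched.
	cp := clone(cached)
	cp.paths[0] = "dep@1.0/" + cp.paths[0]

	fmt.Println(cached.paths[0]) // src/crypto.go
	fmt.Println(cp.paths[0])     // dep@1.0/src/crypto.go
}
```

Mutating `cached.paths` directly instead would have rewritten the entry for every future cache hit with the same rulesHash.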
internal/callgraph/rust_parser.go-126-153 (1)

126-153: ⚠️ Potential issue | 🟠 Major

Prefix propagation breaks wildcard and nested use imports in common Rust patterns.

processUseDecl is always called with an empty prefix (line 120), making the wildcard check ineffective. Additionally, when encountering scoped_use_list or nested lists, the base path is not carried forward—processScopedUseList receives no path parameter and loses context on recursive calls. This causes imports like use ring::*;, use foo::{aead::{Aead}};, and use foo::{Bar, *}; to be missed.

The suggested fix correctly addresses this by:

  1. Adding basePath parameter to processScopedUseList
  2. Propagating it through nested scoped_use_list calls
  3. Handling wildcards inside use_list blocks
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/rust_parser.go` around lines 126 - 153, processUseDecl
fails to propagate the base module path into nested scoped use lists and
wildcards; update processScopedUseList to accept a basePath (or prefix)
parameter, change calls from processUseDecl to pass the current prefix/base
(including when parsing javaNodeScopedIdentifier-derived pkg), and ensure
processScopedUseList recursively propagates that basePath into nested
"scoped_use_list" and "use_list" entries so identifiers inside blocks get
recorded as pkg+name and wildcard entries (analysis.WildcardImports) use the
combined base path; also update all sites that call processScopedUseList to the
new signature and handle wildcards found inside use_list/ scoped_use_list rather
than relying on the outer prefix only.
internal/callgraph/rust_parser.go-249-255 (1)

249-255: ⚠️ Potential issue | 🟠 Major

Type extraction in trait impl blocks uses wrong type—methods attributed to trait instead of implementor.

In impl Trait for Type { ... }, iterating children and selecting the first type-like identifier picks the trait instead of the concrete implementing type. Methods should be associated with the concrete implementor.

Use node.ChildByFieldName("type") to extract the implementing type directly; tree-sitter-rust defines impl_item with explicit type and trait fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/rust_parser.go` around lines 249 - 255, The current loop
over node.Child(i) that sets typeName via p.extractTypeName can pick the trait
name in `impl Trait for Type` blocks; change the logic in the impl-item handling
so you first try to get the implementing type directly using
node.ChildByFieldName("type") and pass that child (if non-nil) to
p.extractTypeName to set typeName; only fall back to the existing
child-iteration/p.extractTypeName behavior if the "type" field is absent. Ensure
you update references around the loop that rely on typeName (e.g., where
p.extractTypeName is called) so methods are attributed to the implementor rather
than the trait.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e63f49 and 8f2b5cd.

⛔ Files ignored due to path filters (4)
  • go.sum is excluded by !**/*.sum
  • testdata/projects/cryptowrapper_dep/go.sum is excluded by !**/*.sum
  • testdata/projects/go_with_crypto_dep/go.sum is excluded by !**/*.sum
  • testdata/projects/go_with_dep_chain/go.sum is excluded by !**/*.sum
📒 Files selected for processing (61)
  • .goreleaser.yml
  • CHANGELOG.md
  • Dockerfile
  • Dockerfile.deps
  • Dockerfile.goreleaser.deps
  • Dockerfile.slim
  • Makefile
  • README.md
  • docs/DEPENDENCY_SCANNING.md
  • docs/DOCKER_USAGE.md
  • docs/OUTPUT_FORMATS.md
  • go.mod
  • internal/callgraph/builder.go
  • internal/callgraph/format.go
  • internal/callgraph/format_test.go
  • internal/callgraph/go_parser.go
  • internal/callgraph/java_parser.go
  • internal/callgraph/parser_registry.go
  • internal/callgraph/python_parser.go
  • internal/callgraph/python_parser_test.go
  • internal/callgraph/rust_parser.go
  • internal/callgraph/rust_parser_test.go
  • internal/callgraph/tracer.go
  • internal/callgraph/types.go
  • internal/cli/scan.go
  • internal/cli/scan_test.go
  • internal/deduplicator/deduplicator.go
  • internal/dependency/cargo_resolver.go
  • internal/dependency/cargo_resolver_test.go
  • internal/dependency/go_resolver.go
  • internal/dependency/maven_resolver.go
  • internal/dependency/pip_resolver.go
  • internal/dependency/pip_resolver_test.go
  • internal/dependency/registry.go
  • internal/dependency/resolver.go
  • internal/dependency/source_cache.go
  • internal/engine/dependency_scanner.go
  • internal/engine/findings_cache.go
  • internal/engine/findings_cache_test.go
  • internal/engine/orchestrator.go
  • internal/engine/rule_filter.go
  • internal/entities/interim.go
  • internal/scan/doc.go
  • internal/scan/duration.go
  • internal/scan/ecosystem.go
  • internal/scan/export.go
  • internal/scan/report.go
  • internal/scan/summary.go
  • internal/scan/validation.go
  • schemas/interim-report-schema.json
  • testdata/projects/cryptowrapper_dep/go.mod
  • testdata/projects/cryptowrapper_dep/wrapper.go
  • testdata/projects/go_with_crypto_dep/go.mod
  • testdata/projects/go_with_crypto_dep/main.go
  • testdata/projects/go_with_crypto_dep/mypkg/crypto.go
  • testdata/projects/go_with_dep_chain/go.mod
  • testdata/projects/go_with_dep_chain/main.go
  • testdata/projects/go_with_dep_chain/mypkg/crypto.go
  • testdata/projects/java_with_crypto_dep/pom.xml
  • testdata/projects/java_with_crypto_dep/src/main/java/com/testapp/Main.java
  • testdata/projects/java_with_crypto_dep/src/main/java/com/testapp/service/CryptoService.java

@github-actions

github-actions bot commented Mar 3, 2026

🔍 SCANOSS Code Similarity Detected

📄 1 snippet matches found

🔗 View detailed findings on commit 7c9c360

Files with similarities:

  • internal/dependency/registry_test.go

💡 Click the commit link above to see detailed annotations for each match.

@github-actions

github-actions bot commented Mar 3, 2026

SCANOSS SCAN Completed 🚀

  • Detected components: 5
  • Undeclared components: 1
  • Declared components: 4
  • Detected files: 99
  • Detected files undeclared: 1
  • Detected files declared: 98
  • Licenses detected: 5
  • Licenses detected with copyleft: 2
  • Policies: ❌ 1 fail (1 total)

View more details on SCANOSS Action Summary

@github-actions

github-actions bot commented Mar 3, 2026

SCANOSS SCAN Completed 🚀

  • Detected components: 4
  • Undeclared components: 0
  • Declared components: 4
  • Detected files: 60
  • Detected files undeclared: 0
  • Detected files declared: 60
  • Licenses detected: 4
  • Licenses detected with copyleft: 2
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

🧹 Nitpick comments (9)
internal/dependency/registry_test.go (1)

15-16: Align mock ecosystem with the registered key for clearer test intent.

noopResolver.Ecosystem() returns "noop" but the test registers it under "go". This works, but it obscures intent and complicates future consistency checks.

Suggested cleanup
 func (n *noopResolver) Ecosystem() string {
-	return "noop"
+	return "go"
 }

or

-	r.Register("go", resolver)
+	r.Register(resolver.Ecosystem(), resolver)

-	got, err := r.Get("go")
+	got, err := r.Get(resolver.Ecosystem())

Also applies to: 25-27

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/registry_test.go` around lines 15 - 16, The mock
resolver's Ecosystem() value doesn't match the registry key used in tests;
update either noopResolver.Ecosystem() (the method on type noopResolver) to
return "go" to match the test registration key, or change the test registration
key to "noop" so the registered key and the resolver's Ecosystem() are
consistent; pick one approach and apply the same change for the other mock
declarations mentioned (lines around the second occurrence) to keep test intent
and assertions aligned.
internal/cli/root.go (1)

83-91: Remove redundant pterm.DisableColor() call.

The call at line 85 is redundant because line 90 already handles the !ok case within its condition !ok || !term.IsTerminal(stderrFD). When !ok is true, the code reaches line 90 and calls pterm.DisableColor() anyway.

♻️ Proposed simplification
 func Execute() {
 	stderrFD, ok := utils.FDToInt(os.Stderr.Fd())
-	if !ok {
-		pterm.DisableColor()
-	}

 	// Disable pterm colors if not running in a TTY (e.g., piped output or non-interactive terminal)
 	if !ok || !term.IsTerminal(stderrFD) {
 		pterm.DisableColor()
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/cli/root.go` around lines 83 - 91, Remove the redundant
pterm.DisableColor() call by deleting the early conditional that checks ok
alone; keep the single terminal-check block that uses stderrFD, ok :=
utils.FDToInt(os.Stderr.Fd()) and the later if !ok || !term.IsTerminal(stderrFD)
{ pterm.DisableColor() } so color disabling only happens once—update the code to
call pterm.DisableColor() only inside the combined condition that references
utils.FDToInt, stderrFD, ok and term.IsTerminal.
internal/utils/fd_test.go (1)

8-17: Consider adding a boundary test for math.MaxInt.

The current tests cover a normal value and an overflow case. Adding a test for the boundary value math.MaxInt itself would strengthen coverage, ensuring the edge case where fd == math.MaxInt returns (math.MaxInt, true).

💡 Suggested boundary test
 func TestFDToInt(t *testing.T) {
 	if got, ok := FDToInt(uintptr(42)); !ok || got != 42 {
 		t.Fatalf("FDToInt(42) = (%d, %v), want (42, true)", got, ok)
 	}
+
+	// Boundary: MaxInt should succeed
+	if got, ok := FDToInt(uintptr(math.MaxInt)); !ok || got != math.MaxInt {
+		t.Fatalf("FDToInt(MaxInt) = (%d, %v), want (MaxInt, true)", got, ok)
+	}

 	overflow := uintptr(math.MaxInt) + 1
 	if got, ok := FDToInt(overflow); ok || got != 0 {
 		t.Fatalf("FDToInt(overflow) = (%d, %v), want (0, false)", got, ok)
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/utils/fd_test.go` around lines 8 - 17, Add a boundary test in
TestFDToInt to assert that passing uintptr(math.MaxInt) to FDToInt returns
(math.MaxInt, true); specifically, update the test in internal/utils/fd_test.go
to call FDToInt(uintptr(math.MaxInt)) and fail the test if the returned value !=
math.MaxInt or ok != true, mirroring the existing assertions for the other cases
and using the same t.Fatalf style so the behavior of FDToInt at the math.MaxInt
boundary is verified.
internal/scanner/interface_spinner_test.go (1)

16-29: Global state mutation may cause flaky tests under parallel execution.

Replacing flag.CommandLine modifies global state that other tests (in this or other packages) may depend on. While the defer restores it, if tests run in parallel (-parallel flag or t.Parallel()), there's a race window where concurrent tests could see inconsistent state.

Consider documenting this test should not run in parallel, or accepting the current approach given Go's default sequential test execution within a package.

💡 Optional: Add comment to clarify sequential execution requirement
 func TestShouldUseSpinner_WithoutTestFlagStillSafe(t *testing.T) {
+	// Note: This test modifies global flag.CommandLine state and must not run
+	// in parallel with other tests that depend on flag package state.
 	orig := flag.CommandLine
 	fs := flag.NewFlagSet("unit", flag.ContinueOnError)
 	flag.CommandLine = fs
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/scanner/interface_spinner_test.go` around lines 16 - 29, The test
mutates global flag.CommandLine which can race with other tests; protect that
mutation by introducing a package-level sync.Mutex (e.g., flagMu) and acquire
flagMu.Lock() at the start of TestShouldUseSpinner_WithoutTestFlagStillSafe
before replacing flag.CommandLine, deferring flagMu.Unlock() after restoration;
also add a short comment in the test explaining the mutex prevents races when
tests run in parallel and keep references to flag.CommandLine and
scanner.ShouldUseSpinner in the comment to make the intent clear.
internal/dependency/source_cache_test.go (1)

126-135: Consider adding edge case tests.

The current tests cover basic matching but could be more robust with edge cases:

  • Empty extension slice (should return false or true depending on intended behavior)
  • Files without extensions
  • Case sensitivity (.JAVA vs .java)
💡 Optional additional test cases
// Edge cases to consider:
if hasMatchingExtension("Makefile", []string{".java"}) {
    t.Fatal("file without extension should not match")
}
if hasMatchingExtension("Main.java", nil) {
    t.Fatal("nil extensions should not match")
}
if hasMatchingExtension("Main.java", []string{}) {
    t.Fatal("empty extensions should not match")
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/source_cache_test.go` around lines 126 - 135, Test
coverage for hasMatchingExtension is missing edge cases; add unit cases in
TestHasMatchingExtension that verify behavior when the extensions slice is nil
and when it is empty (expect false), when the filename has no extension like
"Makefile" (expect false), and ensure case-sensitivity is validated by checking
"Main.JAVA" vs ".java" (assert expected true only if function should be
case-insensitive or false if case-sensitive). Update assertions around
hasMatchingExtension to cover these scenarios and document expected behavior in
the test names to make intent explicit.
internal/entities/interim_additional_test.go (1)

171-177: Consider asserting full asset order, not just the first element.

Checking only index 0 can miss partial ordering regressions. A full-sequence assertion would make this test stricter.

Suggested assertion tightening
 report.SortAssets()
-if report.Findings[0].CryptographicAssets[0].StartLine != 1 {
+if report.Findings[0].CryptographicAssets[0].StartLine != 1 ||
+	report.Findings[0].CryptographicAssets[1].StartLine != 30 {
 	t.Fatalf("assets for first finding were not sorted: %#v", report.Findings[0].CryptographicAssets)
 }
-if report.Findings[1].CryptographicAssets[0].StartLine != 5 {
+if report.Findings[1].CryptographicAssets[0].StartLine != 5 ||
+	report.Findings[1].CryptographicAssets[1].StartLine != 20 {
 	t.Fatalf("assets for second finding were not sorted: %#v", report.Findings[1].CryptographicAssets)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/entities/interim_additional_test.go` around lines 171 - 177, The
test currently only checks the first asset's StartLine after calling
report.SortAssets(), which can miss partial ordering regressions; update the
assertions in internal/entities/interim_additional_test.go to validate the full
ordering of each finding's CryptographicAssets (e.g., confirm the slice of
StartLine values for report.Findings[0].CryptographicAssets and
report.Findings[1].CryptographicAssets is strictly ascending or equals the
expected sequence) rather than only checking index 0, using the existing
report.SortAssets() call and the CryptographicAssets.StartLine field to build
the comparisons.
internal/dependency/cargo_resolver_integration_test.go (1)

137-142: Consider escaping the embedded path in the shell script.

The argsLog path is embedded directly into the shell script string. While t.TempDir() typically returns safe paths, special characters in the path (spaces, quotes) could break the script.

🔧 Safer approach using environment variable
 	argsLog := filepath.Join(binDir, "args.log")
-	writeExecutable(t, binDir, "cargo", `#!/bin/sh
-echo "$@" > "`+argsLog+`"
+	t.Setenv("TEST_ARGS_LOG", argsLog)
+	writeExecutable(t, binDir, "cargo", `#!/bin/sh
+echo "$@" > "$TEST_ARGS_LOG"
 cat <<'JSON'
 {"packages":[],"resolve":{"nodes":[]},"workspace_root":"/tmp"}
 JSON
 `)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/cargo_resolver_integration_test.go` around lines 137 -
142, The embedded argsLog path in the shell script passed to writeExecutable
(see writeExecutable(t, binDir, "cargo", ...)) is inserted raw and can break if
the temp path contains spaces or quotes; update the script to avoid direct
concatenation by either exporting the path as an environment variable (e.g., set
ARGS_LOG before invoking the script and reference it as "$ARGS_LOG" inside the
script) or ensure it is safely quoted/escaped (e.g., use printf '%s' "$ARGS_LOG"
or wrap the value in single quotes and escape internal quotes) so the script
always receives a correct, safely-quoted path when writing argsLog.
internal/dependency/go_resolver_test.go (1)

11-23: Consider shell shim portability only if Windows support is planned.

The tests use POSIX shell shims (#!/bin/sh), which are not portable to Windows. However, the CI matrix currently runs only on ubuntu-latest, so this is not an issue for the project's current scope. If Windows testing is added to the matrix in the future, revisit this approach—either skip these tests on Windows or use a Go-native helper-process pattern. For now, this is optional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/go_resolver_test.go` around lines 11 - 23, Tests create
POSIX shell shims via writeExecutable and assume POSIX PATH handling in
prependPath, which isn't portable to Windows; if Windows CI will be added,
update tests to either skip on Windows (use runtime.GOOS check and t.Skip in
tests that call writeExecutable/prependPath) or replace the shell-based helpers
with a Go helper-process pattern (spawn a Go test helper binary or use
exec.Command with a small Go main) so behavior is consistent across OSes;
otherwise leave as-is for now.
internal/scan/scan_test.go (1)

136-159: Harden dot/text export assertions.

dot and text subtests currently only assert non-empty output. Adding a minimal format signature check would catch formatter routing regressions earlier.

Suggested diff
@@
 			if len(data) == 0 {
 				t.Fatal("export produced empty file")
 			}
 
 			if format == "json" {
 				var payload map[string]any
 				if err := json.Unmarshal(data, &payload); err != nil {
 					t.Fatalf("invalid json output: %v", err)
 				}
+			} else if format == "dot" {
+				if !strings.Contains(string(data), "digraph") {
+					t.Fatalf("dot output missing graph signature: %q", string(data))
+				}
+			} else if format == "text" {
+				if !strings.Contains(string(data), "app.main") {
+					t.Fatalf("text output missing expected function name: %q", string(data))
+				}
 			}
 		})
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/scan/scan_test.go` around lines 136 - 159, The dot/text subtests
only check for non-empty output; update the ExportCallGraph tests to assert
minimal format signatures to detect routing regressions: after reading data in
the subtest for format "dot" ensure the output contains the Graphviz header
(e.g., "digraph" or "strict digraph") and for format "text" assert the output
contains call-edge indicators (e.g., "->" or a known function name from the
generated result). Locate the test loop that calls ExportCallGraph and adds
these additional checks inside the t.Run closure that currently handles formats
"json","dot","text".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Dockerfile.deps`:
- Around line 60-109: The image currently runs as root because there is no USER
directive; create a non-root runtime user (e.g., "scanner" or "crypto") and
switch to it in the final stage: add user creation (useradd or addgroup/adduser)
and set ownership of required runtime directories (WORKDIR /workspace,
/opt/target-venv, any config/cache dirs used by ENTRYPOINT "crypto-finder" and
tools like /root/.cargo or /root/.opengrep) and then add a USER <username> line
before ENTRYPOINT so the container drops privileges at runtime while keeping
files writable by that user.
- Around line 55-77: The Dockerfile uses unpinned remote installers: the RUN
curl ... opengrep/main/install.sh and the rustup invocation (curl ...
sh.rustup.rs) which risks supply-chain changes; fix by replacing the opengrep
URL to a specific release tag (e.g., v1.12.1/install.sh), download the script
first to a file, verify its checksum (sha256) against a checked-in or published
hash before executing, and then run it; likewise download the rustup-init script
to a file, verify its checksum, and invoke it with an explicit toolchain version
via the --default-toolchain flag (e.g., --default-toolchain 1.75.0) instead of
"stable" to pin the Rust version. Ensure these changes touch the two RUN curl
lines (the opengrep install line and the rustup.rs install line) and add
checksum verification steps prior to executing the installers.

In `@internal/callgraph/format.go`:
- Around line 44-45: The exported formatter functions FormatJSON, FormatDOT, and
FormatText currently dereference the graph parameter immediately and will panic
on a nil input; add nil guards at the start of each function (FormatJSON,
FormatDOT, FormatText) to check if graph == nil (or graph.Functions/graph.Edges
as appropriate) and return a clear error (e.g., fmt.Errorf("nil call graph"))
instead of proceeding, ensuring callers receive a proper error rather than a
panic; update any early uses (like creating maps/slices from graph.Functions) to
occur after the nil check.

In `@internal/dependency/maven_resolver.go`:
- Around line 77-85: The loop currently only appends a dependency to
ResolveResult.Dependencies when resolveSourceDir returned a non-empty dep.Dir,
dropping resolved deps without sources; change the logic in the for loop that
iterates over deps so that each dep is always appended to result.Dependencies
regardless of dep.Dir (e.g., append before the check or remove the conditional),
while keeping the existing log.Debug() and continue behavior for source-less
deps; update any code/comments referencing resolveSourceDir, dep.Dir, deps, and
ResolveResult.Dependencies to reflect that dependencies are included even when
Dir == "".
- Around line 187-196: The parser in parseDependencyList incorrectly assumes
version is at parts[3]; modify parseDependencyList to calculate version as
parts[len(parts)-2] (since scope is always last) so it works for both 5-part and
6-part coordinates that include a classifier, and update the parseDependencyList
comment to state that classifier tokens may be present in Maven coordinates;
refer to variables parts, groupID, artifactID and version in your change.

In `@internal/dependency/source_cache_test.go`:
- Around line 59-60: The test sets HOME via t.Setenv("HOME", home) which breaks
on Windows; update the tests (e.g., TestNewSourceCacheAndCachedDir and
TestSourceCache_ExtractZip_InvalidArchive) to also set USERPROFILE when
runtime.GOOS == "windows" so both HOME and USERPROFILE point to the temp dir;
modify the t.Setenv calls around the HOME assignment (and the same pattern in
TestSourceCache_ExtractZip_InvalidArchive) to conditionally set USERPROFILE to
the same value to ensure Windows-compatible environment setup.
- Around line 36-37: The test sets only HOME but os.UserHomeDir() on Windows
ignores HOME, causing CI failures; update the test to also set Windows home env
vars when stubbing the temp home by setting USERPROFILE (and optionally
HOMEDRIVE/HOMEPATH) to the same t.TempDir() value so os.UserHomeDir() picks up
the fake home, i.e., extend the existing t.Setenv("HOME", home) usage to also
set USERPROFILE (and HOMEDRIVE/HOMEPATH) or add a platform-conditional to set
those vars when running on Windows where os.UserHomeDir() is used.
- Around line 88-91: The current test checks for a zip-slip escape to
cache.baseDir/evil/attack.java but the malicious entry "../evil/attack.java"
extracted from cache.baseDir/org.example:lib/1.2.3/ would resolve to
cache.baseDir/org.example:lib/evil/attack.java (one level up), so update the
assertion to verify that filepath.Join(cache.baseDir, "org.example:lib", "evil",
"attack.java") does not exist (and optionally also verify the original intended
path under org.example:lib:1.2.3/attack.java was not created); locate the
existing zipSlip usage and replace the expected path accordingly to ensure the
test fails if zip-slip protection allows a one-level escape.

In `@internal/engine/dependency_scanner_test.go`:
- Around line 361-392: The test mutates scanCalls concurrently in
mockScanner.scanFunc causing a data race; update the test to use a
concurrency-safe counter (e.g., atomic.Int32/atomic.AddInt32 or a
sync.Mutex-wrapped counter) when incrementing scanCalls inside the
mockScanner.scanFunc, so that scanDependenciesParallel's parallel workers safely
increment the counter and the final assertion on scanCalls is deterministic;
locate the counter and the mockScanner.scanFunc in the test and replace the
plain scanCalls++ with an atomic or mutex-protected increment while keeping the
rest of the test (DependencyScanner, scanDependenciesParallel, mockScanner)
unchanged.

---

Nitpick comments:
In `@internal/cli/root.go`:
- Around line 83-91: Remove the redundant pterm.DisableColor() call by deleting
the early conditional that checks ok alone; keep the single terminal-check block
that uses stderrFD, ok := utils.FDToInt(os.Stderr.Fd()) and the later if !ok ||
!term.IsTerminal(stderrFD) { pterm.DisableColor() } so color disabling only
happens once—update the code to call pterm.DisableColor() only inside the
combined condition that references utils.FDToInt, stderrFD, ok and
term.IsTerminal.

In `@internal/dependency/cargo_resolver_integration_test.go`:
- Around line 137-142: The embedded argsLog path in the shell script passed to
writeExecutable (see writeExecutable(t, binDir, "cargo", ...)) is inserted raw
and can break if the temp path contains spaces or quotes; update the script to
avoid direct concatenation by either exporting the path as an environment
variable (e.g., set ARGS_LOG before invoking the script and reference it as
"$ARGS_LOG" inside the script) or ensure it is safely quoted/escaped (e.g., use
printf '%s' "$ARGS_LOG" or wrap the value in single quotes and escape internal
quotes) so the script always receives a correct, safely-quoted path when writing
argsLog.

In `@internal/dependency/go_resolver_test.go`:
- Around line 11-23: Tests create POSIX shell shims via writeExecutable and
assume POSIX PATH handling in prependPath, which isn't portable to Windows; if
Windows CI will be added, update tests to either skip on Windows (use
runtime.GOOS check and t.Skip in tests that call writeExecutable/prependPath) or
replace the shell-based helpers with a Go helper-process pattern (spawn a Go
test helper binary or use exec.Command with a small Go main) so behavior is
consistent across OSes; otherwise leave as-is for now.

In `@internal/dependency/registry_test.go`:
- Around line 15-16: The mock resolver's Ecosystem() value doesn't match the
registry key used in tests; update either noopResolver.Ecosystem() (the method
on type noopResolver) to return "go" to match the test registration key, or
change the test registration key to "noop" so the registered key and the
resolver's Ecosystem() are consistent; pick one approach and apply the same
change for the other mock declarations mentioned (lines around the second
occurrence) to keep test intent and assertions aligned.

In `@internal/dependency/source_cache_test.go`:
- Around line 126-135: Test coverage for hasMatchingExtension is missing edge
cases; add unit cases in TestHasMatchingExtension that verify behavior when the
extensions slice is nil and when it is empty (expect false), when the filename
has no extension like "Makefile" (expect false), and ensure case-sensitivity is
validated by checking "Main.JAVA" vs ".java" (assert expected true only if
function should be case-insensitive or false if case-sensitive). Update
assertions around hasMatchingExtension to cover these scenarios and document
expected behavior in the test names to make intent explicit.

In `@internal/entities/interim_additional_test.go`:
- Around line 171-177: The test currently only checks the first asset's
StartLine after calling report.SortAssets(), which can miss partial ordering
regressions; update the assertions in
internal/entities/interim_additional_test.go to validate the full ordering of
each finding's CryptographicAssets (e.g., confirm the slice of StartLine values
for report.Findings[0].CryptographicAssets and
report.Findings[1].CryptographicAssets is strictly ascending or equals the
expected sequence) rather than only checking index 0, using the existing
report.SortAssets() call and the CryptographicAssets.StartLine field to build
the comparisons.

In `@internal/scan/scan_test.go`:
- Around line 136-159: The dot/text subtests only check for non-empty output;
update the ExportCallGraph tests to assert minimal format signatures to detect
routing regressions: after reading data in the subtest for format "dot" ensure
the output contains the Graphviz header (e.g., "digraph" or "strict digraph")
and for format "text" assert the output contains call-edge indicators (e.g.,
"->" or a known function name from the generated result). Locate the test loop
that calls ExportCallGraph and add these additional checks inside the t.Run
closure that currently handles formats "json","dot","text".

In `@internal/scanner/interface_spinner_test.go`:
- Around line 16-29: The test mutates global flag.CommandLine which can race
with other tests; protect that mutation by introducing a package-level
sync.Mutex (e.g., flagMu) and acquire flagMu.Lock() at the start of
TestShouldUseSpinner_WithoutTestFlagStillSafe before replacing flag.CommandLine,
deferring flagMu.Unlock() after restoration; also add a short comment in the
test explaining the mutex prevents races when tests run in parallel and keep
references to flag.CommandLine and scanner.ShouldUseSpinner in the comment to
make the intent clear.

In `@internal/utils/fd_test.go`:
- Around line 8-17: Add a boundary test in TestFDToInt to assert that passing
uintptr(math.MaxInt) to FDToInt returns (math.MaxInt, true); specifically,
update the test in internal/utils/fd_test.go to call
FDToInt(uintptr(math.MaxInt)) and fail the test if the returned value !=
math.MaxInt or ok != true, mirroring the existing assertions for the other cases
and using the same t.Fatalf style so the behavior of FDToInt at the math.MaxInt
boundary is verified.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ecb79b and 038f391.

⛔ Files ignored due to path filters (1)
  • coverage_targeted.out is excluded by !**/*.out
📒 Files selected for processing (29)
  • Dockerfile
  • Dockerfile.deps
  • Dockerfile.goreleaser.deps
  • Dockerfile.slim
  • internal/api/client.go
  • internal/callgraph/core_test.go
  • internal/callgraph/format.go
  • internal/callgraph/go_parser_test.go
  • internal/callgraph/java_parser_test.go
  • internal/cli/root.go
  • internal/dependency/cargo_resolver_integration_test.go
  • internal/dependency/go_resolver_test.go
  • internal/dependency/maven_resolver.go
  • internal/dependency/maven_resolver_test.go
  • internal/dependency/registry_test.go
  • internal/dependency/source_cache_test.go
  • internal/engine/dependency_scanner_test.go
  • internal/engine/rule_filter_test.go
  • internal/entities/interim_additional_test.go
  • internal/errors/formatter.go
  • internal/scan/export.go
  • internal/scan/scan_test.go
  • internal/scanner/interface.go
  • internal/scanner/interface_spinner_test.go
  • internal/scanner/registry.go
  • internal/utils/fd.go
  • internal/utils/fd_test.go
  • internal/version/version.go
  • scanoss.json
✅ Files skipped from review due to trivial changes (2)
  • internal/api/client.go
  • internal/callgraph/core_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • Dockerfile.goreleaser.deps
  • scanoss.json

feat: execute dependency scans in parallel, fix rule filtering

feat: improve dependency scanning output, support multiple call chains for the same finding

chore: add testdata for dependency scanning

feat: add java parser

feat: add --export-callgraph flag for scanning dependencies

fix: lint errors

feat(deps-scan): add multi-ecosystem dependency scanning, callgraph support, and deps Docker images

- add dependency scanning for Python and Rust, plus Cargo resolver support
  - new pip-based resolver and tests
  - new cargo metadata resolver and tests
  - extend resolver model with workspace members for monorepo/workspace projects

- expand callgraph capabilities
  - add Python and Rust tree-sitter parsers and tests
  - register Python/Rust parsers in parser registry
  - support richer callchain tracing and report merging for dependency findings

- refactor scan command internals
  - extract scan-specific utilities into internal/scan:
    - validation, duration parsing, ecosystem detection
    - callgraph export, report counting, summary rendering
  - centralize flag validation rules (including export-callgraph dependency check)
  - simplify internal/cli/scan.go orchestration paths

- harden IO/error handling and lint compliance across engine/dependency/callgraph
  - handle close/remove/write errors consistently
  - tighten file/dir permissions where needed
  - reduce copy-heavy loops, preallocate slices, and address complexity/style findings
  - ensure make lint runs with local cache paths reliably

- add/adjust container and release support for dependency scanning
  - add Dockerfile.deps and Dockerfile.goreleaser.deps
  - update Dockerfile/Dockerfile.slim and goreleaser config for deps variants
  - update Make targets for deps image workflows

- update docs and release notes
  - README and docs/DOCKER_USAGE.md for latest-deps usage and dependency scanning guidance
  - CHANGELOG entries for 0.3.0 dependency-scanning additions/fixes

chore: update scanoss.json

chore: fix lint errors, increase test coverage

chore(tests): add more coverage

fix(tests): unit tests and lint errors
@matiasdaloia matiasdaloia force-pushed the feature/mdaloia/dependency-scanning branch from 4861cb3 to 104716b on March 3, 2026 10:31
@github-actions

github-actions bot commented Mar 3, 2026

SCANOSS SCAN Completed 🚀

  • Detected components: 4
  • Undeclared components: 0
  • Declared components: 4
  • Detected files: 60
  • Detected files undeclared: 0
  • Detected files declared: 60
  • Licenses detected: 4
  • Licenses detected with copyleft: 2
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary


@github-actions

github-actions bot commented Mar 9, 2026

🔍 SCANOSS Code Similarity Detected

📄 1 snippet matches found

🔗 View detailed findings on commit c526d9a

Files with similarities:

  • internal/callgraph/text_helpers.go

💡 Click the commit link above to see detailed annotations for each match.

@github-actions

github-actions bot commented Mar 9, 2026

SCANOSS SCAN Completed 🚀

  • Detected components: 5
  • Undeclared components: 1
  • Declared components: 4
  • Detected files: 61
  • Detected files undeclared: 1
  • Detected files declared: 60
  • Licenses detected: 5
  • Licenses detected with copyleft: 3
  • Policies: ❌ 1 fail (1 total)

View more details on SCANOSS Action Summary


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 19

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docs/OUTPUT_FORMATS.md (2)

148-152: ⚠️ Potential issue | 🟡 Minor

Invalid JSON: trailing comma after array.

The example has a trailing comma after cryptographic_assets closing bracket (line 150), which is invalid JSON syntax.

Suggested fix
           "algorithmPadding": "PKCS5Padding"
           }
         }
-      ],
+      ]
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/OUTPUT_FORMATS.md` around lines 148 - 152, Remove the trailing comma
after the cryptographic_assets array in the example JSON so it becomes valid
JSON; locate the example block that contains the "cryptographic_assets" array
(the closing bracket ]) and delete the comma immediately following that bracket,
then validate the JSON structure around the outer objects/arrays to ensure no
other trailing commas remain.

193-197: ⚠️ Potential issue | 🟡 Minor

Invalid JSON: trailing comma after array.

Same issue as above — trailing comma after cryptographic_assets closing bracket (line 195).

Suggested fix
           "algorithmMode": "GCM"
           }
         }
-      ],
+      ]
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/OUTPUT_FORMATS.md` around lines 193 - 197, The JSON contains an invalid
trailing comma after the closing bracket of the cryptographic_assets array;
remove the trailing comma following the cryptographic_assets closing bracket so
the array/object is properly terminated and the block becomes valid JSON (ensure
no other arrays/objects in the same structure have trailing commas as well).
♻️ Duplicate comments (1)
Dockerfile.deps (1)

54-77: ⚠️ Potential issue | 🟠 Major

Verify and pin the remote installers before executing them.

This previous review concern is still open: Line 55 and Lines 76-77 both execute remote scripts without integrity verification, and the Rust install still tracks the mutable stable channel. That keeps the deps image non-reproducible and exposed to upstream script changes.

🔒 Hardening sketch
- RUN curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/v1.12.1/install.sh | bash
+ RUN curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/v1.12.1/install.sh -o /tmp/opengrep-install.sh && \
+     echo "<sha256>  /tmp/opengrep-install.sh" | sha256sum -c - && \
+     bash /tmp/opengrep-install.sh && \
+     rm -f /tmp/opengrep-install.sh
@@
- RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | \
-     sh -s -- -y --default-toolchain stable --profile minimal
+ RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs -o /tmp/rustup-init.sh && \
+     echo "<sha256>  /tmp/rustup-init.sh" | sha256sum -c - && \
+     sh /tmp/rustup-init.sh -y --default-toolchain <pinned-version> --profile minimal && \
+     rm -f /tmp/rustup-init.sh
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile.deps` around lines 54 - 77, The Dockerfile runs remote installers
without integrity checks and uses the mutable Rust "stable" channel; download
the OpenGrep install.sh and the rustup installer to files first, verify each
against a pinned checksum or PGP signature before executing, and fail the build
if verification fails; additionally, replace the mutable Rust channel by
specifying a concrete toolchain version for rustup (e.g., a specific release
string) and use the local installer file for installation. Target the RUN curl |
bash installing OpenGrep and the RUN curl ... sh -s -- -y ... invoking rustup to
apply these changes.
🟡 Minor comments (6)
internal/dependency/source_cache.go-108-144 (1)

108-144: ⚠️ Potential issue | 🟡 Minor

Directory entries in zip archives are not handled.

If the zip contains directory entries (files ending with / or with FileInfo.IsDir()), this will attempt to create a file at that path instead of a directory. While os.MkdirAll on line 109 creates parent directories, an explicit directory entry could cause issues.

🛠️ Proposed fix to skip directory entries
 func extractZipFile(f *zip.File, destPath string) error {
+	if f.FileInfo().IsDir() {
+		return os.MkdirAll(destPath, 0o750)
+	}
+
 	if err := os.MkdirAll(filepath.Dir(destPath), 0o750); err != nil {
 		return err
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/source_cache.go` around lines 108 - 144, The
extractZipFile function should skip ZIP directory entries instead of trying to
create files for them; check the zip.File (use f.FileInfo().IsDir() or
strings.HasSuffix(f.Name, "/")) at the start of extractZipFile and return nil
immediately for directory entries, while still ensuring any opened reader (rc)
is closed if opened; keep the existing size checks and copy logic for regular
files (references: extractZipFile, f.Open, f.Name, maxExtractedFileSize) and
ensure rc and out are closed on all early returns.
internal/callgraph/go_parser.go-102-105 (1)

102-105: ⚠️ Potential issue | 🟡 Minor

SubPackagePath doesn't handle empty parentPath.

When parentPath is empty, this returns "/dirName" with a leading slash, which may produce invalid import paths. The Rust parser (lines 40-43) handles this edge case explicitly.

Suggested fix
 func (p *GoParser) SubPackagePath(parentPath, dirName string) string {
+	if parentPath == "" {
+		return dirName
+	}
 	return parentPath + "/" + dirName
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/go_parser.go` around lines 102 - 105, SubPackagePath
currently concatenates parentPath + "/" + dirName which yields a leading slash
when parentPath is empty; update the GoParser.SubPackagePath implementation to
check if parentPath is empty and in that case return dirName (or dirName as-is),
otherwise return parentPath + "/" + dirName — reference the function name
SubPackagePath and parameters parentPath and dirName to locate and modify the
code.
internal/callgraph/rust_parser.go-514-516 (1)

514-516: ⚠️ Potential issue | 🟡 Minor

self detection may produce false positives.

Using strings.Contains(clean, "self") could match parameter names like myself or selfie. Consider a more precise check.

Suggested fix
-		if strings.Contains(clean, "self") {
+		if clean == "self" || clean == "&self" || clean == "&mut self" || strings.HasPrefix(clean, "self:") || strings.HasPrefix(clean, "&self:") || strings.HasPrefix(clean, "&mut self:") {
 			hasSelf = true
 		}

Or use a regex pattern like ^\s*(&\s*)?(mut\s+)?self\b.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/rust_parser.go` around lines 514 - 516, The current check
using strings.Contains(clean, "self") can yield false positives; update the
logic that sets hasSelf (the branch inspecting the variable clean) to detect
self as a standalone parameter token instead of a substring — e.g., use a
regular-expression match that allows optional leading '&' and optional 'mut'
then the word "self" as a whole word (or alternatively split the parameter
tokens and compare them exactly) and replace the strings.Contains call with that
stricter test so names like "myself" or "selfie" no longer trigger hasSelf.
README.md-186-187 (1)

186-187: ⚠️ Potential issue | 🟡 Minor

Document the actual call-graph export formats.

This says --export-callgraph exports JSON only, but the PR objective adds JSON/DOT/Text outputs. Please document the accepted formats, or the option/file-extension behavior that selects them, so users don’t miss supported outputs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 186 - 187, Update the CLI docs for the
--export-callgraph option to list the supported export formats (JSON, DOT, and
plain text) and describe how the format is selected (either via an explicit
--format=<json|dot|text> flag or inferred from the output filename extension
such as .json/.dot/.txt). Also note default behavior (JSON if unspecified) and
any differences in content/structure for each format (e.g., JSON contains
structured nodes/edges, DOT is Graphviz-compatible, text is human-readable
adjacency list). Ensure this description references the --export-callgraph
option name so users can find it quickly in the README.
internal/scan/scan_test.go-204-206 (1)

204-206: ⚠️ Potential issue | 🟡 Minor

Use a temp path instead of /tmp here.

This makes the test Unix-specific even though PrintSummary itself is platform-agnostic. Building the path from t.TempDir() keeps the test portable.

Proposed fix
-	if err := PrintSummary("/tmp/out.json", 1, 2); err != nil {
+	out := filepath.Join(t.TempDir(), "out.json")
+	if err := PrintSummary(out, 1, 2); err != nil {
 		t.Fatalf("PrintSummary(custom path): %v", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/scan/scan_test.go` around lines 204 - 206, The test currently calls
PrintSummary("/tmp/out.json", 1, 2) which hardcodes a Unix-only path; change it
to use t.TempDir() to build a portable temp file path (e.g.,
filepath.Join(t.TempDir(), "out.json")) before calling PrintSummary so the test
works on Windows and other platforms; update the call site in scan_test.go where
PrintSummary is invoked and ensure any imports (testing, path/filepath) are
present.
docs/DEPENDENCY_SCANNING.md-824-826 (1)

824-826: ⚠️ Potential issue | 🟡 Minor

Update JSON schema version example to reflect current v1.3

The JSON output example at line 536 shows "version": "1.2", but the schema documentation at lines 824-826 describes the current Schema v1.3. Update the example to use "version": "1.3" to match the documented current version.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/DEPENDENCY_SCANNING.md` around lines 824 - 826, The JSON example in
docs/DEPENDENCY_SCANNING.md still shows the schema "version" field as "1.2" but
the document title and description reference Schema (v1.3); update the example's
JSON "version" value to "1.3" so the example matches the documented Schema
(v1.3). Locate the example by searching for the JSON key "version": "1.2" near
the "Schema (v1.3)" section and replace it with "version": "1.3".
🧹 Nitpick comments (23)
internal/scanner/opengrep/scanner.go (1)

83-83: Derive the install hint from MinimumVersion.

These messages hardcode v1.12.1 instead of reusing MinimumVersion, so the upgrade guidance can drift the next time the minimum is bumped.

♻️ Proposed fix
 const MinimumVersion = "1.12.1"
+
+func installHint() string {
+	return fmt.Sprintf(
+		"curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/v%s/install.sh | bash",
+		MinimumVersion,
+	)
+}
@@
-		return fmt.Errorf("opengrep not found in PATH: %w (install with: curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/v1.12.1/install.sh | bash)", err)
+		return fmt.Errorf("opengrep not found in PATH: %w (install with: %s)", err, installHint())
@@
-		return fmt.Errorf("opengrep version %s is below minimum required version %s (upgrade with: curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/v1.12.1/install.sh | bash)", s.version, MinimumVersion)
+		return fmt.Errorf("opengrep version %s is below minimum required version %s (upgrade with: %s)", s.version, MinimumVersion, installHint())

Also applies to: 199-199

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/scanner/opengrep/scanner.go` at line 83, The error messages hardcode
"v1.12.1" instead of using the shared MinimumVersion constant; update the
fmt.Errorf calls that build the install hint (both occurrences around the
opengrep not found and the related error at the other location) to derive the
version from MinimumVersion (e.g., format the URL/label using MinimumVersion) so
the hint always reflects the current minimum; search for fmt.Errorf(...) strings
containing "install with: curl -fsSL
https://raw.githubusercontent.com/opengrep/opengrep/" and replace the literal
version with a formatted reference to MinimumVersion.
internal/callgraph/text_helpers.go (1)

5-97: Split the comma-splitting state machine into smaller helpers.

Line 5 has already crossed the current gocognit limit, and future parser tweaks will be hard to validate in one large function. Pulling string-state handling and delimiter-depth updates into small helpers would make this much easier to reason about and test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/text_helpers.go` around lines 5 - 97, The
splitTopLevelCommaList function is too large—extract the string-state handling
and delimiter-depth updates into small helper functions to reduce cognitive
complexity: create helpers like handleStringState(r rune, inSingle, inDouble,
inBacktick, escapeNext bool) (inSingle, inDouble, inBacktick, escapeNext bool,
consumed bool) to encapsulate the logic that toggles quote/backtick/escape state
and whether the rune should be written/consumed; and updateDepthForRune(r rune,
parenDepth, bracketDepth, braceDepth, angleDepth int) (parenDepth, bracketDepth,
braceDepth, angleDepth int, isSeparator bool) to encapsulate bracket depth
changes and detect top-level commas; keep flush() as-is but call these helpers
from splitTopLevelCommaList, passing and returning the state variables so the
main loop becomes a thin coordinator calling handleStringState and
updateDepthForRune and writing the rune or flushing when isSeparator is true.
.gitignore (1)

6-6: Anchor the new ignore rule to the repository root.

Line 6 will ignore any directory named bin anywhere in the tree, not just the top-level tool cache. If this PR only needs ./bin ignored, use /bin/ so nested fixtures or sample projects stay visible to Git.

Suggested diff
-bin/
+/bin/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitignore at line 6, The .gitignore entry currently uses the pattern "bin/"
which matches any directory named bin anywhere in the tree; change that entry to
anchor it to the repository root by replacing "bin/" with "/bin/" so only the
top-level ./bin directory is ignored and nested bin directories remain tracked.
Ensure the pattern string "bin/" is updated to "/bin/" in the .gitignore file.
internal/callgraph/types.go (1)

97-104: Honor sep or remove it from the ParseFunctionID contract.

The doc and signature advertise separator-aware parsing, but Line 104 ignores the second argument entirely. That makes the API misleading for future parser call sites.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/types.go` around lines 97 - 104, The ParseFunctionID
function signature and docs advertise a separator parameter but the
implementation ignores its second argument; either make ParseFunctionID actually
use the sep parameter when finding the last separator (replace the hardcoded "."
split logic with a search for the provided sep and update the parsing of
package/type vs name accordingly) or remove the sep parameter from the signature
and documentation altogether (and update all callers/tests to the new
signature). Ensure the docstring and the function body agree with the chosen
contract and reference the ParseFunctionID symbol in your edits.
Dockerfile.test (1)

20-21: Consider using OpenGrep's official installer with signature verification.

Line 21 pipes a remote script into bash. While the version tag is pinned (making it immutable on GitHub), OpenGrep's official installer supports signature verification via Cosign, which is preferable for production-grade reproducibility. For a test Dockerfile, the current approach is acceptable, but you can improve it by following the official pattern:

RUN curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/main/install.sh | bash -s -- -v v1.12.1 --verify-signatures

This uses the official install.sh from main with explicit version pinning and Cosign signature verification against GitHub's OIDC issuer. Note that SHA256 checksums are not published for this release; OpenGrep relies on Cosign for integrity verification instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile.test` around lines 20 - 21, Replace the direct pipe of a
tag-pinned install script with the official installer invocation that enables
Cosign signature verification: update the RUN that calls install.sh to use the
main branch installer and pass the explicit version and verify flag (e.g. bash
-s -- -v v1.12.1 --verify-signatures) so the install step runs install.sh with
signature verification enabled rather than blindly executing the tag URL; locate
the current RUN curl ... | bash line and modify it to call the main install.sh
with the -v and --verify-signatures options.
internal/dependency/go_resolver.go (1)

125-139: Graph may contain duplicate edges.

If go mod graph outputs duplicate lines (same parent→child pair), the graph will contain duplicates in the adjacency list. This is likely benign but worth noting.

♻️ Optional: Deduplicate graph edges
 	graph := make(map[string][]string)
+	seen := make(map[string]map[string]bool)
 	for _, line := range strings.Split(stdout.String(), "\n") {
 		line = strings.TrimSpace(line)
 		if line == "" {
 			continue
 		}
 		parts := strings.Fields(line)
 		if len(parts) != 2 {
 			continue
 		}
 		// Strip version from module paths for cleaner lookup
 		parent := stripVersion(parts[0])
 		child := stripVersion(parts[1])
+		if seen[parent] == nil {
+			seen[parent] = make(map[string]bool)
+		}
+		if seen[parent][child] {
+			continue
+		}
+		seen[parent][child] = true
 		graph[parent] = append(graph[parent], child)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/go_resolver.go` around lines 125 - 139, The adjacency
lists built into the graph map in go_resolver.go can accumulate duplicate edges
when parsing the go mod graph output; update the loop that processes lines
(using the existing graph variable and stripVersion(parts[0]/parts[1]) usage) to
deduplicate before appending—e.g., track seen children per parent (a
map[string]map[string]bool or check the slice for existence) and only append
child to graph[parent] when not already present so each parent→child appears
once.
internal/dependency/pip_resolver.go (1)

229-229: Consider using append idiom to avoid potential issues with slice modification.

Using append([]string{"show"}, batch...) creates a new slice, which is fine, but using the more explicit form is clearer.

♻️ Optional: Use explicit slice construction
-		args := append([]string{"show"}, batch...)
+		args := make([]string, 0, len(batch)+1)
+		args = append(args, "show")
+		args = append(args, batch...)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/pip_resolver.go` at line 229, Replace the inline append
construction for args with an explicit pre-sized slice to avoid slice aliasing
surprises: instead of args := append([]string{"show"}, batch...), allocate args
with make using length 1+len(batch), set args[0] = "show", then copy(args[1:],
batch) (reference the args variable and the batch slice where args is
currently built).
internal/dependency/pip_resolver_test.go (1)

208-212: Consider checking os.MkdirAll and os.WriteFile errors in test setup.

While test failures will likely surface issues, explicitly checking errors in test setup provides clearer diagnostics.

♻️ Optional: Add error checks for clearer test failures
 	// Create directory packages
-	os.MkdirAll(filepath.Join(sitePackages, "bs4"), 0o755)
-	os.MkdirAll(filepath.Join(sitePackages, "PIL"), 0o755)
+	if err := os.MkdirAll(filepath.Join(sitePackages, "bs4"), 0o755); err != nil {
+		t.Fatalf("setup: %v", err)
+	}
+	if err := os.MkdirAll(filepath.Join(sitePackages, "PIL"), 0o755); err != nil {
+		t.Fatalf("setup: %v", err)
+	}

 	// Create a single-file module
-	os.WriteFile(filepath.Join(sitePackages, "six.py"), []byte("# six"), 0o644)
+	if err := os.WriteFile(filepath.Join(sitePackages, "six.py"), []byte("# six"), 0o644); err != nil {
+		t.Fatalf("setup: %v", err)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/pip_resolver_test.go` around lines 208 - 212, Wrap the
calls to os.MkdirAll and os.WriteFile in error checks and fail the test on error
so setup failures produce clear diagnostics: check the returned error from each
os.MkdirAll(filepath.Join(sitePackages, "bs4"), ...),
os.MkdirAll(filepath.Join(sitePackages, "PIL"), ...), and
os.WriteFile(filepath.Join(sitePackages, "six.py"), ...) and call t.Fatalf (or
t.Fatal) with a helpful message including the error and the path when non-nil;
this keeps the setup in pip_resolver_test.go (references: sitePackages,
os.MkdirAll, os.WriteFile) deterministic and provides immediate test failure on
setup errors.
internal/dependency/cargo_resolver.go (1)

83-86: First workspace crate becomes RootModule - verify this is intentional.

In workspaces with multiple crates, the iteration order over meta.Packages may not be deterministic (depends on cargo metadata output order). Consider documenting this behavior or using a more explicit heuristic.

📝 Consider documenting or improving root module selection

If deterministic root selection is important, consider using the workspace root or a named manifest:

// For workspaces, could prefer the crate matching workspace_root
if result.RootModule == "" && filepath.Dir(pkg.ManifestPath) == meta.WorkspaceRoot {
    result.RootModule = pkg.Name
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/cargo_resolver.go` around lines 83 - 86, The current
logic sets result.RootModule to the first package name encountered
(result.RootModule, pkg.Name) which can be non-deterministic because
meta.Packages order varies; update the selection to prefer a package whose
manifest path is in the workspace root (use meta.WorkspaceRoot and
pkg.ManifestPath via filepath.Dir(pkg.ManifestPath)) before falling back to the
first package, or explicitly document that the first package wins; modify the
loop over meta.Packages to first check and assign result.RootModule when
filepath.Dir(pkg.ManifestPath) == meta.WorkspaceRoot, otherwise retain the
existing fallback assignment.
internal/callgraph/go_parser.go (1)

85-90: Silent error swallowing during file parsing.

Parse errors are silently ignored with continue, which could hide legitimate issues (e.g., syntax errors in source files, permission problems). Consider logging these errors at debug level for troubleshooting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/go_parser.go` around lines 85 - 90, The code silently
ignores errors returned by p.ParseFile(fullPath, packagePath) which swallows
valuable failure details; update the error handling in the block around
p.ParseFile to log the error (including fullPath and packagePath) at debug or
trace level before continuing so parse failures are visible for troubleshooting
while preserving the current behavior; reference the ParseFile call and the
fullPath/packagePath variables when adding the debug log entry.
internal/callgraph/java_parser.go (2)

614-624: Wildcard import resolution returns on first match.

The loop returns immediately on the first wildcard import prefix without iterating through all possibilities. While the comment acknowledges this is "best-effort," if multiple wildcard imports exist (e.g., import java.security.* and import javax.crypto.*), this will always pick the first one arbitrarily, potentially misattributing the call.

If accurate resolution isn't feasible, consider documenting this limitation more explicitly or returning a list of candidates for downstream disambiguation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/java_parser.go` around lines 614 - 624, The current loop
over analysis.WildcardImports returns a FunctionID on the first prefix, which
arbitrarily picks one match; update the resolution in the function that builds
FunctionID (referencing analysis.WildcardImports, FunctionID, simpleClass, and
method) to either (a) collect and return all candidate FunctionID entries for
each wildcard prefix instead of returning immediately, or (b) if the function
signature must remain single-result, pick a deterministic strategy (e.g., prefer
exact-package matches or sort prefixes) and add a clear comment/LOG documenting
the limitation; implement one of these changes and update callers/consumers to
handle a list of candidates or the deterministic selection accordingly.

493-493: Extract "argument_list" to a constant.

The static analysis tool flagged this string as having 3 occurrences. Consider adding it to the constants block for consistency with other node type strings.

Suggested fix

Add to the constants block at the top:

const (
	javaNodeIdentifier           = "identifier"
	javaNodeScopedIdentifier     = "scoped_identifier"
	javaNodeGenericType          = "generic_type"
	javaNodeScopedTypeIdentifier = "scoped_type_identifier"
+	javaNodeArgumentList         = "argument_list"
)

Then update the usages accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/java_parser.go` at line 493, Add a new constant
javaNodeArgumentList = "argument_list" to the Java node constants block and
replace all literal uses of "argument_list" (there are three occurrences
flagged) with that constant; update comparisons such as child.Type() ==
"argument_list" (and any switch/case or string matches) to use
javaNodeArgumentList so the code uses the shared constant consistently (e.g., in
functions/methods in internal/callgraph/java_parser.go where node type checks
occur).
internal/callgraph/rust_parser_test.go (1)

129-133: Unchecked errors from os.WriteFile.

The return values are ignored. While failures are unlikely in a temp directory, consider checking errors or using a test helper for consistency.

Suggested fix
-	os.WriteFile(filepath.Join(dir, "lib.rs"), []byte("fn foo() {}"), 0o644)
-	// Test file — should be skipped
-	os.WriteFile(filepath.Join(dir, "lib_test.rs"), []byte("fn test_foo() {}"), 0o644)
-	// Another test file
-	os.WriteFile(filepath.Join(dir, "tests.rs"), []byte("fn test_bar() {}"), 0o644)
+	if err := os.WriteFile(filepath.Join(dir, "lib.rs"), []byte("fn foo() {}"), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	// Test file — should be skipped
+	if err := os.WriteFile(filepath.Join(dir, "lib_test.rs"), []byte("fn test_foo() {}"), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	// Another test file
+	if err := os.WriteFile(filepath.Join(dir, "tests.rs"), []byte("fn test_bar() {}"), 0o644); err != nil {
+		t.Fatal(err)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/rust_parser_test.go` around lines 129 - 133, The
os.WriteFile calls that create "lib.rs", "lib_test.rs", and "tests.rs" ignore
returned errors; update the test to check and fail on these errors (e.g.,
capture the error from os.WriteFile and call t.Fatalf or t.Fatal with the error)
or introduce a small test helper (e.g., mustWriteFile or writeFileOrFail) that
wraps filepath.Join and os.WriteFile and fails the test on error; ensure you
update the three places where os.WriteFile is invoked so any write failure is
reported and the test stops.
internal/callgraph/rust_parser.go (1)

130-130: Cross-language constant reuse is confusing.

Constants like javaNodeScopedIdentifier and goNodeIdentifier are reused for Rust parsing. While this works because tree-sitter shares node type names across grammars, it's confusing for maintainability. Consider defining language-agnostic constants (e.g., nodeIdentifier, nodeScopedIdentifier) in a shared location.
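A minimal shape for that shared block (the constant names and exact node-type strings below are suggestions — verify each string against the tree-sitter grammars actually vendored before adopting):

```go
package main

import "fmt"

// Tree-sitter happens to use the same node-type strings across the Go,
// Java, Python and Rust grammars for these constructs, so the parsers
// can share one language-agnostic constant block instead of borrowing
// javaNode*/goNode* names from each other.
const (
	nodeIdentifier       = "identifier"
	nodeScopedIdentifier = "scoped_identifier"
	nodeArgumentList     = "argument_list"
)

func main() {
	fmt.Println(nodeIdentifier, nodeScopedIdentifier, nodeArgumentList)
}
```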

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/rust_parser.go` at line 130, The rust parser is using
language-specific constants like javaNodeScopedIdentifier (and elsewhere
goNodeIdentifier) which is confusing; create shared, language-agnostic constants
such as nodeIdentifier and nodeScopedIdentifier in a common package (e.g., the
callgraph package or a new shared constants file) and replace usages in
rust_parser.go (e.g., the switch case currently using javaNodeScopedIdentifier)
with nodeScopedIdentifier and any go-specific usages with nodeIdentifier; update
all other parsers to import and use these new shared symbols so node type names
remain consistent and readable across languages.
internal/callgraph/tracer.go (2)

36-43: Add nil check for t.graph at the start of TraceBack.

While NewTracer constructs the Tracer with a graph, a defensive nil check at the start of TraceBack would prevent potential panics if the tracer is misused.

Proposed fix
 func (t *Tracer) TraceBack(target FunctionID, userPackages map[string]bool, maxDepth int) []CallChain {
+	if t.graph == nil {
+		return nil
+	}
+
 	targetKey := target.String()
 
 	// Check if the target exists in the graph
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/tracer.go` around lines 36 - 43, Add a defensive nil check
at the start of Tracer.TraceBack to avoid panics when the Tracer was constructed
or mutated incorrectly: verify that t != nil and t.graph != nil before accessing
t.graph.Functions, log a debug/error (using log.Debug().Str("target",
target.String())...) and return nil if the graph is nil; update the TraceBack
function (method TraceBack on type Tracer) to perform this check immediately
prior to the existing target existence check.

133-137: Consider using strings.HasPrefix for clarity.

The manual prefix check works correctly but strings.HasPrefix would be more idiomatic and readable.

Proposed fix
 	// Check if pkg is a sub-package of any user package
 	for userPkg := range userPackages {
 		prefix := userPkg + sep
-		if len(pkg) >= len(prefix) && pkg[:len(prefix)] == prefix {
+		if strings.HasPrefix(pkg, prefix) {
 			return true
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/tracer.go` around lines 133 - 137, Replace the manual
slice prefix check in the loop over userPackages with strings.HasPrefix for
clarity and idiomatic code: in the loop that uses variables userPkg, prefix :=
userPkg + sep, and pkg (inside the function containing this loop), call
strings.HasPrefix(pkg, prefix) instead of comparing lengths and slicing, and add
the necessary import for the strings package.
internal/scan/validation.go (2)

31-33: Consider handling all os.Stat errors, not just IsNotExist.

The current check only handles IsNotExist, but other errors (e.g., permission denied) could indicate an unusable target path. Consider returning errors for any os.Stat failure.

Proposed fix
-	if _, err := os.Stat(target); os.IsNotExist(err) {
-		return nil, fmt.Errorf("target path does not exist: %s", target)
+	if _, err := os.Stat(target); err != nil {
+		return nil, fmt.Errorf("target path not accessible: %w", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/scan/validation.go` around lines 31 - 33, The os.Stat call currently
only treats os.IsNotExist(err) specially; update the check to handle all errors
from os.Stat for the target variable (in this validation logic in
internal/scan/validation.go) by returning an error whenever os.Stat returns a
non-nil err (include the original err in the wrapped fmt.Errorf message so
permission and other errors surface), rather than only checking os.IsNotExist.

67-71: Empty language strings are preserved after normalization.

If opts.Languages contains empty strings, they will be normalized to empty strings and included in the result. Consider filtering them out.

Proposed fix
 	// Normalize language hints to lowercase.
-	normalizedLanguages := make([]string, len(opts.Languages))
-	for i, lang := range opts.Languages {
-		normalizedLanguages[i] = strings.ToLower(strings.TrimSpace(lang))
+	normalizedLanguages := make([]string, 0, len(opts.Languages))
+	for _, lang := range opts.Languages {
+		if normalized := strings.ToLower(strings.TrimSpace(lang)); normalized != "" {
+			normalizedLanguages = append(normalizedLanguages, normalized)
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/scan/validation.go` around lines 67 - 71, The normalization loop on
opts.Languages (creating normalizedLanguages) currently preserves empty strings;
change the logic in the normalization block (where normalizedLanguages is built)
to skip/ignore empty results after strings.TrimSpace and strings.ToLower by
using a growing slice (append) or by filtering before assigning, so only
non-empty normalized language strings are included in normalizedLanguages;
reference the normalizedLanguages variable and the iteration over opts.Languages
to locate and update the code.
internal/callgraph/python_parser_test.go (1)

166-169: Handle errors from os.WriteFile in test setup.

The errors from os.WriteFile are silently ignored. While unlikely to fail in a temp directory, handling them improves test reliability.

Proposed fix
 	// Regular file
-	os.WriteFile(filepath.Join(dir, "crypto.py"), []byte("def foo(): pass"), 0o644)
+	if err := os.WriteFile(filepath.Join(dir, "crypto.py"), []byte("def foo(): pass"), 0o644); err != nil {
+		t.Fatal(err)
+	}
 	// Test files — should be skipped
-	os.WriteFile(filepath.Join(dir, "test_crypto.py"), []byte("def test_foo(): pass"), 0o644)
-	os.WriteFile(filepath.Join(dir, "crypto_test.py"), []byte("def test_bar(): pass"), 0o644)
+	if err := os.WriteFile(filepath.Join(dir, "test_crypto.py"), []byte("def test_foo(): pass"), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	if err := os.WriteFile(filepath.Join(dir, "crypto_test.py"), []byte("def test_bar(): pass"), 0o644); err != nil {
+		t.Fatal(err)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/python_parser_test.go` around lines 166 - 169, The three
os.WriteFile calls in the test setup are ignoring returned errors; update the
test (in internal/callgraph/python_parser_test.go) to check and handle each
error from os.WriteFile (for the files "crypto.py", "test_crypto.py", and
"crypto_test.py") by capturing the error value and failing the test on error
(e.g., using t.Fatalf or t.Fatal with a clear message and the error) so test
setup failures are reported rather than silently ignored.
internal/callgraph/python_parser.go (2)

154-156: Clarify constant naming: goNodeIdentifier used in Python parser.

The constant goNodeIdentifier (likely defined elsewhere as "identifier") is reused here. While it works since both Go and Python tree-sitter grammars use the same node type name, consider defining a local constant or renaming for clarity.

Proposed fix
 const (
 	pythonNodeDottedName         = "dotted_name"
 	pythonNodeFunctionDefinition = "function_definition"
 	pythonSelfObjectName         = "self"
+	pythonNodeIdentifier         = "identifier"
 )

Then replace goNodeIdentifier with pythonNodeIdentifier throughout the file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/python_parser.go` around lines 154 - 156, Replace the
ambiguous use of goNodeIdentifier inside the Python parser by adding a local
constant pythonNodeIdentifier = "identifier" near the other node constants in
this file, then update all references of goNodeIdentifier within the Python
parser code (e.g., in the node-handling switch where alias is set) to use
pythonNodeIdentifier; keep the original goNodeIdentifier unchanged elsewhere so
Go parser code is unaffected.

79-84: Consider logging parse errors for debugging.

Parse errors are silently ignored, which could make debugging difficult when valid Python files fail to parse. A debug-level log would help troubleshoot parsing issues.

Proposed fix
+	"github.com/rs/zerolog/log"
 ...
 		analysis, err := p.parseFile(fullPath, packagePath)
 		if err != nil {
+			log.Debug().Err(err).Str("file", fullPath).Msg("Failed to parse Python file")
 			continue
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/python_parser.go` around lines 79 - 84, The loop that
calls p.parseFile(fullPath, packagePath) currently swallows errors; update it to
log parse errors at debug level before continuing. Specifically, inside the
block that checks "if err != nil { continue }", call the parser's logger (e.g.,
p.logger.Debugf or p.logger.Errorf if only error level exists, falling back to
log.Printf) to record the error along with fullPath and packagePath and the
error value returned by p.parseFile; keep the continue and still avoid appending
when parsing fails so analyses (the slice being appended to) is unchanged.
docs/DEPENDENCY_SCANNING.md (1)

170-178: Add language specifiers to fenced code blocks.

Static analysis flagged multiple code blocks without language specifiers. For plain text or pseudo-code, use text or plaintext for better markdown linting compliance.

This applies to code blocks at lines 170, 286, 347, 373, 384, 393, 410, 440, 477, 502, 519, 599, 680, 689, 696, 708, 798, 879, 894, and 931. Example fix:

-```
+```text
 Functions:  "crypto/aes.NewCipher"         → FunctionDecl{...}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/DEPENDENCY_SCANNING.md` around lines 170 - 178, Several fenced code
blocks containing output like `Functions:  "crypto/aes.NewCipher" →
FunctionDecl{...}` and `Callers:    "crypto/aes.NewCipher" →
["example.com/app.Encrypt"]` lack language specifiers; update each fenced block
that shows these snippets (e.g., the blocks containing "crypto/aes.NewCipher",
"example.com/app.Encrypt", "example.com/app.main") to include a language such as
`text` or `plaintext` (e.g., change ``` to ```text) so markdown linting
recognizes them as plain text.
internal/dependency/maven_resolver.go (1)

152-166: Consider upfront validation of mvn availability using exec.LookPath.

Error handling for mvn not being found already exists at lines 164-165, but checking availability early would fail faster and give clearer error messaging. This pattern is already used elsewhere in the codebase (see internal/scanner/opengrep/scanner.go), making it a reasonable optional improvement for consistency.
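A sketch of the upfront check (the helper name is illustrative; the real resolver would call it before building the exec.CommandContext for "mvn dependency:list"):

```go
package main

import (
	"fmt"
	"os/exec"
)

// requireBinary fails fast when a required external tool is missing,
// mirroring the availability-check pattern used in
// internal/scanner/opengrep/scanner.go, so the user sees a clear
// message instead of a raw exec error after the command is built.
func requireBinary(name string) error {
	if _, err := exec.LookPath(name); err != nil {
		return fmt.Errorf("%s not found in PATH: %w", name, err)
	}
	return nil
}

func main() {
	if err := requireBinary("mvn"); err != nil {
		fmt.Println(err) // early, user-friendly failure
		return
	}
	fmt.Println("mvn available")
}
```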

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dependency/maven_resolver.go` around lines 152 - 166, Add an upfront
check for the mvn binary using exec.LookPath("mvn") before creating the command
in the function that runs exec.CommandContext (the block that builds the mvn
dependency:list call using tmpPath and cmd.Dir = targetDir); if LookPath returns
an error, return a clear wrapped error indicating "mvn not found" (or similar)
so the function fails fast with a user-friendly message instead of only
surfacing the exec.Run error and stderr. Ensure the new check mirrors the
pattern used in internal/scanner/opengrep/scanner.go and references the same
"mvn" binary string used in exec.CommandContext.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 946703b1-8000-4fc2-be06-eb060d4d014a

📥 Commits

Reviewing files that changed from the base of the PR and between 038f391 and c526d9a.

⛔ Files ignored due to path filters (5)
  • coverage_targeted.out is excluded by !**/*.out
  • go.sum is excluded by !**/*.sum
  • testdata/projects/cryptowrapper_dep/go.sum is excluded by !**/*.sum
  • testdata/projects/go_with_crypto_dep/go.sum is excluded by !**/*.sum
  • testdata/projects/go_with_dep_chain/go.sum is excluded by !**/*.sum
📒 Files selected for processing (94)
  • .github/workflows/lint.yml
  • .gitignore
  • .golangci-lint-version
  • .goreleaser.yml
  • CHANGELOG.md
  • CONTRIBUTING.md
  • Dockerfile
  • Dockerfile.deps
  • Dockerfile.goreleaser
  • Dockerfile.goreleaser.deps
  • Dockerfile.slim
  • Dockerfile.test
  • Makefile
  • README.md
  • docs/DEPENDENCY_SCANNING.md
  • docs/DOCKER_USAGE.md
  • docs/OUTPUT_FORMATS.md
  • go.mod
  • internal/api/client.go
  • internal/callgraph/builder.go
  • internal/callgraph/core_test.go
  • internal/callgraph/format.go
  • internal/callgraph/format_test.go
  • internal/callgraph/go_parser.go
  • internal/callgraph/go_parser_test.go
  • internal/callgraph/java_parser.go
  • internal/callgraph/java_parser_test.go
  • internal/callgraph/parser_registry.go
  • internal/callgraph/python_parser.go
  • internal/callgraph/python_parser_test.go
  • internal/callgraph/rust_parser.go
  • internal/callgraph/rust_parser_test.go
  • internal/callgraph/text_helpers.go
  • internal/callgraph/tracer.go
  • internal/callgraph/types.go
  • internal/cli/root.go
  • internal/cli/scan.go
  • internal/cli/scan_test.go
  • internal/deduplicator/deduplicator.go
  • internal/dependency/cargo_resolver.go
  • internal/dependency/cargo_resolver_integration_test.go
  • internal/dependency/cargo_resolver_test.go
  • internal/dependency/go_resolver.go
  • internal/dependency/go_resolver_test.go
  • internal/dependency/maven_resolver.go
  • internal/dependency/maven_resolver_test.go
  • internal/dependency/pip_resolver.go
  • internal/dependency/pip_resolver_test.go
  • internal/dependency/registry.go
  • internal/dependency/registry_test.go
  • internal/dependency/resolver.go
  • internal/dependency/source_cache.go
  • internal/dependency/source_cache_test.go
  • internal/engine/dependency_scanner.go
  • internal/engine/dependency_scanner_test.go
  • internal/engine/findings_cache.go
  • internal/engine/findings_cache_test.go
  • internal/engine/orchestrator.go
  • internal/engine/rule_filter.go
  • internal/engine/rule_filter_test.go
  • internal/entities/interim.go
  • internal/entities/interim_additional_test.go
  • internal/errors/formatter.go
  • internal/scan/doc.go
  • internal/scan/duration.go
  • internal/scan/ecosystem.go
  • internal/scan/export.go
  • internal/scan/report.go
  • internal/scan/scan_test.go
  • internal/scan/summary.go
  • internal/scan/validation.go
  • internal/scanner/interface.go
  • internal/scanner/interface_spinner_test.go
  • internal/scanner/opengrep/scanner.go
  • internal/scanner/opengrep/scanner_integration_test.go
  • internal/scanner/registry.go
  • internal/scanner/registry_test.go
  • internal/utils/fd.go
  • internal/utils/fd_test.go
  • internal/version/version.go
  • scanoss.json
  • schemas/interim-report-schema.json
  • testdata/code/python/crypto_usage.py
  • testdata/projects/cryptowrapper_dep/go.mod
  • testdata/projects/cryptowrapper_dep/wrapper.go
  • testdata/projects/go_with_crypto_dep/go.mod
  • testdata/projects/go_with_crypto_dep/main.go
  • testdata/projects/go_with_crypto_dep/mypkg/crypto.go
  • testdata/projects/go_with_dep_chain/go.mod
  • testdata/projects/go_with_dep_chain/main.go
  • testdata/projects/go_with_dep_chain/mypkg/crypto.go
  • testdata/projects/java_with_crypto_dep/pom.xml
  • testdata/projects/java_with_crypto_dep/src/main/java/com/testapp/Main.java
  • testdata/projects/java_with_crypto_dep/src/main/java/com/testapp/service/CryptoService.java
💤 Files with no reviewable changes (1)
  • testdata/code/python/crypto_usage.py
✅ Files skipped from review due to trivial changes (2)
  • internal/scanner/registry_test.go
  • CONTRIBUTING.md
🚧 Files skipped from review as they are similar to previous changes (29)
  • internal/scan/summary.go
  • internal/scan/ecosystem.go
  • internal/callgraph/go_parser_test.go
  • internal/callgraph/java_parser_test.go
  • internal/errors/formatter.go
  • internal/scan/export.go
  • internal/scanner/interface_spinner_test.go
  • testdata/projects/java_with_crypto_dep/pom.xml
  • internal/engine/findings_cache_test.go
  • internal/dependency/registry.go
  • internal/dependency/cargo_resolver_test.go
  • internal/engine/rule_filter_test.go
  • internal/entities/interim_additional_test.go
  • Dockerfile.goreleaser.deps
  • internal/dependency/resolver.go
  • scanoss.json
  • testdata/projects/go_with_dep_chain/mypkg/crypto.go
  • internal/utils/fd_test.go
  • internal/callgraph/builder.go
  • testdata/projects/go_with_crypto_dep/mypkg/crypto.go
  • go.mod
  • internal/cli/scan_test.go
  • internal/deduplicator/deduplicator.go
  • internal/api/client.go
  • testdata/projects/cryptowrapper_dep/wrapper.go
  • testdata/projects/java_with_crypto_dep/src/main/java/com/testapp/service/CryptoService.java
  • docs/DOCKER_USAGE.md
  • Dockerfile
  • testdata/projects/go_with_dep_chain/main.go

Comment on lines +28 to +29
version-file: .golangci-lint-version
install-mode: binary

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

golangci/golangci-lint-action v8 supported inputs documentation

💡 Result:

golangci/golangci-lint-action@v8 — supported inputs (for the with: block)

From the action metadata (action.yml), v8 supports these inputs: [1]

  • version — golangci-lint version to use (behavior depends on install-mode).
  • install-mode (default: binary) — install method: binary, goinstall, or none.
  • working-directory — run golangci-lint from this subdirectory (monorepos, etc.).
  • github-token (default: ${{ github.token }}) — used to fetch patches/diffs for only-new-issues.
  • verify (default: true) — validate the golangci-lint config against the JSON schema.
  • only-new-issues (default: false) — report only issues introduced by the change (PR/push context).
  • skip-cache (default: false) — disable caching entirely.
  • skip-save-cache (default: false) — don’t save caches (may still restore them).
  • problem-matchers (default: false) — force use of embedded problem matchers.
  • args (default: "") — extra CLI args passed to golangci-lint.
  • cache-invalidation-interval (default: 7) — periodically invalidate caches (days).

Notes:

  • v8 runs on the Node 20 runtime (runs.using: node20). [1]
  • The repo README’s “Options” section mirrors these inputs (helpful for examples/usage). [2]

Sources: action metadata for v8 [1], repo README/options docs [2].

References: [1] https://raw.githubusercontent.com/golangci/golangci-lint-action/v8/action.yml, [2] https://github.com/golangci/golangci-lint-action


🏁 Script executed:

cd .github/workflows && cat -n lint.yml | head -50

Repository: scanoss/crypto-finder

Length of output: 883


version-file is not a supported input for this action.

golangci/golangci-lint-action@v8 only supports the version input (a version string), not version-file. Read .golangci-lint-version and pass its contents to version instead.

🛠️ Suggested wiring
       - name: Setup Version
         run: make version
+
+      - name: Read golangci-lint version
+        id: golangci_lint_version
+        run: echo "version=$(cat .golangci-lint-version)" >> "$GITHUB_OUTPUT"
 
       - name: golangci-lint
         uses: golangci/golangci-lint-action@v8
         with:
-          version-file: .golangci-lint-version
+          version: ${{ steps.golangci_lint_version.outputs.version }}
           install-mode: binary
           args: --timeout=5m
🧰 Tools
🪛 actionlint (1.7.11)

[error] 28-28: input "version-file" is not defined in action "golangci/golangci-lint-action@v8". available inputs are "args", "cache-invalidation-interval", "github-token", "install-mode", "only-new-issues", "problem-matchers", "skip-cache", "skip-save-cache", "verify", "version", "working-directory"

(action)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/lint.yml around lines 28 - 29, The workflow currently uses
the unsupported input version-file for golangci/golangci-lint-action@v8; change
the action inputs to use the supported version input by reading the contents of
.golangci-lint-version into the version field (e.g., add a preceding step that
reads the file into an output or env, then set version: ${{
steps.<name>.outputs.version }}), remove version-file, and keep install-mode:
binary as-is so the action receives a proper version string via the version
input.


# Install OpenGrep (minimum version 1.12.1)
RUN curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/main/install.sh | bash
RUN curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/v1.12.1/install.sh | bash

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh api repos/opengrep/opengrep/releases/tags/v1.12.1 | jq -r '.assets[].name'

Repository: scanoss/crypto-finder

Length of output: 153


🌐 Web query:

OpenGrep v1.12.1 release checksums verification

💡 Result:

To verify OpenGrep v1.12.1 release checksums, you’re essentially verifying that the file you downloaded matches the SHA-256 hash published alongside the release asset on GitHub.

  1. Get the expected SHA-256 from the official release
  • Open the v1.12.1 release page (released Dec 2, 2025) and expand Assets.
  • For each asset, GitHub shows a sha256:... value—copy the one for the exact filename you downloaded. [1]
  2. Compute the SHA-256 of your downloaded file and compare

macOS / Linux

shasum -a 256 /path/to/opengrep-<asset>.tar.gz
# or
sha256sum /path/to/opengrep-<asset>.tar.gz

Windows (PowerShell)

Get-FileHash -Algorithm SHA256 C:\path\to\opengrep-<asset>.zip
  3. Verify
  • The computed SHA-256 must exactly match the sha256: shown for that asset on the GitHub release page. [1]

If you want, paste (a) the asset filename you downloaded and (b) the SHA-256 GitHub shows for it, and I’ll tell you exactly what command to run and what the output should look like.

Sources: [1]


🏁 Script executed:

#!/bin/bash
# Try to fetch the release information directly from GitHub without authentication
curl -s https://api.github.com/repos/opengrep/opengrep/releases/tags/v1.12.1 | jq -r '.assets[].name'

Repository: scanoss/crypto-finder

Length of output: 768


Use verified release artifacts instead of piping the OpenGrep installer script to bash.

The v1.12.1 release publishes cryptographically signed binaries (.sig and .cert files for manylinux, musllinux, macOS, and Windows platforms), but the Dockerfile installs from the raw install.sh script, which is not a verified release artifact and remains a supply-chain risk. Replace the installer script with a direct download of a signed release binary for your target platform, then verify the signature before execution.
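One possible shape for that flow, assuming the release assets keep the asset/.sig/.cert naming seen on the release page and are cosign keyless signatures — both assumptions to verify against the actual v1.12.1 release before adopting. The asset name and certificate-identity regexp below are placeholders:

```dockerfile
# OPENGREP_ASSET is a placeholder; substitute the real asset name for
# your target platform from the v1.12.1 release page.
ARG OPENGREP_ASSET=opengrep_manylinux_x86
RUN curl -fsSLO "https://github.com/opengrep/opengrep/releases/download/v1.12.1/${OPENGREP_ASSET}" \
 && curl -fsSLO "https://github.com/opengrep/opengrep/releases/download/v1.12.1/${OPENGREP_ASSET}.sig" \
 && curl -fsSLO "https://github.com/opengrep/opengrep/releases/download/v1.12.1/${OPENGREP_ASSET}.cert" \
 # Verification failure fails the build; identity/issuer must match the
 # project's actual signing workflow.
 && cosign verify-blob \
      --certificate "${OPENGREP_ASSET}.cert" \
      --signature "${OPENGREP_ASSET}.sig" \
      --certificate-identity-regexp 'github\.com/opengrep/opengrep' \
      --certificate-oidc-issuer https://token.actions.githubusercontent.com \
      "${OPENGREP_ASSET}" \
 && install -m 0755 "${OPENGREP_ASSET}" /usr/local/bin/opengrep
```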

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile.goreleaser` at line 25, Replace the unsafe piping of the OpenGrep
install.sh (the RUN curl -fsSL .../install.sh | bash invocation for v1.12.1)
with a secure flow that downloads the platform-specific signed release artifact
(the tarball/zip binary for v1.12.1) and its corresponding .sig/.cert, verify
the signature with the project's signing key (using gpg/openssl) against the
included certificate before extracting/placing the binary, and only then make
the binary executable and add it to PATH; ensure you reference the v1.12.1
release assets and include commands to fetch and import the maintainer public
key, verify the signature, and fail the build if verification does not succeed.

Comment on lines +79 to +88
case '<':
angleDepth++
case '>':
if angleDepth > 0 {
angleDepth--
}
case ',':
if parenDepth == 0 && bracketDepth == 0 && braceDepth == 0 && angleDepth == 0 {
flush()
continue

⚠️ Potential issue | 🟠 Major

Don't treat every < as generic nesting.

Line 79 increments angleDepth for any <, so splitTopLevelCommaList("a < b, c") will never split at the comma and returns one item instead of two. The same problem shows up with shifts like a << b, c. This will misparse real argument expressions unless angle brackets are handled contextually or delegated to language-specific parsers.
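One possible contextual heuristic, sketched as a standalone function (the whitespace-based rule below is an illustration, not the only viable fix — delegating to language-specific parsers remains an option): treat '<' as opening a generic group only when it is glued to an identifier character (as in Map<K, V>) and is not the start of a << shift, so a spaced comparison like a < b never counts as nesting.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitTopLevelCommaList splits s at commas not nested inside (), [],
// {}, or heuristically detected generic <...> argument lists.
func splitTopLevelCommaList(s string) []string {
	var items []string
	var buf strings.Builder
	depth := 0      // (), [], {} nesting
	angleDepth := 0 // heuristic <...> nesting

	runes := []rune(s)
	flush := func() {
		if t := strings.TrimSpace(buf.String()); t != "" {
			items = append(items, t)
		}
		buf.Reset()
	}
	isIdent := func(r rune) bool {
		return unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_' || r == '.'
	}

	for i, r := range runes {
		switch r {
		case '(', '[', '{':
			depth++
		case ')', ']', '}':
			if depth > 0 {
				depth--
			}
		case '<':
			// Only a generic-open when glued to an identifier (Foo<...>)
			// and not part of a "<<" shift; "a < b" has a space before
			// '<' and is skipped.
			prevIdent := i > 0 && isIdent(runes[i-1])
			nextShift := i+1 < len(runes) && runes[i+1] == '<'
			if prevIdent && !nextShift {
				angleDepth++
			}
		case '>':
			if angleDepth > 0 {
				angleDepth--
			}
		case ',':
			if depth == 0 && angleDepth == 0 {
				flush()
				continue
			}
		}
		buf.WriteRune(r)
	}
	flush()
	return items
}

func main() {
	fmt.Printf("%q\n", splitTopLevelCommaList("a < b, c"))           // ["a < b" "c"]
	fmt.Printf("%q\n", splitTopLevelCommaList("Map<String, Int>, x")) // ["Map<String, Int>" "x"]
}
```

Ambiguous unspaced cases (e.g. a<b, c>d) remain unsplittable without language knowledge, which is why per-language parsers are still the more robust option.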

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/text_helpers.go` around lines 79 - 88, The parser in
splitTopLevelCommaList (text_helpers.go) treats every '<' and '>' as
template-angle nesting, causing expressions like "a < b, c" or shifts "a << b,
c" to mis-split; change the handling so angleDepth is only
incremented/decremented when the token is clearly a template delimiter: on
encountering '<' check a contextual heuristic (previous non-space char is an
identifier char/digit/underscore or one of ')>]'}' indicating an expression/type
before a template) and that the next char is not another '<' (shift); similarly
for '>' ensure you aren't in a '>>' shift or comparing operator by peeking
surrounding chars before decrementing angleDepth. Update the '<' and '>' cases
to use these contextual checks instead of blindly changing angleDepth.

Comment on lines +99 to +102
func trimOuterDelimiters(s string, open, close rune) string {
s = strings.TrimSpace(s)
if len(s) >= 2 && rune(s[0]) == open && rune(s[len(s)-1]) == close {
return strings.TrimSpace(s[1 : len(s)-1])

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and read the file
find . -name "text_helpers.go" -path "*/callgraph/*"

Repository: scanoss/crypto-finder

Length of output: 101


🏁 Script executed:

# Read the specific file around the mentioned lines
cat -n internal/callgraph/text_helpers.go | sed -n '95,110p'

Repository: scanoss/crypto-finder

Length of output: 567


🏁 Script executed:

# Check if there's a revive configuration in the repo
find . -name ".revive.toml" -o -name "revive.yml" -o -name ".revive.yml" | head -5

Repository: scanoss/crypto-finder

Length of output: 47


🏁 Script executed:

# Search for any CI configuration mentioning revive
find . -name "*.yml" -o -name "*.yaml" -o -name ".revive*" | grep -E "(github|gitlab|ci|revive)" | head -10

Repository: scanoss/crypto-finder

Length of output: 239


🏁 Script executed:

# Let's also check if there are other functions in the file that might have similar issues
rg "func.*\(.*close" internal/callgraph/text_helpers.go -A 3

Repository: scanoss/crypto-finder

Length of output: 266


🏁 Script executed:

# Check the golangci.yml for revive configuration
cat -n .golangci.yml

Repository: scanoss/crypto-finder

Length of output: 6723


🏁 Script executed:

# Search for all usages of the parameter 'close' in the function to ensure the fix is complete
sed -n '99,105p' internal/callgraph/text_helpers.go | grep -E "close"

Repository: scanoss/crypto-finder

Length of output: 196


🏁 Script executed:

# Let's also check the function calls to see how the parameter is passed
rg "trimOuterDelimiters" internal/callgraph/text_helpers.go -B 2 -A 2

Repository: scanoss/crypto-finder

Length of output: 376


🏁 Script executed:

# Verify if the Go built-in 'close' is used anywhere in this file that would justify the shadowing concern
rg "\bclose\(" internal/callgraph/text_helpers.go

Repository: scanoss/crypto-finder

Length of output: 47


Rename close parameter to unblock revive lint.

The parameter close at line 99 shadows Go's built-in function. The redefines-builtin-id rule in revive is enabled and will flag this. Rename the parameter to closing to resolve the lint failure.

Suggested diff
-func trimOuterDelimiters(s string, open, close rune) string {
+func trimOuterDelimiters(s string, open, closing rune) string {
 	s = strings.TrimSpace(s)
-	if len(s) >= 2 && rune(s[0]) == open && rune(s[len(s)-1]) == close {
+	if len(s) >= 2 && rune(s[0]) == open && rune(s[len(s)-1]) == closing {
 		return strings.TrimSpace(s[1 : len(s)-1])
 	}
 	return s
}
🧰 Tools
🪛 GitHub Check: build

[failure] 99-99:
redefines-builtin-id: redefinition of the built-in function close (revive)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/text_helpers.go` around lines 99 - 102, The parameter name
`close` in function trimOuterDelimiters shadows the built-in and triggers
revive's redefines-builtin-id rule; rename the parameter to `closing` in the
function signature and update its use inside the body (e.g., replace comparisons
that use `close` with `closing`) so the function trimOuterDelimiters(open,
closing rune) no longer conflicts with the builtin.

Comment on lines +140 to +146
scanCmd.Flags().BoolVar(&scanDependencies, "scan-dependencies", false, "Enable recursive dependency scanning for cryptographic usage")
scanCmd.Flags().IntVar(&scanDepMaxDepth, "dep-max-depth", 3, "Maximum depth for recursive dependency resolution")
scanCmd.Flags().StringVar(&scanDepEcosystem, "dep-ecosystem", "auto", "Dependency ecosystem: auto, go, java, python, rust")
scanCmd.Flags().BoolVar(&scanDepUnreachable, "dep-include-unreachable", false, "Include crypto findings in dependencies not reachable from user code call graph")
scanCmd.Flags().IntVar(&scanDepWorkers, "dep-workers", 0, "Number of parallel dependency scan workers (default: half of CPU cores, max 8)")
scanCmd.Flags().StringVar(&scanExportCallgraph, "export-callgraph", "", "Export the crypto-scoped call graph to a file (requires --scan-dependencies)")
scanCmd.Flags().StringVar(&scanExportCgFormat, "export-callgraph-format", "json", "Call graph export format: json, dot, text")

⚠️ Potential issue | 🟠 Major

Validate the new dependency flags up front.

The validation call on Lines 154-167 still ignores --dep-ecosystem and --export-callgraph-format. A typo like --dep-ecosystem pyhton only logs a warning later and silently skips dependency scanning, while an invalid export format is rejected only after the scan has already run.

Also applies to: 154-167
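A sketch of what those upfront checks could look like (the allowed value sets come from the flag help text above; the function name and wiring into the existing validation block are illustrative):

```go
package main

import "fmt"

// validateDepFlags rejects unknown --dep-ecosystem and
// --export-callgraph-format values before any scan work starts,
// instead of warning mid-run or erroring after the scan completes.
func validateDepFlags(ecosystem, cgFormat string) error {
	ecosystems := map[string]bool{"auto": true, "go": true, "java": true, "python": true, "rust": true}
	if !ecosystems[ecosystem] {
		return fmt.Errorf("invalid --dep-ecosystem %q (expected auto, go, java, python or rust)", ecosystem)
	}
	formats := map[string]bool{"json": true, "dot": true, "text": true}
	if !formats[cgFormat] {
		return fmt.Errorf("invalid --export-callgraph-format %q (expected json, dot or text)", cgFormat)
	}
	return nil
}

func main() {
	// The typo from the comment above is now caught before scanning.
	fmt.Println(validateDepFlags("pyhton", "json"))
}
```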

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/cli/scan.go` around lines 140 - 146, Add upfront validation for the
dependency ecosystem and callgraph export format in the same pre-run validation
block that already checks other flags: validate the variable scanDepEcosystem
against allowed values {"auto","go","java","python","rust"} and validate
scanExportCgFormat against {"json","dot","text"}; on invalid values emit a clear
error and abort (same behavior as other flag validation) so typos like
"--dep-ecosystem pyhton" or invalid "--export-callgraph-format" are caught
before scanning begins. Ensure you reference and check the existing flag
variables scanDepEcosystem and scanExportCgFormat in that validation code path.
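The upfront check the reviewer asks for could be sketched as below. This is a minimal illustration, not the PR's actual code: the function name `validateDepFlags` and the error wording are assumptions, and in the real change the checks would live in the existing cobra pre-run validation block.

```go
package main

import (
	"fmt"
	"slices"
)

// validateDepFlags is a hypothetical sketch of validating the two new
// flags before any scan work starts. Allowed values mirror the flag help
// text above; everything else about this function is an assumption.
func validateDepFlags(ecosystem, cgFormat string) error {
	ecosystems := []string{"auto", "go", "java", "python", "rust"}
	if !slices.Contains(ecosystems, ecosystem) {
		return fmt.Errorf("invalid --dep-ecosystem %q (allowed: %v)", ecosystem, ecosystems)
	}
	formats := []string{"json", "dot", "text"}
	if !slices.Contains(formats, cgFormat) {
		return fmt.Errorf("invalid --export-callgraph-format %q (allowed: %v)", cgFormat, formats)
	}
	return nil
}

func main() {
	// A typo such as "pyhton" now fails before any scanning starts.
	if err := validateDepFlags("pyhton", "json"); err != nil {
		fmt.Println(err)
	}
}
```

Wiring this into the command's pre-run hook (rather than deep inside the scan) gives the same fail-fast behavior as the other flag checks.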

Comment on lines 29 to +30
 // InterimFormatVersion is the current version of the interim report schema.
-const InterimFormatVersion = "1.1"
+const InterimFormatVersion = "1.3"

⚠️ Potential issue | 🟠 Major

This schema version bump does not match the stated output contract.

The acceptance criteria say dependency-scan outputs should emit interim version 1.2, but this changes every report to 1.3. That will break consumers still validating against 1.2 unless the schema/docs/release plan are all being updated in lockstep.

Proposed fix
-const InterimFormatVersion = "1.3"
+const InterimFormatVersion = "1.2"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 // InterimFormatVersion is the current version of the interim report schema.
-const InterimFormatVersion = "1.3"
+const InterimFormatVersion = "1.2"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/entities/interim.go` around lines 29 - 30, The constant
InterimFormatVersion currently set to "1.3" breaks the stated output contract;
change InterimFormatVersion back to "1.2" so dependency-scan outputs remain
compatible with consumers expecting the v1.2 interim schema, and add a brief
comment next to InterimFormatVersion noting that any intentional version bump
must be coordinated with schema/docs/release plan and consumers.

Comment on lines +38 to +40
lint-install: ## Install pinned golangci-lint version
	@mkdir -p ./bin
	@curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b ./bin $(GOLANGCI_LINT_VERSION)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the Makefile
find . -name "Makefile" -type f | head -20

Repository: scanoss/crypto-finder

Length of output: 75


🏁 Script executed:

# Read the Makefile around lines 38-40 to see the context
sed -n '30,50p' Makefile

Repository: scanoss/crypto-finder

Length of output: 881


🏁 Script executed:

# Also look for GOLANGCI_LINT_VERSION definition
grep -n "GOLANGCI_LINT_VERSION" Makefile

Repository: scanoss/crypto-finder

Length of output: 354


🏁 Script executed:

# Check if there are any other installations or references to golangci-lint
grep -n "golangci-lint" Makefile

Repository: scanoss/crypto-finder

Length of output: 455


🏁 Script executed:

# Check the .golangci-lint-version file content
cat .golangci-lint-version

Repository: scanoss/crypto-finder

Length of output: 72


🏁 Script executed:

# Check if there's any documentation or comments about this approach
grep -B5 -A5 "lint-install" Makefile

Repository: scanoss/crypto-finder

Length of output: 736


🏁 Script executed:

# Look for any other references to how golangci-lint should be installed
grep -r "golangci-lint" . --include="*.md" --include="*.txt" | head -20

Repository: scanoss/crypto-finder

Length of output: 372


Avoid piping golangci-lint's mutable installer into sh.

Line 40 fetches install.sh from the master branch, which is mutable and makes make lint-install non-reproducible. Although the version (v2.10.1) is pinned in .golangci-lint-version, the installer script itself can change without any repository diff, causing different behavior across runs. Pin the installer script to a specific commit/tag, download from a release artifact, or host it in the repository instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 38 - 40, The lint-install target in the Makefile
currently pipes the mutable installer from the master branch into sh (the curl |
sh line in the lint-install recipe); change it to a reproducible fetch by
pinning the installer source—either download the installer script at a specific
commit/tag (use the raw URL with a commit SHA), or download the golangci-lint
release binary/archive from the GitHub release matching GOLANGCI_LINT_VERSION,
or vendor the installer into the repo—and then run the local copy (avoid piping
from remote into sh). Update the lint-install recipe to fetch the pinned asset
and verify its checksum before executing.

Comment on lines +141 to +147
docker-scan-deps: ## Scan with dependency resolution (PROJ=path/to/project)
	@echo "Scanning with dependency resolution..."
	@docker run --rm \
		-v $(PROJ):/workspace/code:ro \
		--entrypoint sh $(DOCKER_IMAGE):latest-deps -c \
		"if [ -f /workspace/code/requirements.txt ]; then pip install -r /workspace/code/requirements.txt 2>/dev/null; fi; \
		 crypto-finder scan /workspace/code --scan-dependencies $(ARGS)"

⚠️ Potential issue | 🟠 Major

Don't hide dependency-install failures before the scan.

Line 146 suppresses pip install errors and Line 147 still runs the scan unconditionally. That can let docker-scan-deps silently perform only a partial dependency scan while appearing to succeed, which is especially risky for the new reachability/dependency attribution flow.

🛠️ Fail fast instead
-		"if [ -f /workspace/code/requirements.txt ]; then pip install -r /workspace/code/requirements.txt 2>/dev/null; fi; \
-		 crypto-finder scan /workspace/code --scan-dependencies $(ARGS)"
+		"set -e; \
+		 if [ -f /workspace/code/requirements.txt ]; then \
+		   pip install -r /workspace/code/requirements.txt; \
+		 fi; \
+		 crypto-finder scan /workspace/code --scan-dependencies $(ARGS)"
📝 Committable suggestion


Suggested change
 docker-scan-deps: ## Scan with dependency resolution (PROJ=path/to/project)
 	@echo "Scanning with dependency resolution..."
 	@docker run --rm \
 		-v $(PROJ):/workspace/code:ro \
 		--entrypoint sh $(DOCKER_IMAGE):latest-deps -c \
-		"if [ -f /workspace/code/requirements.txt ]; then pip install -r /workspace/code/requirements.txt 2>/dev/null; fi; \
-		 crypto-finder scan /workspace/code --scan-dependencies $(ARGS)"
+		"set -e; \
+		 if [ -f /workspace/code/requirements.txt ]; then \
+		   pip install -r /workspace/code/requirements.txt; \
+		 fi; \
+		 crypto-finder scan /workspace/code --scan-dependencies $(ARGS)"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 141 - 147, The docker-scan-deps Makefile target
currently hides pip install failures by redirecting stderr (the "pip install -r
/workspace/code/requirements.txt 2>/dev/null" in the docker run command) and
then always running the scan; change this so dependency-install failures fail
the step and prevent the scan: remove the stderr suppression and make the scan
conditional on pip succeeding (e.g. run pip install without "2>/dev/null" and
chain the scan command with "&&" or use a failing shell option like "set -e"
inside the container), updating the docker-scan-deps target to run the scan only
when pip install succeeds.

Comment on lines +11 to +12
"description": "Version of the interim report schema (e.g., \"1.3\")",
"examples": ["1.3", "1.2", "1.1", "1.0"]

⚠️ Potential issue | 🟠 Major

The published schema no longer cleanly represents one versioned contract.

Line 11 documents 1.3 as the current schema, Line 12 still lists 1.0/1.1/1.2, and Lines 75-95 only allow the new start_line/end_line asset shape. Older report shapes that still emit line_number will fail validation here, and the PR acceptance criteria still describe 1.2 output. Either publish this as a 1.3-only schema or add versioned alternatives for the legacy asset shape.

Also applies to: 73-96

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@schemas/interim-report-schema.json` around lines 11 - 12, The schema
currently claims version "1.3" but only allows the new asset shape
(start_line/end_line) and still lists older versions in "examples", causing
older reports that emit "line_number" to fail validation; update
interim-report-schema.json either by making it strictly 1.3-only (remove older
examples and adjust description to "1.3" only and keep the new
start_line/end_line asset definition) or add explicit versioned alternatives:
keep "examples" for older versions and add a conditional/oneOf branch in the
assets/report entry (referencing the asset shape that uses "line_number" vs the
new shape using "start_line" and "end_line") so validators accept legacy reports
when "version" is "1.0"/"1.1"/"1.2" and enforce the new shape when "version" is
"1.3".

Comment on lines +124 to +131
"source": {
"type": "string",
"description": "How this finding was discovered (v1.2+)",
"enum": ["direct", "dependency"]
},
"dependency_info": {
"$ref": "#/$defs/DependencyInfo",
"description": "Attribution data when the finding originates from a dependency (v1.2+)"

⚠️ Potential issue | 🟠 Major

Require dependency_info whenever source is dependency.

Lines 124-131 introduce dependency attribution, but the schema still accepts a finding with "source": "dependency" and no module/version payload. That weakens the new contract and lets incomplete dependency findings validate successfully.

💡 JSON Schema guard for dependency findings
       "oneOf": [
         {"required": ["rules"]},
         {"required": ["rule"]}
-      ]
+      ],
+      "allOf": [
+        {
+          "if": {
+            "properties": {
+              "source": { "const": "dependency" }
+            },
+            "required": ["source"]
+          },
+          "then": {
+            "required": ["dependency_info"]
+          }
+        }
+      ]

Also applies to: 149-167

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@schemas/interim-report-schema.json` around lines 124 - 131, The schema
currently allows "source": "dependency" without requiring the dependency
payload; add a JSON Schema conditional so that when the property "source" has
the value "dependency" the "dependency_info" property (the $ref to
`#/`$defs/DependencyInfo) is required — implement this with an "if":
{"properties": {"source": {"const": "dependency"}}}, "then": {"required":
["dependency_info"]} (or equivalent "dependentRequired"/oneOf pattern if
preferred), and apply the same conditional to the other occurrence of these
fields so dependency findings cannot validate without dependency_info.

@github-actions

🔍 SCANOSS Code Similarity Detected

📄 1 snippet matches found

🔗 View detailed findings on commit 5025367

Files with similarities:

  • internal/callgraph/text_helpers.go

💡 Click the commit link above to see detailed annotations for each match.

@github-actions

SCANOSS SCAN Completed 🚀

  • Detected components: 5
  • Undeclared components: 1
  • Declared components: 4
  • Detected files: 61
  • Detected files undeclared: 1
  • Detected files declared: 60
  • Licenses detected: 5
  • Licenses detected with copyleft: 3
  • Policies: ❌ 1 fail (1 total)

View more details on SCANOSS Action Summary

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/callgraph/builder.go`:
- Around line 173-190: The code in builder.go that collects targets from
methodsByName using methodLookupName(calleeDecl.ID.Name) currently only filters
by OwnerType != "interface", parameter arity, and namespaceRoot, which lets
sibling concrete methods be aliased to the interface method; update the loop
over targets to also verify that the concrete candidate's declaring type
actually implements the interface represented by calleeDecl (i.e., add a check
that candidate.ID.Owner/receiver type implements the interface
calleeDecl.OwnerType or calleeDecl.ID) and skip the candidate if not; implement
this via an existing type registry/type-check helper or add a small helper like
typeImplements(interfaceType, candidateOwnerType) and call it before appending
to results so only concrete methods whose types implement the interface are
considered.
- Around line 224-238: The current arity check only applies when
len(call.Arguments) > 0 which allows zero-argument fluent calls (e.g.,
builder.compact()) to match every overload; update the filtering in the loop
over targets (use methodsByName/methodLookupName -> targets -> for _, candidate
:= range targets) to always compare parameter counts (replace the conditional so
it always does if len(candidate.Parameters) != len(call.Arguments) { continue })
so only candidates with matching arity survive; consider keeping this check even
for zero-argument calls (and handle variadic/optional params later if needed).

In `@internal/callgraph/java_parser_test.go`:
- Around line 159-163: The test fixture in java_parser_test.go uses an Imports
map entry like "java.util.Collections": "java.util" which cannot be produced by
extractImports (it stores simple type names as keys, e.g. "Collections"). Update
the test data to match extractImports output: replace fully-qualified keys (e.g.
"java.util.Collections") with their simple names ("Collections") and ensure
other entries (e.g. "java.util.List" or "Cipher") follow the same simple-key
convention; verify the test assertions still reference the simple keys and
adjust lines around the Imports map and any checks that expect fully-qualified
keys accordingly so the test exercises the real parser path used by
extractImports.

In `@internal/callgraph/java_parser.go`:
- Around line 761-771: The current loop over analysis.WildcardImports returns a
FunctionID for the first prefix unconditionally, causing order-dependent
resolution; change the behavior so that you only construct and return the
wildcard-based FunctionID when there is exactly one wildcard import (check
len(analysis.WildcardImports) == 1) and otherwise do not return from the loop
(let unresolved class remain unresolved). Update the logic around the loop that
currently returns a FunctionID with Package: prefix, Type: simpleClass, Name:
method so that it only does so for the single-wildcard case, leaving resolution
to other checks when there are multiple wildcard imports.
- Around line 146-164: The fast-path that returns when object == "" (around the
call resolution code handling empty-object calls) must detect statically
imported members by using methodLookupName(method) to strip the arity suffix and
then consult analysis.Imports and analysis.WildcardImports before bailing out;
specifically, replace the early return with logic that: 1) computes baseName :=
methodLookupName(method), 2) if analysis.Imports[baseName] exists use that
import (set object to the imported package + "." + baseName or otherwise
attribute the call to that type), 3) else check analysis.WildcardImports for a
matching prefix and attribute accordingly, and only return as local if no import
match is found; this uses the existing import parsing that populates
analysis.Imports and analysis.WildcardImports and preserves methodLookupName
visibility within the callgraph package.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7e10fa58-1354-4290-89dd-a70882ea347b

📥 Commits

Reviewing files that changed from the base of the PR and between c526d9a and 5025367.

📒 Files selected for processing (4)
  • internal/callgraph/builder.go
  • internal/callgraph/core_test.go
  • internal/callgraph/java_parser.go
  • internal/callgraph/java_parser_test.go

Comment on lines +173 to +190
	targets := methodsByName[methodLookupName(calleeDecl.ID.Name)]
	if len(targets) == 0 {
		return nil
	}

	baseRoot := namespaceRoot(calleeDecl.ID.Package)
	results := make([]string, 0, len(targets))
	for _, candidate := range targets {
		if candidate.OwnerType == "interface" {
			continue
		}
		if len(candidate.Parameters) != len(calleeDecl.Parameters) {
			continue
		}
		if namespaceRoot(candidate.ID.Package) != baseRoot {
			continue
		}
		results = append(results, candidate.ID.String())

⚠️ Potential issue | 🟠 Major

Interface expansion currently aliases unrelated methods.

This path only checks base name, arity, and namespace root. Any sibling method with the same signature shape inside that namespace becomes reachable even if its type never implements the interface, which will over-report call_chains and dependency attribution.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/builder.go` around lines 173 - 190, The code in builder.go
that collects targets from methodsByName using
methodLookupName(calleeDecl.ID.Name) currently only filters by OwnerType !=
"interface", parameter arity, and namespaceRoot, which lets sibling concrete
methods be aliased to the interface method; update the loop over targets to also
verify that the concrete candidate's declaring type actually implements the
interface represented by calleeDecl (i.e., add a check that
candidate.ID.Owner/receiver type implements the interface calleeDecl.OwnerType
or calleeDecl.ID) and skip the candidate if not; implement this via an existing
type registry/type-check helper or add a small helper like
typeImplements(interfaceType, candidateOwnerType) and call it before appending
to results so only concrete methods whose types implement the interface are
considered.
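The implements-check the reviewer suggests could look like the following. This is a hypothetical sketch: `implementsIndex`, `methodDecl`, and `typeImplements` stand in for whatever type registry the real callgraph package maintains, and none of these names come from the PR.

```go
package main

import "fmt"

// methodDecl is a stand-in for the package's method declaration type;
// only the declaring type matters for this sketch.
type methodDecl struct {
	OwnerType string // concrete type that declares the method
}

// implementsIndex maps a concrete type name to the interfaces it
// implements. In the real code this would be populated by the parser.
var implementsIndex = map[string][]string{
	"AesCipher": {"Cipher"},
	"Logger":    {}, // same method shape, but never implements Cipher
}

// typeImplements reports whether candidateOwner implements ifaceType.
func typeImplements(ifaceType, candidateOwner string) bool {
	for _, iface := range implementsIndex[candidateOwner] {
		if iface == ifaceType {
			return true
		}
	}
	return false
}

// filterCandidates keeps only concrete methods whose declaring type
// actually implements the interface, dropping sibling look-alikes.
func filterCandidates(ifaceType string, candidates []methodDecl) []methodDecl {
	kept := make([]methodDecl, 0, len(candidates))
	for _, c := range candidates {
		if !typeImplements(ifaceType, c.OwnerType) {
			continue
		}
		kept = append(kept, c)
	}
	return kept
}

func main() {
	out := filterCandidates("Cipher", []methodDecl{{"AesCipher"}, {"Logger"}})
	fmt.Println(len(out)) // prints 1
}
```

Calling such a helper before appending to results would keep the existing name/arity/namespace filters but stop unrelated sibling methods from becoming reachable.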

Comment on lines +224 to +238
	targets := methodsByName[methodLookupName(call.Callee.Name)]
	if len(targets) == 0 {
		return nil
	}

	rootNS := namespaceRoot(rootPkg)
	type scored struct {
		key   string
		score int
	}
	scoredTargets := make([]scored, 0, len(targets))
	for _, candidate := range targets {
		if len(call.Arguments) > 0 && len(candidate.Parameters) != len(call.Arguments) {
			continue
		}

⚠️ Potential issue | 🟠 Major

Zero-argument fluent calls match every overload.

Line 236 only filters by arity when len(call.Arguments) > 0. For builder.compact()-style calls, every compact#N candidate in the namespace survives scoring, so unrelated overloads can be marked reachable.

🔧 Tighten overload filtering
-		if len(call.Arguments) > 0 && len(candidate.Parameters) != len(call.Arguments) {
+		if len(candidate.Parameters) != len(call.Arguments) {
 			continue
 		}
📝 Committable suggestion


Suggested change
 targets := methodsByName[methodLookupName(call.Callee.Name)]
 if len(targets) == 0 {
 	return nil
 }

 rootNS := namespaceRoot(rootPkg)
 type scored struct {
 	key   string
 	score int
 }
 scoredTargets := make([]scored, 0, len(targets))
 for _, candidate := range targets {
-	if len(call.Arguments) > 0 && len(candidate.Parameters) != len(call.Arguments) {
+	if len(candidate.Parameters) != len(call.Arguments) {
 		continue
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/builder.go` around lines 224 - 238, The current arity
check only applies when len(call.Arguments) > 0 which allows zero-argument
fluent calls (e.g., builder.compact()) to match every overload; update the
filtering in the loop over targets (use methodsByName/methodLookupName ->
targets -> for _, candidate := range targets) to always compare parameter counts
(replace the conditional so it always does if len(candidate.Parameters) !=
len(call.Arguments) { continue }) so only candidates with matching arity
survive; consider keeping this check even for zero-argument calls (and handle
variadic/optional params later if needed).

Comment on lines +159 to +163
	Imports: map[string]string{
		"Cipher":                "javax.crypto",
		"java.util.Collections": "java.util",
		"CryptoService":         "com.dep",
	},

⚠️ Potential issue | 🟡 Minor

This import fixture cannot be produced by extractImports.

extractImports stores import java.util.Collections; as Imports["Collections"] = "java.util", not Imports["java.util.Collections"] = "java.util". As written, this test exercises a state the parser never creates, so it can stay green while the real resolution path regresses.

🧪 Proposed fixture correction
 	analysis := &FileAnalysis{
 		PackagePath: "com.example",
 		Imports: map[string]string{
 			"Cipher":                "javax.crypto",
-			"java.util.Collections": "java.util",
+			"Collections":           "java.util",
 			"CryptoService":         "com.dep",
 		},
 		WildcardImports: []string{"java.security"},
 	}
@@
-	callee = p.resolveCallee("java.util.Collections", "emptyList", analysis, nil)
-	if callee.Package != "java.util" || callee.Type != "java.util.Collections" {
+	callee = p.resolveCallee("Collections", "emptyList", analysis, nil)
+	if callee.Package != "java.util" || callee.Type != "Collections" {
 		t.Fatalf("unexpected full-object import callee: %#v", callee)
 	}

Also applies to: 177-180

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/java_parser_test.go` around lines 159 - 163, The test
fixture in java_parser_test.go uses an Imports map entry like
"java.util.Collections": "java.util" which cannot be produced by extractImports
(it stores simple type names as keys, e.g. "Collections"). Update the test data
to match extractImports output: replace fully-qualified keys (e.g.
"java.util.Collections") with their simple names ("Collections") and ensure
other entries (e.g. "java.util.List" or "Cipher") follow the same simple-key
convention; verify the test assertions still reference the simple keys and
adjust lines around the Imports map and any checks that expect fully-qualified
keys accordingly so the test exercises the real parser path used by
extractImports.

Comment on lines +146 to +164
			importText := child.Content(src)
			// Remove "import " prefix and ";" suffix
			importText = strings.TrimPrefix(importText, "import ")
			importText = strings.TrimPrefix(importText, "static ")
			importText = strings.TrimSuffix(importText, ";")
			importText = strings.TrimSpace(importText)

			if strings.HasSuffix(importText, ".*") {
				// Wildcard import: import java.security.*
				prefix := strings.TrimSuffix(importText, ".*")
				analysis.WildcardImports = append(analysis.WildcardImports, prefix)
			} else {
				// Explicit import: import javax.crypto.Cipher → imports["Cipher"] = "javax.crypto"
				lastDot := strings.LastIndex(importText, ".")
				if lastDot > 0 {
					className := importText[lastDot+1:]
					pkg := importText[:lastDot]
					analysis.Imports[className] = pkg
				}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's understand the structure of the file and examine the relevant sections
wc -l internal/callgraph/java_parser.go

Repository: scanoss/crypto-finder

Length of output: 102


🏁 Script executed:

# Read lines 146-164 to see the import parsing logic
sed -n '146,164p' internal/callgraph/java_parser.go

Repository: scanoss/crypto-finder

Length of output: 846


🏁 Script executed:

# Read lines 711-779 to see the resolveCallee function
sed -n '711,779p' internal/callgraph/java_parser.go

Repository: scanoss/crypto-finder

Length of output: 1931


🏁 Script executed:

# Find and examine the methodLookupName function
rg 'func.*methodLookupName' internal/callgraph/java_parser.go -A 5

Repository: scanoss/crypto-finder

Length of output: 47


🏁 Script executed:

# Check how analysis.Imports is used throughout the file
rg 'analysis\.Imports\[' internal/callgraph/java_parser.go

Repository: scanoss/crypto-finder

Length of output: 248


🏁 Script executed:

# Search for methodLookupName across the entire codebase
rg 'methodLookupName' internal/callgraph/

Repository: scanoss/crypto-finder

Length of output: 503


🏁 Script executed:

# Check if there are any other references to this function in Go files
find . -name "*.go" -type f | xargs grep -l 'methodLookupName' 2>/dev/null || echo "Not found"

Repository: scanoss/crypto-finder

Length of output: 96


🏁 Script executed:

# Let's also see the full context around resolveCallee to understand the function signature
sed -n '700,780p' internal/callgraph/java_parser.go

Repository: scanoss/crypto-finder

Length of output: 2167


🏁 Script executed:

# Get the implementation of methodLookupName from builder.go
rg 'func methodLookupName' internal/callgraph/builder.go -A 10

Repository: scanoss/crypto-finder

Length of output: 292


🏁 Script executed:

# Let's also check how methodLookupName is used to understand its purpose
rg 'methodLookupName' internal/callgraph/builder.go -B 2 -A 2

Repository: scanoss/crypto-finder

Length of output: 721


🏁 Script executed:

# Now let's check the exact import parsing to understand what key is stored for static imports
sed -n '140,170p' internal/callgraph/java_parser.go

Repository: scanoss/crypto-finder

Length of output: 1167


🏁 Script executed:

# Find all calls to resolveCallee
rg 'resolveCallee\(' internal/callgraph/java_parser.go -B 3 -A 1

Repository: scanoss/crypto-finder

Length of output: 618


🏁 Script executed:

# Find the javaMethodWithArity function definition
rg 'func javaMethodWithArity' internal/callgraph/java_parser.go -A 5

Repository: scanoss/crypto-finder

Length of output: 233


🏁 Script executed:

# Let's also check if there's a constructorMethodName definition
rg 'constructorMethodName|javaMethodWithArity' internal/callgraph/java_parser.go

Repository: scanoss/crypto-finder

Length of output: 463


🏁 Script executed:

# Check if methodLookupName is exported (should be capital M to be exported)
grep -n 'func methodLookupName\|func MethodLookupName' internal/callgraph/builder.go

Repository: scanoss/crypto-finder

Length of output: 112


🏁 Script executed:

# Let's check the imports in java_parser.go to see what packages are available
head -30 internal/callgraph/java_parser.go

Repository: scanoss/crypto-finder

Length of output: 802


Statically imported methods still resolve as local calls.

Lines 146-164 record import static ...member, but lines 711-718 return immediately when object == "" without checking analysis.Imports. Calls like emptyList() get indexed under the current package instead of their imported owner type, causing statically imported APIs to lose proper dependency attribution.

The proposed fix is valid: use methodLookupName(method) to strip the arity suffix (e.g., "emptyList#0" → "emptyList") before checking analysis.Imports, since both java_parser.go and builder.go are in the same callgraph package and methodLookupName is accessible as a private function.

🔧 Minimal fix around the fast path
 func (p *JavaParser) resolveCallee(object, method string, analysis *FileAnalysis, varTypes map[string]string) FunctionID {
 	if object == "" {
+		// Static member imports: import static foo.bar.Baz.qux;
+		if owner, ok := analysis.Imports[methodLookupName(method)]; ok {
+			if lastDot := strings.LastIndex(owner, "."); lastDot > 0 {
+				return FunctionID{
+					Package: owner[:lastDot],
+					Type:    owner[lastDot+1:],
+					Name:    method,
+				}
+			}
+		}
 		// Simple local call like `doSomething()`
 		return FunctionID{
 			Package: analysis.PackagePath,
 			Name:    method,
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/java_parser.go` around lines 146 - 164, The fast-path that
returns when object == "" (around the call resolution code handling empty-object
calls) must detect statically imported members by using methodLookupName(method)
to strip the arity suffix and then consult analysis.Imports and
analysis.WildcardImports before bailing out; specifically, replace the early
return with logic that: 1) computes baseName := methodLookupName(method), 2) if
analysis.Imports[baseName] exists use that import (set object to the imported
package + "." + baseName or otherwise attribute the call to that type), 3) else
check analysis.WildcardImports for a matching prefix and attribute accordingly,
and only return as local if no import match is found; this uses the existing
import parsing that populates analysis.Imports and analysis.WildcardImports and
preserves methodLookupName visibility within the callgraph package.

Comment on lines +761 to +771
	// 3. Check wildcard imports — try each prefix
	for _, prefix := range analysis.WildcardImports {
		// Best-effort: assume the class comes from this wildcard import
		// For crypto detection, this works well since crypto classes are typically
		// imported via wildcards (import java.security.*)
		return FunctionID{
			Package: prefix,
			Type:    simpleClass,
			Name:    method,
		}
	}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

The following scripts were run against the `scanoss/crypto-finder` repository to verify the finding:

```shell
# Inspect resolveCallee and its surrounding context
cat -n internal/callgraph/java_parser.go | sed -n '700,800p'

# Find files that reference WildcardImports
fd -e go | xargs grep -l "WildcardImports" | head -10

# Look for tests related to wildcard imports
cat -n internal/callgraph/java_parser_test.go | grep -A 20 -B 5 -i "wildcard"

# Check callsites of resolveCallee and how WildcardImports is populated
rg "resolveCallee" -A 3 -B 3
rg "WildcardImports" -B 5 -A 5
rg "WildcardImports\[" -B 2 -A 2

# Check for documentation on intended behavior with multiple wildcards
rg "multiple.*wildcard|wildcard.*multiple" -i
```


Order-dependent wildcard import resolution pins unresolved classes to first matching prefix.

Lines 761-771 return immediately on the first wildcard import without checking whether the class actually exists in that package. With multiple wildcard imports, any unresolved Foo.bar() gets pinned to whichever prefix appeared first in the import list, making dependency attribution sensitive to source order and producing order-dependent call edges.

The fix should only use this wildcard fallback when exactly one wildcard import is present:

🔧 Safer fallback

```diff
-	for _, prefix := range analysis.WildcardImports {
-		// Best-effort: assume the class comes from this wildcard import
-		// For crypto detection, this works well since crypto classes are typically
-		// imported via wildcards (import java.security.*)
-		return FunctionID{
-			Package: prefix,
-			Type:    simpleClass,
-			Name:    method,
-		}
+	if len(analysis.WildcardImports) == 1 {
+		return FunctionID{
+			Package: analysis.WildcardImports[0],
+			Type:    simpleClass,
+			Name:    method,
+		}
 	}
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```go
	// 3. Check wildcard imports — try each prefix
	if len(analysis.WildcardImports) == 1 {
		return FunctionID{
			Package: analysis.WildcardImports[0],
			Type:    simpleClass,
			Name:    method,
		}
	}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/callgraph/java_parser.go` around lines 761 - 771, the current loop over analysis.WildcardImports returns a FunctionID for the first prefix unconditionally, causing order-dependent resolution. Change the behavior so that the wildcard-based FunctionID is constructed and returned only when there is exactly one wildcard import (check len(analysis.WildcardImports) == 1); otherwise do not return from the loop and let the unresolved class remain unresolved. Update the logic that currently returns a FunctionID with Package: prefix, Type: simpleClass, Name: method so that it only does so in the single-wildcard case, leaving resolution to other checks when there are multiple wildcard imports.

@matiasdaloia matiasdaloia force-pushed the feature/mdaloia/dependency-scanning branch from 5025367 to f5a85ae Compare March 10, 2026 10:29
@github-actions

🔍 SCANOSS Code Similarity Detected

📄 1 snippet matches found

🔗 View detailed findings on commit f5a85ae

Files with similarities:

  • internal/callgraph/text_helpers.go

💡 Click the commit link above to see detailed annotations for each match.

@github-actions

SCANOSS SCAN Completed 🚀

  • Detected components: 5
  • Undeclared components: 1
  • Declared components: 4
  • Detected files: 61
  • Detected files undeclared: 1
  • Detected files declared: 60
  • Licenses detected: 5
  • Licenses detected with copyleft: 3
  • Policies: ❌ 1 fail (1 total)

View more details on SCANOSS Action Summary

Add resilient dependency resolution for multi-module Maven projects using a three-tier fallback strategy:

- Tier 1: `mvn dependency:list --fail-never -DappendOutput=true` collects partial results from reactor builds where some modules fail
- Tier 2: Per-module resolution (`-pl <module>`) isolates each module independently when the reactor yields zero results
- Tier 3: `mvn install -DskipTests` builds modules locally to satisfy inter-module dependencies, then retries Tier 1

Multi-module projects are detected from `<modules>` in the parent pom.xml. All modules are registered as WorkspaceMembers so they are treated as user code for call chain tracing. Added `--fail-never` to `dependency:tree` for consistent partial graph collection.

Tested against eladmin (a 6-module project): went from 0 dependencies (complete failure) to 160 resolved (157 with sources) using Tier 1 alone.
@github-actions

🔍 SCANOSS Code Similarity Detected

📄 1 snippet matches found

🔗 View detailed findings on commit 4fe1a6f

Files with similarities:

  • internal/callgraph/text_helpers.go

💡 Click the commit link above to see detailed annotations for each match.


@github-actions

🔍 SCANOSS Code Similarity Detected

📄 1 snippet matches found

🔗 View detailed findings on commit 044b8b3

Files with similarities:

  • internal/callgraph/text_helpers.go

💡 Click the commit link above to see detailed annotations for each match.



Labels

enhancement New feature or request

1 participant