Closed
100 changes: 100 additions & 0 deletions .github/workflows/banned-files.yaml
@@ -0,0 +1,100 @@
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

name: banned-files

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read

jobs:
  check:
    runs-on: ubuntu-latest
    timeout-minutes: 2
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Check for banned files
        run: |
          # Banned file patterns and reasons.
          # Add new entries here: one quoted pattern and its quoted reason per line.
          BANNED_PATTERNS=(
            '.env'                    'May contain API keys or secrets'
            '.env.*'                  'May contain API keys or secrets'
            '.envrc'                  'direnv secrets file'
            '*.pem'                   'Private key or certificate'
            '*.key'                   'Private key'
            '*.p12'                   'PKCS#12 keystore'
            '*.pfx'                   'PKCS#12 keystore'
            '*.jks'                   'Java keystore'
            '*.keystore'              'Keystore file'
            'credentials.json'        'Cloud provider credentials'
            'service-account*.json'   'Cloud provider service account'
            '.npmrc'                  'May contain registry tokens'
            '.netrc'                  'May contain login credentials'
            '*_rsa'                   'SSH private key'
            '*_ed25519'               'SSH private key'
            '*_ecdsa'                 'SSH private key'
            '.DS_Store'               'macOS Finder metadata'
            'Thumbs.db'               'Windows Explorer metadata'
            'desktop.ini'             'Windows Explorer metadata'
            '*.pyc'                   'Python bytecode'
          )
Comment on lines +27 to +48

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify deny-list coverage and matching strategy
rg -n --type=yaml 'node_modules|dist|__pycache__|basename=\$\(basename "\$file"\)|\[\[ "\$basename" == \$pattern \]\]' .github/workflows/banned-files.yaml

Repository: NVIDIA/NemoClaw

Length of output: 160


🏁 Script executed:

cat -n .github/workflows/banned-files.yaml

Repository: NVIDIA/NemoClaw

Length of output: 4429


Directory build artifacts should be explicitly blocked in the deny list with full-path matching.

The BANNED_PATTERNS list is missing node_modules/, dist/, and __pycache__/ entries. Additionally, line 79 only checks the basename, which prevents path-based patterns from working. A directory artifact like node_modules/package/index.js would not be caught since the basename check only sees index.js.

Add the missing directory patterns and update the matching logic to check the full file path:

Suggested fix
           BANNED_PATTERNS=(
             '*.pyc'                   'Python bytecode'
+            'node_modules/*'          'Dependency vendor directory'
+            'dist/*'                  'Build artifact directory'
+            '__pycache__/*'           'Python cache directory'
           )
             basename=$(basename "$file")
-            if [[ "$basename" == $pattern ]]; then
+            if [[ "$file" == $pattern || "$basename" == $pattern ]]; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/banned-files.yaml around lines 27 - 48, add the directory
patterns 'node_modules/*', 'dist/*', and '__pycache__/*' to the BANNED_PATTERNS
array, and change the matching logic, which currently tests only the basename,
so that it also tests the full file path. Specifically, extend the
[[ "$basename" == $pattern ]] check with "$file" == $pattern so that paths such
as node_modules/package/index.js are caught.
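
The difference between basename matching and full-path matching can be checked locally. The sketch below is not part of the workflow: `matches_basename` and `matches_path` are hypothetical helpers, and the `'*/node_modules/*'` pattern is an extra assumption added to show what happens with nested directories.

```bash
#!/usr/bin/env bash
# Sketch: compare basename-only matching with full-path matching.
# matches_basename and matches_path are hypothetical helpers, not workflow code.

matches_basename() {
  local file=$1 pattern=$2
  local base
  base=$(basename "$file")
  # The right-hand side is left unquoted so bash treats it as a glob.
  [[ "$base" == $pattern ]]
}

matches_path() {
  local file=$1 pattern=$2
  # Inside [[ ]], '*' also matches '/' characters.
  [[ "$file" == $pattern ]]
}

for file in node_modules/package/index.js packages/app/node_modules/x.js; do
  for pattern in 'node_modules/*' '*/node_modules/*'; do
    if matches_path "$file" "$pattern"; then
      echo "match:    $file ~ $pattern"
    else
      echo "no match: $file ~ $pattern"
    fi
  done
done
```

Note that a pattern such as 'node_modules/*' is anchored at the start of the path, so it only covers a top-level directory. A nested node_modules directory would need an additional pattern along the lines of '*/node_modules/*'.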


          # Paths exempt from the check (test fixtures, etc.)
          EXEMPT_DIRS="^(test/fixtures/|testdata/)"

          # Get files changed in this PR
          CHANGED=$(git diff --name-only --diff-filter=ACR "${{ github.event.pull_request.base.sha }}" HEAD)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify current diff-filter usage in workflow
rg -n --type=yaml 'git diff --name-only --diff-filter=' .github/workflows/banned-files.yaml

Repository: NVIDIA/NemoClaw

Length of output: 170


🏁 Script executed:

cat -n .github/workflows/banned-files.yaml | head -80

Repository: NVIDIA/NemoClaw

Length of output: 3596


🏁 Script executed:

cat -n .github/workflows/banned-files.yaml | tail -30

Repository: NVIDIA/NemoClaw

Length of output: 1364


🏁 Script executed:

# Check if there are any banned files already tracked in the repository
git ls-files | grep -E '\.(env|pem|key|p12|pfx|jks|keystore)$|credentials\.json|service-account.*\.json|\.npmrc|\.netrc|_rsa|_ed25519|_ecdsa|\.DS_Store|Thumbs\.db|desktop\.ini|\.pyc' | head -20

Repository: NVIDIA/NemoClaw

Length of output: 41


🏁 Script executed:

# Check the git diff behavior difference between HEAD and head.sha in the context
# Let's also verify how the checkout action handles the ref
git rev-parse HEAD
git log --oneline -1

Repository: NVIDIA/NemoClaw

Length of output: 179


🏁 Script executed:

# Let's verify what the diff-filter options mean and check the actual behavior
# in a PR context. Let's also see if there are any comments or docs about this choice
git log -p --follow -S 'diff-filter=ACR' -- .github/workflows/banned-files.yaml | head -100

Repository: NVIDIA/NemoClaw

Length of output: 41


🏁 Script executed:

# Let's check if there are any other workflow files or documentation that explain the design decision
find .github -type f -name "*.yaml" -o -name "*.yml" | xargs grep -l "diff-filter" 2>/dev/null

Repository: NVIDIA/NemoClaw

Length of output: 94


Include modified files in the scan set.

Line 54 uses --diff-filter=ACR, which excludes M (Modified). If a banned file were already tracked in the repository (either accidentally or from before this check existed), modifications to it would bypass this check. Include M in the filter to catch all changes, including modifications to banned files.

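Additionally, replace HEAD with ${{ github.event.pull_request.head.sha }} so the diff uses an explicit SHA. On pull_request events, actions/checkout checks out the PR merge commit by default, so HEAD is not necessarily the PR head commit.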

Suggested fix
-          CHANGED=$(git diff --name-only --diff-filter=ACR "${{ github.event.pull_request.base.sha }}" HEAD)
+          CHANGED=$(git diff --name-only --diff-filter=ACMR "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/banned-files.yaml at line 54, update the git diff command
assigned to CHANGED: add M to --diff-filter (ACMR) so that modified files are
scanned, and replace HEAD with the explicit PR head SHA
(${{ github.event.pull_request.head.sha }}). The resulting line is
CHANGED=$(git diff --name-only --diff-filter=ACMR "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}").
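
The effect of the filter letters can be reproduced in a throwaway repository. This is a minimal sketch, assuming git is installed. The repository, the commit identity, and the file names are made up for the demo and do not come from the PR.

```bash
#!/usr/bin/env bash
# Sketch: --diff-filter=ACR skips modified files, while ACMR reports them.
# The temporary repository and file names below are made up for the demo.
set -euo pipefail

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"

git init -q
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

# Base commit: a banned file is already tracked.
echo '{"token": "old"}' > credentials.json
git add credentials.json
commit "base"
base=$(git rev-parse HEAD)

# Head commit: modify the tracked file and add a new one.
echo '{"token": "new"}' > credentials.json
echo "key material" > server.pem
git add credentials.json server.pem
commit "head"
head=$(git rev-parse HEAD)

acr=$(git diff --name-only --diff-filter=ACR "$base" "$head")
acmr=$(git diff --name-only --diff-filter=ACMR "$base" "$head")

echo "ACR reports:"
echo "$acr"
echo "ACMR reports:"
echo "$acmr"
```

With ACR only the newly added server.pem is listed. With ACMR the modified credentials.json is listed as well, which is the case the comment above is concerned with.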


          if [ -z "$CHANGED" ]; then
            echo "No changed files to check."
            exit 0
          fi

          FAILED=0
          VIOLATIONS=""

          i=0
          while [ $i -lt ${#BANNED_PATTERNS[@]} ]; do
            pattern="${BANNED_PATTERNS[$i]}"
            reason="${BANNED_PATTERNS[$((i+1))]}"
            i=$((i+2))

            # Check every changed file against this pattern.
            while IFS= read -r file; do
              # Skip exempt paths
              if echo "$file" | grep -qE "$EXEMPT_DIRS"; then
                continue
              fi

              # Match the basename against the glob pattern
              basename=$(basename "$file")
              if [[ "$basename" == $pattern ]]; then
                VIOLATIONS="${VIOLATIONS}\n  ❌ ${file}: ${reason}"
                FAILED=1
              fi
            done <<< "$CHANGED"
          done

          if [ "$FAILED" -eq 1 ]; then
            echo ""
            echo "============================================"
            echo " Banned files detected in this PR"
            echo "============================================"
            echo -e "$VIOLATIONS"
            echo ""
            echo "These files should not be committed to the repository."
            echo "If this is a false positive (e.g., a test fixture),"
            echo "place the file under test/fixtures/ or testdata/."
            echo ""
            exit 1
          fi

          echo "✅ No banned files found."
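
To try the matching logic outside of CI, the pairwise pattern/reason loop can be wrapped in a function that reads paths from stdin. This is only a sketch: `find_banned` is a hypothetical helper, the pattern list is shortened, and the loops are inverted relative to the workflow (files outer, patterns inner), so results come out grouped by file rather than by pattern.

```bash
#!/usr/bin/env bash
# Sketch: the workflow's pattern/reason pairs applied to paths read from stdin.
# find_banned is a hypothetical helper; the pattern list is shortened.

BANNED_PATTERNS=(
  '.env'   'May contain API keys or secrets'
  '*.pem'  'Private key or certificate'
  '*.pyc'  'Python bytecode'
)
EXEMPT_DIRS='^(test/fixtures/|testdata/)'

find_banned() {
  local file base i
  while IFS= read -r file; do
    # Skip exempt paths, as the workflow does.
    if [[ "$file" =~ $EXEMPT_DIRS ]]; then
      continue
    fi
    base=${file##*/}
    # Elements alternate: even index is the glob, odd index is the reason.
    for ((i = 0; i < ${#BANNED_PATTERNS[@]}; i += 2)); do
      if [[ "$base" == ${BANNED_PATTERNS[i]} ]]; then
        printf '%s\t%s\n' "$file" "${BANNED_PATTERNS[i+1]}"
      fi
    done
  done
}

printf '%s\n' src/app.py config/.env test/fixtures/dummy.pem certs/server.pem | find_banned
```

The function prints one tab-separated line per violation (path, then reason). The file under test/fixtures/ is skipped by the exemption even though it matches '*.pem'.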
3 changes: 3 additions & 0 deletions .gitignore
@@ -24,6 +24,9 @@ draft_newsletter_*
*_ed25519
*_ecdsa
credentials.json
service-account*.json
Thumbs.db
desktop.ini

# Security: prevent accidental commit of disclosure drafts
DRAFT-*.md