2 changes: 1 addition & 1 deletion worker/Dockerfile
@@ -30,7 +30,7 @@ COPY worker/agent/ ./
RUN CGO_ENABLED=0 GOOS=linux go build -trimpath -ldflags="-s -w" \
-o /out/scaleplex-agent ./...

-FROM nvidia/cuda:12.6.3-runtime-ubuntu24.04
+FROM nvidia/cuda:13.2.1-runtime-ubuntu24.04
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read-only repo reconnaissance: verify whether a CUDA/NVENC smoke check already exists.
# Expectation: a workflow/job/script explicitly validates CUDA hw init or NVENC path.

set -euo pipefail

echo "== Search workflows/scripts for CUDA/NVENC smoke checks =="
rg -n -C2 --iglob '*.yml' --iglob '*.yaml' --iglob '*.sh' \
  'init_hw_device|hwaccel|nvenc|nvdec|nvidia-smi|NVIDIA_DRIVER_CAPABILITIES|/dev/nvidia0|ffmpeg.*cuda'

echo
echo "== Search Dockerfile/worker docs for declared verification steps =="
rg -n -C2 'cuda|nvenc|nvdec|smoke|validation|compat' worker/Dockerfile worker/ README.md

Repository: Varashi/scaleplex

Length of output: 50374


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check CI/workflows for CUDA/NVENC smoke checks =="
if [ -d .github ]; then
  rg -n --hidden --glob '!.git/*' \
    '(init_hw_device|hwaccel|nvenc|nvdec|nvidia-smi|NVIDIA_DRIVER_CAPABILITIES|/dev/nvidia0|ffmpeg.*cuda)' \
    .github/workflows .github/actions .github/scripts 2>/dev/null || true
else
  echo "No .github directory"
fi

echo
echo "== Search for nvidia-smi / hwdevice init across repo (trimmed) =="
rg -n --hidden --glob '!.git/*' \
  '(nvidia-smi|init_hw_device|filter_hw_device|hwaccel_output_format|tonemap_cuda|scale_cuda)' \
  --glob '!worker/agent/testdata/**' --glob '!worker/agent/testdata/*.json' \
  --glob '!worker/agent/testdata/**/**.json' .

echo
echo "== Inspect worker/Dockerfile header comments around lines 10-20 =="
nl -ba worker/Dockerfile | sed -n '1,60p'

echo
echo "== Inspect NVENC-related unit tests that mention -init_hw_device =="
rg -n --glob 'worker/agent/*test*.go' \
  'init_hw_device|filter_hw_device|scale_cuda|tonemap_cuda' worker/agent | head -n 200

Repository: Varashi/scaleplex

Length of output: 50435


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== List common CI entrypoints =="
ls -la .github 2>/dev/null || echo "no .github dir"
ls -la . 2>/dev/null | head -n 200
find . -maxdepth 3 -type f \( -name "*.yml" -o -name "*.yaml" \) \
  \( -path "./.github/*" -o -name "Dockerfile" -o -path "./deploy/*" -o -path "./scripts/*" \) 2>/dev/null | head -n 200

echo
echo "== Show GitHub workflows files (if any) =="
if [ -d .github/workflows ]; then
  find .github/workflows -type f -maxdepth 2 \( -name "*.yml" -o -name "*.yaml" \) | sed -n '1,200p'
else
  echo "no .github/workflows"
fi

echo
echo "== Search workflows/scripts for CUDA/NVENC smoke hooks (GPU runtime check keywords) =="
rg -n --hidden --glob '!.git/*' \
  '(nvidia-smi|NVIDIA_DRIVER_CAPABILITIES|/dev/nvidia0|ffmpeg.*cuda|init_hw_device.*cuda|filter_hw_device.*cuda|hwaccel_output_format:0 *cuda)' \
  .github/workflows .github/actions .github/scripts 2>/dev/null || true

echo
echo "== Inspect worker/Dockerfile around header + FROM lines =="
sed -n '1,80p' worker/Dockerfile | cat -n | sed -n '1,80p'

echo
echo "== Confirm existing CUDA-related tests (unit vs integration) =="
rg -n --glob 'worker/agent/*test*.go' 'init_hw_device.+cuda|filter_hw_device.+cuda|scale_cuda|tonemap_cuda' worker/agent \
  | head -n 200

Repository: Varashi/scaleplex

Length of output: 12440


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Search within workflows for ffmpeg/cuda/nvenc/nvidia keywords =="
rg -n --glob '*.yml' --glob '*.yaml' \
  '(ffmpeg.*cuda|init_hw_device|filter_hw_device.*cuda|nvenc|nvdec|nvidia-smi|/dev/nvidia0|NVIDIA_VISIBLE_DEVICES|NVIDIA_DRIVER_CAPABILITIES|hwaccel_output_format:0\s*cuda)' \
  .github/workflows

echo
echo "== Inspect docker-compose-smoke workflow (top 200 lines) =="
sed -n '1,220p' .github/workflows/docker-compose-smoke.yaml | cat -n

echo
echo "== Inspect build-worker workflow (top 200 lines) =="
sed -n '1,220p' .github/workflows/build-worker.yaml | cat -n

echo
echo "== Inspect build-ffmpeg workflow (top 220 lines) =="
sed -n '1,220p' .github/workflows/build-ffmpeg.yaml | cat -n

Repository: Varashi/scaleplex

Length of output: 129


🏁 Script executed:

#!/bin/bash
set -euo pipefail

for f in \
  .github/workflows/docker-compose-smoke.yaml \
  .github/workflows/build-worker.yaml \
  .github/workflows/build-orchestrator.yaml \
  .github/workflows/build-shim.yaml \
  .github/workflows/build-ffmpeg.yaml \
  .github/workflows/build-ffmpeg-deps.yaml
do
  echo "===== $f (line count) ====="
  wc -l "$f" || true
  echo
  echo "===== $f (content) ====="
  sed -n '1,260p' "$f" | cat -n
  echo
done

echo "== Search for container/GPU smoke signals across workflows (looser patterns) =="
rg -n --glob '*.yml' --glob '*.yaml' \
  '(docker\s+compose|compose|up -d|nvidia|NVIDIA|gpu|cud(a|nn)|cuda|nvidia-smi|/dev/nvidia|ffmpeg|ffprobe|smoke|smoketest|smoke-test|hwaccel|nvenc|nvdec)' \
  .github/workflows

Repository: Varashi/scaleplex

Length of output: 24574


Add CUDA device-init smoke coverage for the CUDA runtime base bump

FROM nvidia/cuda:13.2.1-runtime-ubuntu24.04
  • CI does not currently exercise CUDA hardware init: build-worker.yaml runs replay with REPLAY_NO_FFMPEG=1 (rewriter-only), and docker-compose-smoke.yaml omits the worker because GitHub-hosted runners lack GPU passthrough. As a result, neither ffmpeg -init_hw_device cuda=... nor CUDA surface init is ever reached.
  • Add a CI step on a self-hosted GPU runner that runs a minimal in-container ffmpeg command reaching the CUDA path (for example, -init_hw_device cuda=... plus a CUDA filter such as scale_cuda or tonemap_cuda) and fails on init errors.
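
A minimal sketch of such a smoke step follows. The image tag `scaleplex-worker:local`, the binary name `ffmpeg`, the `RUN_SMOKE` switch, and the `run_smoke` helper are assumptions for illustration, not names taken from the repository. It also assumes the runner has the NVIDIA Container Toolkit installed so that `docker run --gpus all` works, and that the image's ffmpeg build includes `scale_cuda` and `h264_nvenc`.

```shell
#!/bin/bash
set -euo pipefail

# Assumptions (not taken from the repo): image tag and ffmpeg binary name.
IMAGE="${IMAGE:-scaleplex-worker:local}"
FFMPEG_BIN="${FFMPEG_BIN:-ffmpeg}"

# Run the smoke command, optionally prefixed by "$@".
# `run_smoke echo` prints the command instead of executing it (dry run).
run_smoke() {
  "$@" docker run --rm --gpus all --entrypoint "$FFMPEG_BIN" "$IMAGE" \
    -hide_banner -v error \
    -init_hw_device cuda=cu:0 -filter_hw_device cu \
    -f lavfi -i testsrc2=size=1280x720:rate=30:duration=1 \
    -vf format=nv12,hwupload,scale_cuda=640:360 \
    -c:v h264_nvenc -f null -
}

if [ "${RUN_SMOKE:-0}" = "1" ]; then
  run_smoke          # real run: a nonzero ffmpeg exit fails the CI step
else
  run_smoke echo     # dry run: print the command only
fi
```

On a GPU runner the step would set `RUN_SMOKE=1`. Without it the script only prints the command, so it stays harmless on GitHub-hosted runners that have no GPU.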

Fix stale Dockerfile header base version

  • worker/Dockerfile header comment still says nvidia/cuda:12.6.3-runtime-ubuntu24.04 while the image now uses nvidia/cuda:13.2.1-runtime-ubuntu24.04. Update the header text to match.
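
One way to keep the header from drifting again is a small lint that fails when a file mentions more than one distinct `nvidia/cuda:<tag>`. This is a sketch under assumptions: the helper names `cuda_tags` and `check_single_cuda_tag` are hypothetical, and it assumes the Dockerfile is meant to reference a single CUDA base tag in both the header comment and the FROM line.

```shell
#!/bin/bash
set -euo pipefail

# Print every distinct nvidia/cuda:<tag> reference found in the given file.
cuda_tags() {
  { grep -oE 'nvidia/cuda:[A-Za-z0-9._-]+' "$1" || true; } | sort -u
}

# Succeed only when the file mentions exactly one distinct CUDA base tag.
check_single_cuda_tag() {
  local count
  count="$(cuda_tags "$1" | wc -l)"
  [ "$count" -eq 1 ]
}

# When called with a path, lint that file and report any mismatch.
if [ "$#" -ge 1 ]; then
  if ! check_single_cuda_tag "$1"; then
    echo "mismatched CUDA base tags in $1:" >&2
    cuda_tags "$1" >&2
    exit 1
  fi
fi
```

Run against `worker/Dockerfile`, this would have flagged the stale 12.6.3 header next to the 13.2.1 FROM line.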
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@worker/Dockerfile` at line 33, the Dockerfile comment header is stale and CI
lacks coverage for CUDA runtime init after bumping FROM in worker/Dockerfile to
nvidia/cuda:13.2.1-runtime-ubuntu24.04. Update the top-of-file header text to
reflect the new base image. Add a CI smoke step on a self-hosted GPU runner
(e.g., in build-worker.yaml) that runs a minimal in-container ffmpeg invocation
using -init_hw_device cuda=... plus a CUDA filter such as scale_cuda or
tonemap_cuda and fails on nonzero exit, so CUDA hw init is exercised. Ensure the
step does not set REPLAY_NO_FFMPEG=1, and add a short docker-compose or docker
run that launches the worker image to execute the ffmpeg command. This detects
init errors with the new CUDA runtime while keeping existing non-GPU CI
unchanged (docker-compose-smoke.yaml can remain without the worker for GH
runners).

LABEL org.opencontainers.image.source=https://github.com/Varashi/scaleplex
LABEL org.opencontainers.image.description="scaleplex worker — scaleplex-ffmpeg + Intel VAAPI + NVIDIA NVENC + Go agent"
LABEL org.opencontainers.image.licenses=MIT