
fix: update ribodetector GPU container to modern PyTorch/CUDA #11197

Draft
pinin4fjords wants to merge 1 commit into master from fix/ribodetector-gpu-cuda12


pinin4fjords commented Apr 15, 2026

Summary

  • Update ribodetector GPU container from PyTorch 1.11.0 (CUDA 11.1, March 2022) to PyTorch 2.10.0 (CUDA 12.9)
  • Pin cuda-version>=12,<13 in environment.gpu.yml to keep the solver within supported CUDA versions
  • Add containers section to meta.yml with all platform variants (amd64, arm64, CUDA 12, CUDA 11.8)
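To illustrate which versions the `cuda-version>=12,<13` pin admits, here is a minimal Python sketch. It is not how conda evaluates match specs; `parse_version` and `satisfies_pin` are hypothetical helpers that only handle this one numeric range.

```python
def parse_version(text):
    """Turn a dotted version such as '12.9' into (12, 9) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def satisfies_pin(cuda_version, lower="12", upper="13"):
    """Return True when lower <= cuda_version < upper (hypothetical helper)."""
    version = parse_version(cuda_version)
    return parse_version(lower) <= version < parse_version(upper)
```

Under this reading, the new container's CUDA 12.9 falls inside the pin, while 11.8 and any 13.x release fall outside it.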

Context

The old GPU container used PyTorch 1.11.0 because it was the last version whose conda dependencies didn't require the __cuda virtual package, which is absent on Wave's GPU-less build servers. Newer pytorch-gpu versions all fail to solve unless CONDA_OVERRIDE_CUDA is set. Wave now handles this automatically via a two-pass solve: if the first attempt fails because __cuda is missing, it retries with CONDA_OVERRIDE_CUDA set (seqeralabs/wave#1027).
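The two-pass idea can be sketched in Python as follows. This is not Wave's implementation: `two_pass_solve`, `MissingCudaError`, and `fake_solve` are hypothetical names, the solver is injected as a plain callable, and the override value `"12.9"` is an assumption chosen for illustration. Only the `CONDA_OVERRIDE_CUDA` variable name comes from the description above.

```python
class MissingCudaError(Exception):
    """Raised by the hypothetical solver when __cuda cannot be satisfied."""


def two_pass_solve(solve, spec, base_env=None, cuda_override="12.9"):
    """Solve once as-is; if __cuda is missing, retry with CONDA_OVERRIDE_CUDA set."""
    env = dict(base_env or {})
    try:
        return solve(spec, env)
    except MissingCudaError:
        # Second pass: tell the solver to assume a CUDA version is present.
        retry_env = dict(env)
        retry_env["CONDA_OVERRIDE_CUDA"] = cuda_override
        return solve(spec, retry_env)


def fake_solve(spec, env):
    """Stand-in solver: GPU specs fail unless the override is present."""
    if "gpu" in spec and "CONDA_OVERRIDE_CUDA" not in env:
        raise MissingCudaError(spec)
    return (spec, env.get("CONDA_OVERRIDE_CUDA"))
```

With the stand-in solver, a CPU-only spec succeeds on the first pass and never sees the override, while a GPU spec fails once and succeeds on the retry.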

With Wave's fix, we can now build containers with current PyTorch. The CUDA 12.x container is the default (covers any NVIDIA driver supporting CUDA 12.0+). A CUDA 11.8 alternative is recorded in meta.yml for users with older drivers.

RFC: GPU container variants in meta.yml

GPU containers are tied to a CUDA major version. Within a major version, a container runs on any driver that supports that major version, so only a couple of variants matter. This PR proposes extending the existing containers block (see fastqc, multiqc) with CUDA-versioned platform keys so that pipeline developers have pre-built URIs documented when they need to offer users a choice:

containers:
  docker:
    linux/amd64: ...
    linux/arm64: ...
    linux/amd64+cuda12: ...    # default GPU container
    linux/amd64+cuda11: ...    # alternative for older drivers
  singularity:
    linux/amd64: ...
    linux/arm64: ...
    linux/amd64+cuda12: ...
    linux/amd64+cuda11: ...

The +cuda12/+cuda11 suffix convention is new. Open to feedback on the naming.
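As a sketch of how a pipeline developer might consume these keys, the Python below picks the newest CUDA variant whose major version does not exceed what the user's driver supports, and falls back to the plain platform entry otherwise. `pick_container` is a hypothetical helper, the selection rule is an assumption rather than part of this proposal, and the URI strings are placeholders standing in for the elided values.

```python
# Placeholder URIs: the real values are elided in the proposal above.
EXAMPLE_CONTAINERS = {
    "docker": {
        "linux/amd64": "placeholder-cpu-amd64",
        "linux/arm64": "placeholder-cpu-arm64",
        "linux/amd64+cuda12": "placeholder-cuda12",
        "linux/amd64+cuda11": "placeholder-cuda11",
    }
}


def pick_container(containers, engine, driver_cuda_major, arch="linux/amd64"):
    """Return (key, uri) for the newest CUDA variant the driver can run."""
    variants = containers[engine]
    prefix = arch + "+cuda"
    candidates = []
    for key in variants:
        if key.startswith(prefix):
            major = int(key[len(prefix):])
            if major <= driver_cuda_major:
                candidates.append((major, key))
    if candidates:
        _, key = max(candidates)
        return key, variants[key]
    # No usable GPU variant: fall back to the plain platform entry.
    return arch, variants[arch]
```

A driver supporting CUDA 12 resolves to the `linux/amd64+cuda12` entry, one limited to CUDA 11 resolves to `linux/amd64+cuda11`, and anything older falls back to `linux/amd64`.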

Related

pinin4fjords force-pushed the fix/ribodetector-gpu-cuda12 branch 4 times, most recently from 3897f66 to 81405b5 on April 16, 2026 at 11:11
pinin4fjords changed the title from "fix: update ribodetector GPU containers to CUDA 12.x" to "fix: update ribodetector GPU container to modern PyTorch/CUDA" on Apr 16, 2026
- Update GPU container from PyTorch 1.11.0 (CUDA 11.1) to PyTorch
  2.10.0 (CUDA 12.9). Pin cuda-version>=12,<13 in environment.gpu.yml.
- Add containers section to meta.yml with CUDA 12.x (default) and
  CUDA 11.8 alternatives following the existing platform key convention.
- Built via Wave v2 template (seqeralabs/wave#1027).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
