Updated to new CUDA version by ljn7 · Pull Request #103 · HawkAaron/warp-transducer

ljn7 · 2025-07-09T11:05:44Z

This PR updates the PyTorch binding and CMake build setup to support newer CUDA versions (e.g., CUDA 12.9).

Changes:

- Updated pytorch_binding/setup.py to work with recent CUDA installations.
- Added support for setting CFLAGS, LDFLAGS, and CUDA_HOME manually when using newer CUDA directory structures.
- Updated README with troubleshooting instructions.
- Ensured compatibility with Python 3.12 and PyTorch compiled for newer environments.
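
The second item above describes pointing the build at a nonstandard CUDA layout through environment variables. A minimal sketch of how a setup script could pick these up, assuming the conventional meanings of CUDA_HOME, CFLAGS, and LDFLAGS. The helper name `cuda_build_paths`, the `/usr/local/cuda` default, and the `include`/`lib64` subdirectories are illustrative assumptions, not code taken from this PR's pytorch_binding/setup.py:

```python
import os
import shlex


def cuda_build_paths(env=None):
    """Collect CUDA-related build settings from environment variables.

    Hypothetical helper for illustration; not taken from the PR.
    """
    env = os.environ if env is None else env
    # Assumed default install prefix; override it by exporting CUDA_HOME.
    cuda_home = env.get("CUDA_HOME", "/usr/local/cuda")
    return {
        "cuda_home": cuda_home,
        # Assumed layout: headers in <prefix>/include, libraries in <prefix>/lib64.
        "include_dirs": [os.path.join(cuda_home, "include")],
        "library_dirs": [os.path.join(cuda_home, "lib64")],
        # Split shell-style so quoted paths containing spaces survive.
        "extra_compile_args": shlex.split(env.get("CFLAGS", "")),
        "extra_link_args": shlex.split(env.get("LDFLAGS", "")),
    }


if __name__ == "__main__":
    for key, value in cuda_build_paths().items():
        print(f"{key}: {value}")
```

Passing the environment as a parameter keeps the helper easy to check without touching the real shell environment; the resulting lists could then be handed to whatever extension class the build uses.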

Pytorch gpu allocator

…rp-transducer)
- Adapted and updated for compatibility and enhancements on my fork
- Updated relevant documentation and options
- Updated CMakeLists.txt to support CUDA and CMake version detection/selection

Detect CUDA to on/off WITH_GPU by default

- Check CUDA availability before declaring it as a project language
- Fall back to a CPU-only build when the CUDA toolkit is not found
- Prevents the "Failed to find nvcc" error on systems without CUDA
- Maintains GPU support when CUDA is properly installed
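
The check described above lives in CMake in this PR; the same idea can be sketched in Python for readers who want to test a machine before building. This is an illustration only: `find_nvcc` is a hypothetical helper, and the `bin/nvcc` location under CUDA_HOME is an assumption about a typical toolkit layout.

```python
import os
import shutil


def find_nvcc(env=None):
    """Return a path to nvcc, or None when no CUDA compiler is found.

    Hypothetical helper for illustration; not taken from the PR.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if cuda_home:
        # Assumed layout: the compiler sits in <CUDA_HOME>/bin.
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Otherwise search the directories listed in PATH.
    return shutil.which("nvcc", path=env.get("PATH", ""))


if __name__ == "__main__":
    nvcc = find_nvcc()
    # Fall back to a CPU-only build instead of failing when nvcc is absent.
    print("GPU build possible:" if nvcc else "CPU-only build:", nvcc)
```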

…atform checks
- Enabled robust detection of CUDA and platform compatibility
- Improved build logic to support CUDA builds on Windows
- Added fallback mechanisms and clearer error handling
- Ensured smoother CPU-only builds without redundant checks
- Reduced platform-specific pain points for easier maintenance

- Added robust detection of the CUDA toolkit using the CUDAToolkit and CUDA packages
- Introduced a WITH_GPU option to allow user-controlled GPU builds
- Automatically fall back to a CPU-only build if CUDA is unavailable or WITH_GPU=OFF
- Improved status messages to reflect user intent and system CUDA availability
- Added support for CUDA architecture detection and override via CMAKE_CUDA_ARCHITECTURES
- Improved build summary output for clarity
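
The interaction between the WITH_GPU option and CUDA availability amounts to a small decision table. A sketch of that logic in Python, offered as a reading aid rather than a transcription of the PR's CMakeLists.txt; the function name `resolve_gpu_build` and the message wording are assumptions:

```python
def resolve_gpu_build(with_gpu, cuda_found):
    """Return (use_gpu, status_message) for a requested/available pair.

    Hypothetical helper mirroring the behavior described in the commit
    message; the actual logic in the PR is written in CMake.
    """
    if not with_gpu:
        return False, "WITH_GPU=OFF: building CPU-only by request"
    if not cuda_found:
        return False, "WITH_GPU=ON but CUDA not found: falling back to CPU-only"
    return True, "WITH_GPU=ON and CUDA found: building with GPU support"


if __name__ == "__main__":
    # Print the full decision table.
    for with_gpu in (True, False):
        for cuda_found in (True, False):
            print(with_gpu, cuda_found, resolve_gpu_build(with_gpu, cuda_found))
```

Only the combination of a GPU request and a detected toolkit yields a GPU build; every other case produces a CPU-only build with a message that says why.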


ljn7 and others added 13 commits July 8, 2025 11:00

Updated for CUDA 11 and 12

15422e1

Update README.md

5cbc6d6

Changed to PyTorch CUDA allocator

a1de641

Requires C++17 for torch 2.1.0 and above and requires packaging

2806d2d
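
The commit title above ties the C++17 requirement to torch 2.1.0 and later. A minimal sketch of such a version gate, assuming the `packaging` dependency is used for version comparison (the PR page does not show how it is used). The names `release_tuple` and `cxx_std_flag`, and the `-std=c++14` choice for older releases, are illustrative assumptions:

```python
import re

try:
    from packaging.version import Version
except ImportError:  # regex fallback so the sketch also runs without packaging
    Version = None


def release_tuple(version_string):
    """Return the leading (major, minor) of a version such as '2.1.0+cu121'."""
    if Version is not None:
        return Version(version_string).release[:2]
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def cxx_std_flag(torch_version):
    """Pick a C++ standard flag for the given torch version (hypothetical helper)."""
    # Assumption: releases before 2.1 are built with C++14.
    return "-std=c++17" if release_tuple(torch_version) >= (2, 1) else "-std=c++14"


if __name__ == "__main__":
    for version in ("1.13.1", "2.0.1", "2.1.0", "2.3.1+cu121"):
        print(version, cxx_std_flag(version))
```

In a real build script the argument would typically be `torch.__version__`.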

Fixed test file for GPU testing

64477ae

Update README.md

a11cedd

pytorch-gpu-allocator

99deacb

Pytorch gpu allocator

Update CMakeLists.txt

d4bdce5

Detect CUDA to on/off WITH_GPU by default

Added fastemit argument

80f8979

ljn7 force-pushed the master branch 2 times, most recently from 1f14512 to a6f8c22 Compare July 26, 2025 15:32

ljn7 and others added 2 commits July 26, 2025 21:09

ci: add CPU-only GitHub Actions workflow for build and test

cd86c03

- Uses cmake and make to build and run test_cpu
- Targets only CPU environments for faster and simpler CI

Updated CMakeLists

4843413

- Updated for compatibility and performance, similar to b-flo's implementation
- Removed TensorFlow and MXNet bindings (might add as git-modules)

ljn7 wants to merge 15 commits into HawkAaron:master from ljn7:master
