BLD: Refine cuda build #156

Merged
codingl2k1 merged 7 commits into main from bld/refine_cuda_build on Jun 11, 2026

@codingl2k1

Fixes: #125 #118

Removed the detect_cuda_architectures function and updated CUDA architecture handling to default to 'all' if not set in the environment.
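
The default described above can be sketched roughly as follows. This is a minimal illustration, not the code from scripts/build.py: the helper names `resolve_cuda_architectures` and `cmake_cuda_arch_flag` are hypothetical, and forwarding the value to CMake through `CMAKE_CUDA_ARCHITECTURES` is an assumption about how the build script consumes it.

```python
import os


def resolve_cuda_architectures(env=None):
    """Return the CUDA architecture setting, falling back to 'all'.

    Hypothetical helper: reads CUDA_ARCHITECTURES from the given mapping
    (os.environ by default) and treats an empty value as unset.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_ARCHITECTURES", "").strip()
    return value or "all"


def cmake_cuda_arch_flag(env=None):
    """Build the CMake definition for the resolved architectures.

    Assumption: the build passes the value via CMAKE_CUDA_ARCHITECTURES.
    """
    return f"-DCMAKE_CUDA_ARCHITECTURES={resolve_cuda_architectures(env)}"
```

With no variable set, the flag targets every architecture the toolkit supports; setting `CUDA_ARCHITECTURES=86`, for instance, would narrow the build to that one target.
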
Updated delvewheel repair command to include CUDA path adjustments for Windows.
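
As a rough idea of what a CUDA path adjustment for delvewheel might look like, the sketch below assembles a repair command that adds a CUDA directory to the DLL search path. The function name `delvewheel_repair_command` is hypothetical, and the assumption that the needed DLLs sit in a `bin` directory under the CUDA install root may not match every toolkit layout or the actual change in this PR.

```python
import os


def delvewheel_repair_command(wheel, dest_dir, cuda_path):
    """Return an argv list for `delvewheel repair` with a CUDA search path.

    Hypothetical helper. Assumes the CUDA runtime DLLs live in
    <cuda_path>/bin; adjust if the toolkit uses a different layout.
    """
    cuda_bin = os.path.join(cuda_path, "bin")
    return [
        "delvewheel",
        "repair",
        "--add-path", cuda_bin,  # extra directory searched for DLLs
        "-w", dest_dir,          # directory that receives the repaired wheel
        wheel,
    ]
```

On Windows the install root is commonly available in the `CUDA_PATH` environment variable, so a caller could pass `os.environ["CUDA_PATH"]` as `cuda_path` and hand the resulting list to `subprocess.run`.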

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request simplifies the build script by removing the custom detect_cuda_architectures function and defaulting to CMake's "all" keyword when CUDA_ARCHITECTURES is not set. The reviewer suggests using "native" instead of "all" as a fallback for local builds to avoid excessively long compilation times and resource usage by targeting only the host's GPU architecture.
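
One way to reconcile the two fallbacks is to pick "native" for local builds and "all" for release builds. The sketch below is only an illustration of that idea, not the logic that was merged: the function name `fallback_cuda_architectures` is hypothetical, and using the `CI` environment variable to distinguish release builds from local ones is an assumption.

```python
import os


def fallback_cuda_architectures(env=None):
    """Choose a CUDA architecture value when none is set explicitly.

    Hypothetical policy: an explicit CUDA_ARCHITECTURES always wins;
    otherwise CI builds target 'all' and local builds target 'native'.
    """
    env = os.environ if env is None else env
    explicit = env.get("CUDA_ARCHITECTURES", "").strip()
    if explicit:
        return explicit
    # Assumption: CI services export CI=true (GitHub Actions does).
    if env.get("CI", "").strip().lower() in ("1", "true"):
        return "all"
    # 'native' compiles only for the GPUs present on the build host.
    return "native"
```

Both keywords depend on the CMake version: `all` is accepted by `CMAKE_CUDA_ARCHITECTURES` from CMake 3.23 and `native` from CMake 3.24, so a build that relies on them needs a matching minimum version.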

Comment thread scripts/build.py Outdated
codingl2k1 and others added 4 commits June 11, 2026 11:37
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Update fallback behavior for CUDA architectures in build script.
Updated CUDA requirements for Linux and Windows, added CUDA GPU architecture coverage section with details on supported architectures and compatibility.
Enhanced error handling for network issues during file downloads.
@codingl2k1 codingl2k1 merged commit 0d523bd into main Jun 11, 2026
13 of 14 checks passed
@iwr-redmond

Great idea 👍

Development

Successfully merging this pull request may close these issues.

[Bug] Windows 10 + Xinference + llama.cpp + GTX 1060: Model crashes during warmup (ServerClosed) after switching to xllamacpp for GPU