feat(onnx): add DirectML/MIGraphX/CoreML execution provider support #119

Merged
raphaelsty merged 3 commits into lightonai:main from noctrex:feat/onnx-execution-providers on Jun 8, 2026

noctrex commented Jun 4, 2026

Contributor

Extend colgrep's ONNX Runtime integration beyond CUDA so the same code path can drive non-NVIDIA GPUs via DirectML (Windows), AMD GPUs via MIGraphX, and Apple Silicon via CoreML, selected by cargo feature. The active backend is now also surfaced in the user-facing Model line.

next-plaid-onnx/src/lib.rs

  • Add impl Display for ExecutionProvider with short canonical labels (CPU, CUDA, DirectML, CoreML, MIGraphX, TensorRT, auto) matching the tokens already used in onnx_runtime.rs download messages.
  • Add test_execution_provider_display alongside the existing EP tests.
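
A minimal sketch of what such a Display impl could look like. The enum below is a hypothetical stand-in: the variant names are assumptions, and only the label strings come from the list above.

```rust
use std::fmt;

/// Hypothetical mirror of the crate's `ExecutionProvider` enum.
/// Variant names are assumed; the real definition in next-plaid-onnx may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionProvider {
    Auto,
    Cpu,
    Cuda,
    TensorRt,
    CoreMl,
    DirectMl,
    MiGraphX,
}

impl fmt::Display for ExecutionProvider {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Short canonical labels, as listed in the PR description.
        let label = match self {
            ExecutionProvider::Auto => "auto",
            ExecutionProvider::Cpu => "CPU",
            ExecutionProvider::Cuda => "CUDA",
            ExecutionProvider::TensorRt => "TensorRT",
            ExecutionProvider::CoreMl => "CoreML",
            ExecutionProvider::DirectMl => "DirectML",
            ExecutionProvider::MiGraphX => "MIGraphX",
        };
        f.write_str(label)
    }
}
```

With this shape, `ExecutionProvider::DirectMl.to_string()` yields `DirectML`, which is the token the Model line and download messages would share.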

colgrep/src/index/mod.rs

  • New cfg-gated IndexBuilder branch for directml/migraphx/coreml features that initializes the runtime and picks the matching ExecutionProvider. Falls through to the existing CPU branch when none of those features are enabled.
  • Searcher::Auto (two sites) extends its fallback chain past CoreML to also consider DirectML and MIGraphX before Cpu.
  • The Model: eprintln! now appends the selected backend in parentheses, e.g. "Model: lightonai/LateOn-Code (DirectML)". Provider is resolved at runtime so a CUDA build that falls back to CPU honestly prints CPU.
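
The fallback chain and the runtime-resolved Model line can be sketched roughly as follows. Everything here is illustrative: `Provider`, `AUTO_CHAIN`, `resolve_provider`, and `model_line` are hypothetical names, the position of CUDA at the head of the chain is an assumption, and the `try_init` closure stands in for whatever check the real code performs when registering a provider with ONNX Runtime.

```rust
/// Hypothetical provider enum for this sketch only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    Cuda,
    CoreMl,
    DirectMl,
    MiGraphX,
    Cpu,
}

impl Provider {
    /// Short label used in the user-facing Model line.
    fn label(self) -> &'static str {
        match self {
            Provider::Cuda => "CUDA",
            Provider::CoreMl => "CoreML",
            Provider::DirectMl => "DirectML",
            Provider::MiGraphX => "MIGraphX",
            Provider::Cpu => "CPU",
        }
    }
}

/// Assumed ordering: DirectML and MIGraphX come after CoreML, CPU is the
/// implicit last resort handled in `resolve_provider`.
const AUTO_CHAIN: [Provider; 4] = [
    Provider::Cuda,
    Provider::CoreMl,
    Provider::DirectMl,
    Provider::MiGraphX,
];

/// Return the first candidate that initializes successfully, or CPU if none do.
fn resolve_provider<F>(candidates: &[Provider], mut try_init: F) -> Provider
where
    F: FnMut(Provider) -> bool,
{
    candidates
        .iter()
        .copied()
        .find(|&p| try_init(p))
        .unwrap_or(Provider::Cpu)
}

/// Build the Model line from the provider that was actually selected.
fn model_line(model: &str, provider: Provider) -> String {
    format!("Model: {} ({})", model, provider.label())
}
```

Because the label is derived from the resolved provider rather than from the enabled cargo feature, a build whose GPU provider fails to initialize would report `(CPU)` in this sketch.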

colgrep/src/onnx_runtime.rs

  • Drop the USE_GPU bool constant in favor of per-feature eprintln!s so each provider gets an accurate label at download time.
  • Route DirectML into the existing "gpu" cache subdir (shared with CUDA) and keep CPU in "cpu".
  • Add a NuGet download path for Microsoft.ML.OnnxRuntime.DirectML on win-x64, since the GitHub release artifacts do not ship DirectML. Other platforms/configs keep using GitHub releases unchanged.
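
As a rough illustration of the routing described above, the sketch below maps a runtime flavor to its cache subdirectory and download source. The type and function names are hypothetical and not taken from onnx_runtime.rs; the `os` and `arch` strings are assumed to follow the values of `std::env::consts::OS` and `std::env::consts::ARCH`.

```rust
/// Hypothetical runtime flavors with a dedicated download in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Runtime {
    Cpu,
    Cuda,
    DirectMl,
}

/// Where the runtime package would be fetched from.
#[derive(Debug, PartialEq, Eq)]
enum DownloadSource {
    GitHubRelease,
    NuGet { package_id: &'static str },
}

/// DirectML shares the "gpu" cache subdirectory with CUDA; CPU keeps "cpu".
fn cache_subdir(rt: Runtime) -> &'static str {
    match rt {
        Runtime::Cpu => "cpu",
        Runtime::Cuda | Runtime::DirectMl => "gpu",
    }
}

/// Only DirectML on Windows x86_64 goes through NuGet; every other
/// combination keeps using the GitHub release artifacts.
fn download_source(rt: Runtime, os: &str, arch: &str) -> DownloadSource {
    match (rt, os, arch) {
        (Runtime::DirectMl, "windows", "x86_64") => DownloadSource::NuGet {
            package_id: "Microsoft.ML.OnnxRuntime.DirectML",
        },
        _ => DownloadSource::GitHubRelease,
    }
}
```

Keeping the source decision in one match makes it easy to see that no existing platform changes behavior: anything that is not DirectML on win-x64 falls into the default arm.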

Note: MIGraphX and CoreML are wired into provider selection but their runtime download still falls through to the CPU package; users on those platforms are expected to supply the matching ORT build out-of-band for now. DirectML is the only new provider with full auto-download support.

Verified:

  • cargo test -p next-plaid-onnx execution_provider (6 passed)
  • cargo build --release -p colgrep --features directml on Windows x86_64
  • cargo build --release -p colgrep (default features)
  • Runtime: colgrep prints "Model: lightonai/LateOn-Code (DirectML)" with --features directml

Disclaimer: This commit was authored with assistance from an LLM.

noctrex force-pushed the feat/onnx-execution-providers branch from c8bf28b to 499df8a on June 4, 2026 19:55

aussetg commented Jun 5, 2026


Did you benchmark each backend? In my own tests, MIGraphX is slow (see #122), but I can't test the rest.

noctrex commented Jun 5, 2026

Contributor Author

@aussetg I am on Windows 11, with an AMD Ryzen 5800X3D CPU and an AMD RX 7900XTX GPU.
I just indexed this very codebase with the LateOn-Code model, using the DirectML backend to run on the GPU.
DirectML is the only GPU backend available on Windows for AMD.

DirectML: 70.87 seconds
CPU: 643.25 seconds

Without this PR, colgrep does not use DirectML at all and always runs on the CPU.

raphaelsty left a comment

Collaborator

Amazing MR thank you 😊

Fixes the Format / Colgrep Crate Format CI jobs. Pure rustfmt
output: wrap long #[cfg] attributes, collapse single-element
vec!, and re-indent the get_download_info block. No logic change.

The CoreML branch in configure_auto_provider was the only execution
provider missing the `if !force_cpu` guard, so is_force_cpu() was
silently ignored on macOS and force_cpu went unread under a
coreml-only build (failing clippy -D warnings). Add the guard so
CoreML respects the override like CUDA/TensorRT/DirectML/MIGraphX.
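
The guard described in this commit message can be illustrated with a reduced sketch. `Backend` and `pick_backend` are hypothetical stand-ins for the CoreML branch of configure_auto_provider, and `coreml_available` is an assumed flag replacing the real provider registration.

```rust
/// Hypothetical two-way choice, reduced to the CoreML branch for clarity.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    CoreMl,
    Cpu,
}

/// Pick CoreML only when the CPU override is off and CoreML is usable.
fn pick_backend(force_cpu: bool, coreml_available: bool) -> Backend {
    // Without the `!force_cpu` check, the override would be ignored whenever
    // CoreML is available, and `force_cpu` would be an unused variable in a
    // coreml-only build.
    if !force_cpu && coreml_available {
        return Backend::CoreMl;
    }
    Backend::Cpu
}
```

In this sketch, `pick_backend(true, true)` returns `Backend::Cpu`, which is the behavior the override is meant to guarantee.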
raphaelsty merged commit 54da546 into lightonai:main on Jun 8, 2026
20 checks passed