[cuda backend] splitk turboquant sdpa for decode #7148

Triggered via pull request June 18, 2026 17:32
Status Success
Total duration 1h 18m 0s
Artifacts 6

cuda-windows.yml

on: pull_request
Get changed files / get-changed-files (3s)
CI run decision / decide (32s)
Matrix: export-model-cuda-windows-artifact
Matrix: test-model-cuda-windows-e2e

Annotations

13 warnings
CI run decision / decide
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@v4. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-windows-artifact (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-windows-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-windows-artifact (nvidia, parakeet-tdt, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-windows-artifact (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-windows-artifact (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile... / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-windows-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-windows-e2e (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / windows-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-windows-e2e (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / windows-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-windows-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / windows-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-windows-e2e (nvidia, parakeet-tdt, quantized-int4-weight-only) / windows-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-windows-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / windows-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-windows-e2e (nvidia, parakeet-tdt, non-quantized) / windows-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

Artifacts

Produced during runtime
Name  Size  Digest
facebook-dinov2-small-imagenet1k-1-layer-cuda-windows-non-quantized  35.5 MB  sha256:3945e12d6ff3eeacbb5859eec13541e40c704dee90e8a5a0cca65433d2247810
mistralai-Voxtral-Mini-3B-2507-cuda-windows-non-quantized  6.82 GB  sha256:037a95c68689c80409153b0ff5c5cf4c852797e76eaa3e7358423f060873f29b
mistralai-Voxtral-Mini-3B-2507-cuda-windows-quantized-int4-weight-only  6.2 GB  sha256:1c46e2f6e8e57ba85cefbe1f7e6e207d823cb77dc8bd34bb4c96cd4787de4db0
mistralai-Voxtral-Mini-4B-Realtime-2602-cuda-windows-quantized-int4-tile-packed  2.72 GB  sha256:43034c405f56b9f9c356c67c10cc6ccb95ef201b737fcd3ee8673cff40f2652f
nvidia-parakeet-tdt-cuda-windows-non-quantized  954 MB  sha256:bd6cd60fad29b7822d85c933ec5502f3b1e101744f9bfdeaa3ab39368452db2a
nvidia-parakeet-tdt-cuda-windows-quantized-int4-weight-only  436 MB  sha256:3d0566b13d02b40a72d71a8d665dc413d9870373283cb940d11d601d52213367
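
The digests listed for the artifacts can be checked locally after download. The following is a minimal sketch, not part of the workflow itself. It assumes the digest covers the downloaded artifact archive as a single file; the helper names `sha256_digest` and `matches_digest` and the local file name in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return a file's digest in the 'sha256:<hex>' form shown in the listing."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so multi-gigabyte artifacts
        # do not have to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return "sha256:" + hasher.hexdigest()


def matches_digest(path: Path, expected: str) -> bool:
    """Compare a file's digest with an expected 'sha256:<hex>' string."""
    # Normalise whitespace and case so a copy-pasted digest still compares equal.
    return sha256_digest(path) == expected.strip().lower()
```

For example, `matches_digest(Path("parakeet-artifact.zip"), "sha256:3d05...")` (with the full digest pasted in, and the file name replaced by wherever the archive was saved) would return `True` only if the local file is byte-identical to what the digest describes. If the check fails on an intact download, the assumption about what the digest covers is the first thing to revisit.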