qwen3_5_moe: add OpenAI serving entrypoint #14116

Triggered via pull request June 18, 2026 15:18
Status Success
Total duration 1h 21m 30s
Artifacts 17

cuda.yml

on: pull_request
Get changed files / get-changed-files
4s
CI run decision / decide
32s
Matrix: export-model-cuda-artifact
Matrix: test-cuda-builds
test-models-cuda / linux-job
47m 19s
unittest-cuda / linux-job
53m 10s
Matrix: test-cuda-pybind
Matrix: test-model-cuda-e2e
check-all-cuda-builds
4s

Annotations

128 errors and 40 warnings
export-model-cuda-artifact (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
Process completed with exit code 1.
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
Process completed with exit code 1.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (Qwen, Qwen3-0.6B, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
Process completed with exit code 1.
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
Process completed with exit code 1.
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
CI run decision / decide
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@v4. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-executorch-cuda-build-13.0 / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (Qwen, Qwen3-0.6B, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-executorch-cuda-build-12.6 / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-models-cuda / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
unittest-cuda / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-cuda-pybind (qwen3-0.6b, Qwen-Qwen3-0.6B-cuda-non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-cuda-pybind (qwen3-0.6b, --quantize, Qwen-Qwen3-0.6B-cuda-quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-cuda-pybind (gemma3-4b, --quantize, google-gemma-3-4b-it-cuda-quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

Artifacts

Produced during runtime
Name | Size | Digest
Qwen-Qwen3-0.6B-cuda-non-quantized | 1.1 GB | sha256:49613c32b06ac2e6b5068de26937fd443fc9693137b8ac2e33a2d36ac5594c90
Qwen-Qwen3-0.6B-cuda-quantized-int4-tile-packed | 559 MB | sha256:039876f131b6757b4f63ce93f129bf9e7bfbc2fdfd1acde15122deef5048a63d
SocialLocalMobile-Qwen3.5-35B-A3B-HQQ-INT4-cuda-quantized-int4-tile-packed | 14.6 GB | sha256:dafd16b022a30cd19bea7a5fbc21bcc933fcf52f8727ccab6676431ada54950d
facebook-dinov2-small-imagenet1k-1-layer-cuda-non-quantized | 34.7 MB | sha256:bd4cdf88c853b573821b3958947069cf0007b7f62bae59322af1187a7cb091ea
google-gemma-3-4b-it-cuda-non-quantized | 7.22 GB | sha256:873c822270b8a2269887730a7f601b36b29ec886ccf425ac05a0b0343881ce29
google-gemma-3-4b-it-cuda-quantized-int4-tile-packed | 3.4 GB | sha256:3fd6c338a98ca0385883d3f435c11f65cb24406f0ed540b196ab4374ad8f0426
mistralai-Voxtral-Mini-3B-2507-cuda-non-quantized | 6.82 GB | sha256:58f17e8a0b69285ed522867a5f734f98a2ce98abc92dd3d2ae8390443f7c60ae
mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-tile-packed | 2.88 GB | sha256:985b941d65c5d368e68c2cc44505827056135987164d0b6f4e8b9472c79c39d3
mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-weight-only | 6.2 GB | sha256:24d1e7af52d5c28dcec263fc972cdc8433367f26821acf3d45a4583140366a4f
mistralai-Voxtral-Mini-4B-Realtime-2602-cuda-quantized-int4-tile-packed | 2.71 GB | sha256:ce1f74d0eea46b5c1a78766e384c73048ffccd39cd2e2c208a89697fad8297c9
nvidia-diar_streaming_sortformer_4spk-v2-cuda-non-quantized | 436 MB | sha256:61e1a23a3c73abf3f26d9f71764cbed81431b0cc796d5d7764597570ad039e41
nvidia-parakeet-tdt-cuda-non-quantized | 952 MB | sha256:81642107be6b08e39e142aed48b118e2d997f285a4e44df21bf223f03212b348
nvidia-parakeet-tdt-cuda-quantized-int4-tile-packed | 443 MB | sha256:6a485dcd18efef6d7bb24e29b8979ffe5591da7510866fd411eb6c77e7b0d995
openai-whisper-large-v3-turbo-cuda-quantized-int4-tile-packed | 491 MB | sha256:95852a91ed5d8fd4e9c0e2366a446066401e9987e8936cfe45cf695267d04c82
openai-whisper-large-v3-turbo-cuda-quantized-int4-weight-only | 570 MB | sha256:66a106ef5f8f5e4c1001f4df7a7b256112a986f343c4a447bcab777deb987293
openai-whisper-small-cuda-non-quantized | 362 MB | sha256:4fc580b241ce2111e1c2f3245ea0209904a5fccb9f7bebf6597b07f467cea77a
unsloth-gemma-4-31B-it-GGUF-cuda-quantized-int4-tile-packed | 18.8 GB | sha256:01b589f1e74454bbc97e5506f0b5568ad274e6820b9de48546b86d72c1b4c841