[MLX] Reduce physical footprint memory in RingBufferKVCache for chunk… #14105

Triggered via push: June 17, 2026 23:42
Status: Success
Total duration: 1h 20m 57s
Artifacts: 17

cuda.yml

on: push
Get changed files / get-changed-files: 3s
CI run decision / decide: 32s
Matrix: export-model-cuda-artifact
Matrix: test-cuda-builds
test-models-cuda / linux-job: 46m 21s
unittest-cuda / linux-job: 52m 36s
Matrix: test-cuda-pybind
Matrix: test-model-cuda-e2e
check-all-cuda-builds: 3s

Annotations

132 errors and 40 warnings
export-model-cuda-artifact (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
Process completed with exit code 1.
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (Qwen, Qwen3-0.6B, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
Process completed with exit code 1.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
Process completed with exit code 1.
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
Process completed with exit code 1.
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
export-model-cuda-artifact (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
export-model-cuda-artifact (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
export-model-cuda-artifact (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
No names found, cannot describe anything.
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Process completed with exit code 1.
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
test-model-cuda-e2e (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Executing the custom container implementation failed. Please contact your self hosted runner administrator.
test-model-cuda-e2e (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
[OSDC] Step script exited with code 1. This is a script/workflow error, not an infrastructure issue. Check the step logs above for the actual failure.
test-model-cuda-e2e (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Not authorized to perform sts:AssumeRoleWithWebIdentity
CI run decision / decide
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@v4. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (Qwen, Qwen3-0.6B, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (Qwen, Qwen3-0.6B, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-executorch-cuda-build-13.0 / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-executorch-cuda-build-12.6 / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (nvidia, parakeet-tdt, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (google, gemma-3-4b-it, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-models-cuda / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
export-model-cuda-artifact (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
unittest-cuda / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (nvidia, parakeet-tdt, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (nvidia, diar_streaming_sortformer_4spk-v2, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (facebook, dinov2-small-imagenet1k-1-layer, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (nvidia, parakeet-tdt, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (google, gemma-3-4b-it, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (mistralai, Voxtral-Mini-3B-2507, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (unsloth, gemma-4-31B-it-GGUF, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (SocialLocalMobile, Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-cuda-pybind (qwen3-0.6b, Qwen-Qwen3-0.6B-cuda-non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-cuda-pybind (qwen3-0.6b, --quantize, Qwen-Qwen3-0.6B-cuda-quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-cuda-pybind (gemma3-4b, --quantize, google-gemma-3-4b-it-cuda-quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: ./test-infra/.github/actions/setup-ssh, actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-weight-only) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
test-model-cuda-e2e (mistralai, Voxtral-Mini-4B-Realtime-2602, quantized-int4-tile-packed) / linux-job
Node.js 20 is deprecated. The following actions target Node.js 20 but are being forced to run on Node.js 24: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683, actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093, aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838, pmeier/pytest-results-action@a2c1430e2bddadbad9f49a6f9b879f062c6b19b1. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

Artifacts

Produced during runtime
Name | Size | Digest
Qwen-Qwen3-0.6B-cuda-non-quantized | 1.1 GB | sha256:c09a4da0581d09fc4da45055ffc53125ec928951f56b657b78128bf556b5221b
Qwen-Qwen3-0.6B-cuda-quantized-int4-tile-packed | 559 MB | sha256:2d40d133ece349bcab56989b5dac982767e485a2a0e0d4dfec98ee24b99f557d
SocialLocalMobile-Qwen3.5-35B-A3B-HQQ-INT4-cuda-quantized-int4-tile-packed | 14.6 GB | sha256:c4f3ab1e59e2d7d0df17b07d3b64f6a7aa8f34fbc8404c8a0a62f9e0fc273ffe
facebook-dinov2-small-imagenet1k-1-layer-cuda-non-quantized | 34.6 MB | sha256:6caa14e46e0aad0cb79f8b12865715380a1a26f3c7a1ab692d24ad343562b199
google-gemma-3-4b-it-cuda-non-quantized | 7.22 GB | sha256:13d730d530091eaaa7cf9b23a61aa42209c516407c03fa0c0c26c22d5bc62337
google-gemma-3-4b-it-cuda-quantized-int4-tile-packed | 3.4 GB | sha256:e38b13f45ff16ed9fe0b40468ef005ef42b683f35635e531644b9226d8322591
mistralai-Voxtral-Mini-3B-2507-cuda-non-quantized | 6.82 GB | sha256:8773a34285c236adb3013e2e47570b704e3a082418b904f7358f46e962e94c07
mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-tile-packed | 2.88 GB | sha256:fb934964ab2c092e0b140c18c3d76a1d3cd68f240d50fd5ff5dfd5a714698c79
mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-weight-only | 6.2 GB | sha256:dde5394a0d7932c877a766828a7290485b040b692e68a8d73bbfadbfbf1ad164
mistralai-Voxtral-Mini-4B-Realtime-2602-cuda-quantized-int4-tile-packed | 2.71 GB | sha256:a851396a34423d3be31625d8e6adf813b647e99b9c0198c145c2c0ceb81eafd4
nvidia-diar_streaming_sortformer_4spk-v2-cuda-non-quantized | 436 MB | sha256:14d83a5c525564e4d5200f8188c98ce23d8d0be842e1cacfb58a4edc2e4b541f
nvidia-parakeet-tdt-cuda-non-quantized | 952 MB | sha256:4d8424edec905391333fdb59e709bcfa50759e3f19fcc216c71d35ef7e5bb8f0
nvidia-parakeet-tdt-cuda-quantized-int4-tile-packed | 443 MB | sha256:84088292d53aea82c35743f8c246e6da87a364470e27211ba55b58e3ccf2eef6
openai-whisper-large-v3-turbo-cuda-quantized-int4-tile-packed | 491 MB | sha256:4f2c96a6f15027099395ce52a92bbbb6d4dc2e70c585e375a484645432b447a9
openai-whisper-large-v3-turbo-cuda-quantized-int4-weight-only | 570 MB | sha256:989d6a9db6d717104539b0d89b90309ab5615b9209355e785e81fcf274850bbe
openai-whisper-small-cuda-non-quantized | 362 MB | sha256:04a4ca55110f9bac81b28fdd3efeaf76b0a8c771ccc2cb9af13574841bd0a5b2
unsloth-gemma-4-31B-it-GGUF-cuda-quantized-int4-tile-packed | 18.8 GB | sha256:8e036a80e9dae8a16fecdb5b7aceadcd066c6f25178951e2e5bacd7b2bdb4b40
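
A downloaded artifact can be checked against the digests listed above. The following is a minimal sketch, not part of this workflow: the helper names `sha256_digest` and `matches_digest` and the local file path are hypothetical, and it assumes the listed digest is the SHA-256 of the artifact archive as downloaded (a zip file), which should be confirmed before relying on it.

```python
import hashlib


def sha256_digest(path, chunk_size=1 << 20):
    """Return the file's SHA-256 in the 'sha256:<hex>' form used in the listing."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte artifacts are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()


def matches_digest(path, expected):
    """Compare a local file against an expected 'sha256:<hex>' string."""
    return sha256_digest(path) == expected.strip().lower()
```

For example, `matches_digest("openai-whisper-small-cuda-non-quantized.zip", "sha256:04a4ca55110f9bac81b28fdd3efeaf76b0a8c771ccc2cb9af13574841bd0a5b2")` would return `True` only if the local file (a hypothetical path here) hashes to the listed value.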