[ET-VK][quantization] Implement layout-flexible quantize/dequantize operators #8128


Triggered via pull request February 5, 2026 23:28
Status Failure
Total duration 1h 20m 58s
Artifacts 14

cuda.yml

on: pull_request
Matrix: export-model-cuda-artifact
Matrix: test-cuda-builds
unittest-cuda / linux-job
24m 46s
Matrix: test-models-cuda
Matrix: test-model-cuda-e2e
check-all-cuda-builds
2s

Annotations

3 errors
test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Process completed with exit code 1.

Artifacts

Produced during runtime
Name  Size  Digest
google-gemma-3-4b-it-cuda-non-quantized  7.22 GB  sha256:8a1628a9c75882c9f4beff9b3da8694c21ec83a0e458948c7a86f3541129fc75
google-gemma-3-4b-it-cuda-quantized-int4-tile-packed  3.36 GB  sha256:af7d46f664713b2625275a1c4ba3960a3c85451921ffd72456e9ea1badf20bb6
mistralai-Voxtral-Mini-3B-2507-cuda-non-quantized  6.82 GB  sha256:e2ceed7c3eafd50d490beef5d675012ff7b8acd0e27aa02b294a2bfe8c2c6df2
mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-tile-packed  2.8 GB  sha256:11f3701b510bc7561db79280a7420ba28b451afd4f2a8dbed6a001539b49ce92
mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-weight-only  6.14 GB  sha256:656895f4590a6dda336b3bb268d9f5e202f86876d817b532f525cff4bf1a1bec
nvidia-parakeet-tdt-cuda-non-quantized  952 MB  sha256:8c8c02e1c0df5d8949100054b90eb055af16c4302d814ca73919f3b1c32cb4ed
nvidia-parakeet-tdt-cuda-quantized-int4-tile-packed  443 MB  sha256:305d74bbcc5a4b6afca66a1f3e4b82b104d3390c5118ca255b834b5c66cd885b
nvidia-parakeet-tdt-cuda-quantized-int4-weight-only  430 MB  sha256:3ec5bfae7234e9946ae696cd9e8768a3b76da57dc6e358cadbdd290d9e3a7768
openai-whisper-large-v3-turbo-cuda-non-quantized  1.18 GB  sha256:ad70865ade76260072dc09f5dbe9f6d964e8ed6eec11c806bffee473a072b543
openai-whisper-large-v3-turbo-cuda-quantized-int4-tile-packed  491 MB  sha256:6ce1a92139c340e49694bfee146f3a18cfab3d462074be496899fc4e6f52c923
openai-whisper-large-v3-turbo-cuda-quantized-int4-weight-only  485 MB  sha256:9e1aadaaf8352ccbfb2865f76f129d643121815cf5f5cb4200f42fa2bd700603
openai-whisper-small-cuda-non-quantized  361 MB  sha256:4e0a824cb19951ef7f2cccdc3908f1fa45ae58972c8f42846b6edafc351b415e
openai-whisper-small-cuda-quantized-int4-tile-packed  172 MB  sha256:712df2df9806c0e6f70fbc59e9e628455de871c3bcb468ada1809e10224aa63f
openai-whisper-small-cuda-quantized-int4-weight-only  270 MB  sha256:6e6e24f3a2ce291a463ec8f0ea8da3757e937c69d0f9c7e078c8bca4fe77be5e