[ET-VK][quantization] Implement layout-flexible quantize/dequantize operators #8128
Triggered via pull request
February 5, 2026 23:28
Status: Failure
Total duration: 1h 20m 58s
Artifacts: 14
cuda.yml
on: pull_request
Matrix: export-model-cuda-artifact
Matrix: test-cuda-builds
unittest-cuda / linux-job: 24m 46s
Matrix: test-models-cuda
Matrix: test-model-cuda-e2e
check-all-cuda-builds: 2s
Annotations
3 errors
test-model-cuda-e2e (openai, whisper-small, quantized-int4-weight-only) / linux-job
Process completed with exit code 1.

test-model-cuda-e2e (openai, whisper-large-v3-turbo, quantized-int4-tile-packed) / linux-job
Process completed with exit code 1.

test-model-cuda-e2e (openai, whisper-small, non-quantized) / linux-job
Process completed with exit code 1.

Artifacts
Produced during runtime
| Name | Size | Digest |
|---|---|---|
| google-gemma-3-4b-it-cuda-non-quantized | 7.22 GB | sha256:8a1628a9c75882c9f4beff9b3da8694c21ec83a0e458948c7a86f3541129fc75 |
| google-gemma-3-4b-it-cuda-quantized-int4-tile-packed | 3.36 GB | sha256:af7d46f664713b2625275a1c4ba3960a3c85451921ffd72456e9ea1badf20bb6 |
| mistralai-Voxtral-Mini-3B-2507-cuda-non-quantized | 6.82 GB | sha256:e2ceed7c3eafd50d490beef5d675012ff7b8acd0e27aa02b294a2bfe8c2c6df2 |
| mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-tile-packed | 2.8 GB | sha256:11f3701b510bc7561db79280a7420ba28b451afd4f2a8dbed6a001539b49ce92 |
| mistralai-Voxtral-Mini-3B-2507-cuda-quantized-int4-weight-only | 6.14 GB | sha256:656895f4590a6dda336b3bb268d9f5e202f86876d817b532f525cff4bf1a1bec |
| nvidia-parakeet-tdt-cuda-non-quantized | 952 MB | sha256:8c8c02e1c0df5d8949100054b90eb055af16c4302d814ca73919f3b1c32cb4ed |
| nvidia-parakeet-tdt-cuda-quantized-int4-tile-packed | 443 MB | sha256:305d74bbcc5a4b6afca66a1f3e4b82b104d3390c5118ca255b834b5c66cd885b |
| nvidia-parakeet-tdt-cuda-quantized-int4-weight-only | 430 MB | sha256:3ec5bfae7234e9946ae696cd9e8768a3b76da57dc6e358cadbdd290d9e3a7768 |
| openai-whisper-large-v3-turbo-cuda-non-quantized | 1.18 GB | sha256:ad70865ade76260072dc09f5dbe9f6d964e8ed6eec11c806bffee473a072b543 |
| openai-whisper-large-v3-turbo-cuda-quantized-int4-tile-packed | 491 MB | sha256:6ce1a92139c340e49694bfee146f3a18cfab3d462074be496899fc4e6f52c923 |
| openai-whisper-large-v3-turbo-cuda-quantized-int4-weight-only | 485 MB | sha256:9e1aadaaf8352ccbfb2865f76f129d643121815cf5f5cb4200f42fa2bd700603 |
| openai-whisper-small-cuda-non-quantized | 361 MB | sha256:4e0a824cb19951ef7f2cccdc3908f1fa45ae58972c8f42846b6edafc351b415e |
| openai-whisper-small-cuda-quantized-int4-tile-packed | 172 MB | sha256:712df2df9806c0e6f70fbc59e9e628455de871c3bcb468ada1809e10224aa63f |
| openai-whisper-small-cuda-quantized-int4-weight-only | 270 MB | sha256:6e6e24f3a2ce291a463ec8f0ea8da3757e937c69d0f9c7e078c8bca4fe77be5e |