x64: conv: avoid overflows and add limit for huge spatial sizes (backport) #2628

tczeszun · 2025-02-07T21:04:23Z

Backport of #2627.

(base) gta@DUT7314PVC:~/tczeszun$ LD_LIBRARY_PATH=. DNNL_VERBOSE=2 ./benchdnn  --conv --engine=gpu --dir=BWD_W --dt=bf16:bf16:bf16 --stag=axb --wtag=xba --dtag=axb --impl=jit -v5 mb1ic32iw134217732oc1ow134217730kw3pw0
create: --conv --engine=gpu --dir=BWD_W --dt=bf16:bf16:bf16 --stag=axb --wtag=xba --dtag=axb --impl=jit mb1ic32iw134217732oc1ow134217730kw3pw0
onednn_verbose,v1,info,oneDNN v3.7.0 (commit 10df79807cf134982920bbbdb02a0dab666d4ed3)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:208
onednn_verbose,v1,info,cpu,isa:Intel AVX-512 with float16, Intel DL Boost and bfloat16 support
onednn_verbose,v1,info,gpu,runtime:OpenCL
onednn_verbose,v1,info,gpu,engine,opencl device count:2
onednn_verbose,v1,info,gpu,engine,0,name:Intel(R) Data Center GPU Max 1550,driver_version:24.26.30049,binary_kernels:enabled
onednn_verbose,v1,info,gpu,engine,1,name:Intel(R) Data Center GPU Max 1550,driver_version:24.26.30049,binary_kernels:enabled
onednn_verbose,v1,info,graph,backend,0:dnnl_backend
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,implementation,backend,exec_time
onednn_verbose,v1,primitive,create:cache_miss,gpu:0,convolution,jit:ir,backward_weights,src:bf16::blocked:acb::f0 wei:bf16::blocked:cba::f0 bia:undef::undef::: dst:bf16::blocked:acb::f0,,alg:convolution_direct,mb1_ic32oc1_iw134217732ow134217730kw3sw1dw0pw0,248.712
oneDNN implementation: jit:ir
[IMPL_FILTER] All implementations were skipped!
[IMPL_FILTER] All implementations were skipped!
onednn_verbose,v1,primitive,create:persistent_cache_hit,gpu:0,convolution,jit:ir,backward_weights,src:bf16::blocked:acb::f0 wei:bf16::blocked:cba::f0 bia:undef::undef::: dst:bf16::blocked:acb::f0,,alg:convolution_direct,mb1_ic32oc1_iw134217732ow134217730kw3sw1dw0pw0,24.6731
run: --conv --engine=gpu --dir=BWD_W --dt=bf16:bf16:bf16 --stag=axb --wtag=xba --dtag=axb --impl=jit mb1ic32iw134217732oc1ow134217730kw3pw0
onednn_verbose,v1,primitive,create:cache_miss,cpu,reorder,jit:uni,undef,src:f32::blocked:abc::f0 dst:bf16::blocked:acb::f0,,,1x1x134217730,0.334961
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:f32::blocked:abc::f0 dst:bf16::blocked:acb::f0,,,1x1x134217730,40.6531
onednn_verbose,v1,primitive,create:cache_miss,cpu,reorder,simple:any,undef,src:f32::blocked:abc::f0 dst:bf16::blocked:acb::f0,,,1x32x134217732,0.106934
onednn_verbose,v1,primitive,exec,cpu,reorder,simple:any,undef,src:f32::blocked:abc::f0 dst:bf16::blocked:acb::f0,,,1x32x134217732,16760.3
onednn_verbose,v1,primitive,exec,gpu:0,convolution,jit:ir,backward_weights,src:bf16::blocked:acb::f0 wei:bf16::blocked:cba::f0 bia:undef::undef::: dst:bf16::blocked:acb::f0,,alg:convolution_direct,mb1_ic32oc1_iw134217732ow134217730kw3sw1dw0pw0,37.7659
onednn_verbose,v1,primitive,create:cache_miss,cpu,reorder,jit:uni,undef,src:bf16::blocked:cba::f0 dst:f32::blocked:abc::f0,,,1x32x3,0.313965
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:bf16::blocked:cba::f0 dst:f32::blocked:abc::f0,,,1x32x3,0.019043
0:PASSED __REPRO: --conv --engine=gpu --dir=BWD_W --dt=bf16:bf16:bf16 --stag=axb --wtag=xba --dtag=axb --impl=jit mb1ic32iw134217732oc1ow134217730kw3pw0
tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 128.80s; fill: 36.24s (28%); compute_ref: 86.90s (67%); compare: 0.01s (0%);
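To make the relevance of this problem size concrete, here is a minimal sketch (not the code changed by this PR) of the arithmetic behind the descriptor `mb1ic32iw134217732oc1ow134217730kw3pw0`. With 32 input channels and a width of 134217732, the source tensor holds more elements than a 32-bit integer can represent, so any offset or size computed in `int` for such a shape would wrap. All helper names below (`output_width`, `element_count`, `fits_in_int32`, `narrow_to_int32`) are hypothetical and exist only for illustration; they are not oneDNN APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: output width of a 1D convolution with stride 1 and
// no dilation (sw1, dw0 in the verbose line), with symmetric padding pw.
constexpr std::int64_t output_width(
        std::int64_t iw, std::int64_t kw, std::int64_t pw) {
    return iw + 2 * pw - kw + 1;
}

// Hypothetical helper: element count of an (mb, c, w) tensor, computed in
// 64 bits so the product itself cannot wrap for these sizes.
constexpr std::int64_t element_count(
        std::int64_t mb, std::int64_t c, std::int64_t w) {
    return mb * c * w;
}

// Hypothetical helper: true if v is representable as a signed 32-bit int.
constexpr bool fits_in_int32(std::int64_t v) {
    return v >= std::numeric_limits<std::int32_t>::min()
            && v <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: narrow to 32 bits only when that is lossless.
inline std::int32_t narrow_to_int32(std::int64_t v) {
    assert(fits_in_int32(v) && "value does not fit in 32 bits");
    return static_cast<std::int32_t>(v);
}

// Values taken from the problem descriptor in the log above.
constexpr std::int64_t k_mb = 1, k_ic = 32, k_oc = 1;
constexpr std::int64_t k_iw = 134217732, k_kw = 3, k_pw = 0;

constexpr std::int64_t k_ow = output_width(k_iw, k_kw, k_pw);
constexpr std::int64_t k_src_elems = element_count(k_mb, k_ic, k_iw);
constexpr std::int64_t k_dst_elems = element_count(k_mb, k_oc, k_ow);

// The descriptor is self-consistent: ow = iw - kw + 1.
static_assert(k_ow == 134217730, "ow matches the descriptor");
// 32 * 134217732 = 4294967424, which is 2^32 + 128.
static_assert(k_src_elems == 4294967424LL, "source element count");
static_assert(!fits_in_int32(k_src_elems), "src count overflows int32");
// The destination tensor alone still fits in 32 bits.
static_assert(fits_in_int32(k_dst_elems), "dst count fits in int32");
```

The `static_assert` lines double as the usage example: they compile only if the source element count exceeds the 32-bit range while the destination count does not. In bf16 (2 bytes per element) the source tensor is 8589934848 bytes, so byte offsets are even further out of 32-bit range.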

densamoilov · 2025-02-08T01:43:30Z

I ran the CI and nightly test sets for this PR; the failures in conv look unrelated to this particular issue.

x64: conv: avoid overflows and add limit for huge spatial sizes

10df798

tczeszun requested a review from a team as a code owner February 7, 2025 21:04

github-actions bot added the platform:cpu-x64 (Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64) and backport labels Feb 7, 2025

densamoilov approved these changes Feb 7, 2025

xuxinzen approved these changes Feb 7, 2025

densamoilov merged commit a68b8ae into rls-v3.7 Feb 8, 2025
12 checks passed

densamoilov deleted the tczeszun/fix_conv_int_overflows_rls-v3.7 branch February 8, 2025 01:43
