x64: conv: avoid overflows and add limit for huge spatial sizes (backport) #2628

Merged
densamoilov merged 1 commit into rls-v3.7 from tczeszun/fix_conv_int_overflows_rls-v3.7
Feb 8, 2025

Conversation

tczeszun
Contributor

@tczeszun tczeszun commented Feb 7, 2025

backport of #2627

(base) gta@DUT7314PVC:~/tczeszun$ LD_LIBRARY_PATH=. DNNL_VERBOSE=2 ./benchdnn  --conv --engine=gpu --dir=BWD_W --dt=bf16:bf16:bf16 --stag=axb --wtag=xba --dtag=axb --impl=jit -v5 mb1ic32iw134217732oc1ow134217730kw3pw0
create: --conv --engine=gpu --dir=BWD_W --dt=bf16:bf16:bf16 --stag=axb --wtag=xba --dtag=axb --impl=jit mb1ic32iw134217732oc1ow134217730kw3pw0
onednn_verbose,v1,info,oneDNN v3.7.0 (commit 10df79807cf134982920bbbdb02a0dab666d4ed3)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:208
onednn_verbose,v1,info,cpu,isa:Intel AVX-512 with float16, Intel DL Boost and bfloat16 support
onednn_verbose,v1,info,gpu,runtime:OpenCL
onednn_verbose,v1,info,gpu,engine,opencl device count:2
onednn_verbose,v1,info,gpu,engine,0,name:Intel(R) Data Center GPU Max 1550,driver_version:24.26.30049,binary_kernels:enabled
onednn_verbose,v1,info,gpu,engine,1,name:Intel(R) Data Center GPU Max 1550,driver_version:24.26.30049,binary_kernels:enabled
onednn_verbose,v1,info,graph,backend,0:dnnl_backend
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,implementation,backend,exec_time
onednn_verbose,v1,primitive,create:cache_miss,gpu:0,convolution,jit:ir,backward_weights,src:bf16::blocked:acb::f0 wei:bf16::blocked:cba::f0 bia:undef::undef::: dst:bf16::blocked:acb::f0,,alg:convolution_direct,mb1_ic32oc1_iw134217732ow134217730kw3sw1dw0pw0,248.712
oneDNN implementation: jit:ir
[IMPL_FILTER] All implementations were skipped!
[IMPL_FILTER] All implementations were skipped!
onednn_verbose,v1,primitive,create:persistent_cache_hit,gpu:0,convolution,jit:ir,backward_weights,src:bf16::blocked:acb::f0 wei:bf16::blocked:cba::f0 bia:undef::undef::: dst:bf16::blocked:acb::f0,,alg:convolution_direct,mb1_ic32oc1_iw134217732ow134217730kw3sw1dw0pw0,24.6731
run: --conv --engine=gpu --dir=BWD_W --dt=bf16:bf16:bf16 --stag=axb --wtag=xba --dtag=axb --impl=jit mb1ic32iw134217732oc1ow134217730kw3pw0
onednn_verbose,v1,primitive,create:cache_miss,cpu,reorder,jit:uni,undef,src:f32::blocked:abc::f0 dst:bf16::blocked:acb::f0,,,1x1x134217730,0.334961
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:f32::blocked:abc::f0 dst:bf16::blocked:acb::f0,,,1x1x134217730,40.6531
onednn_verbose,v1,primitive,create:cache_miss,cpu,reorder,simple:any,undef,src:f32::blocked:abc::f0 dst:bf16::blocked:acb::f0,,,1x32x134217732,0.106934
onednn_verbose,v1,primitive,exec,cpu,reorder,simple:any,undef,src:f32::blocked:abc::f0 dst:bf16::blocked:acb::f0,,,1x32x134217732,16760.3
onednn_verbose,v1,primitive,exec,gpu:0,convolution,jit:ir,backward_weights,src:bf16::blocked:acb::f0 wei:bf16::blocked:cba::f0 bia:undef::undef::: dst:bf16::blocked:acb::f0,,alg:convolution_direct,mb1_ic32oc1_iw134217732ow134217730kw3sw1dw0pw0,37.7659
onednn_verbose,v1,primitive,create:cache_miss,cpu,reorder,jit:uni,undef,src:bf16::blocked:cba::f0 dst:f32::blocked:abc::f0,,,1x32x3,0.313965
onednn_verbose,v1,primitive,exec,cpu,reorder,jit:uni,undef,src:bf16::blocked:cba::f0 dst:f32::blocked:abc::f0,,,1x32x3,0.019043
0:PASSED __REPRO: --conv --engine=gpu --dir=BWD_W --dt=bf16:bf16:bf16 --stag=axb --wtag=xba --dtag=axb --impl=jit mb1ic32iw134217732oc1ow134217730kw3pw0
tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 128.80s; fill: 36.24s (28%); compute_ref: 86.90s (67%); compare: 0.01s (0%);
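
For context, the repro above uses a source tensor whose element count lands just past 2^32, so any 32-bit element count or offset would wrap. The sketch below is only an illustration of that arithmetic: the PR does not say which computations overflowed, and `parse_dims` and `wrap_int32` are hypothetical helpers, not oneDNN or benchdnn APIs. The parser handles only the simple `name<number>` pairs that appear in this problem string.

```python
import re

INT32_MAX = 2**31 - 1


def parse_dims(desc):
    """Split a problem string like 'mb1ic32iw10' into {'mb': 1, 'ic': 32, 'iw': 10}."""
    return {name: int(value) for name, value in re.findall(r"([a-z]+)(\d+)", desc)}


def wrap_int32(value):
    """Reinterpret an integer as a two's-complement 32-bit signed value."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


dims = parse_dims("mb1ic32iw134217732oc1ow134217730kw3pw0")

# Element counts of the three tensors; they match the reorder shapes in the log
# (1x32x134217732, 1x1x134217730, 1x32x3).
src_elems = dims["mb"] * dims["ic"] * dims["iw"]
dst_elems = dims["mb"] * dims["oc"] * dims["ow"]
wei_elems = dims["oc"] * dims["ic"] * dims["kw"]

# bf16 stores each element in 2 bytes.
src_bytes = src_elems * 2

print("src elements:", src_elems, "fits in int32:", src_elems <= INT32_MAX)
print("src elements as wrapped int32:", wrap_int32(src_elems))
print("dst elements:", dst_elems, "fits in int32:", dst_elems <= INT32_MAX)
print("wei elements:", wei_elems)
print("src bytes (bf16):", src_bytes)
```

Only the source tensor exceeds the 32-bit range here; the destination and weights tensors stay well within it, which is why a shape like this is a useful probe for offset arithmetic on the source side.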

@tczeszun tczeszun requested a review from a team as a code owner February 7, 2025 21:04
@github-actions github-actions bot added the platform:cpu-x64 (Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64) and backport labels Feb 7, 2025
@densamoilov
Contributor

I ran the CI and nightly test sets for this PR; the failures in conv look unrelated to this particular issue.

@densamoilov densamoilov merged commit a68b8ae into rls-v3.7 Feb 8, 2025
12 checks passed
@densamoilov densamoilov deleted the tczeszun/fix_conv_int_overflows_rls-v3.7 branch February 8, 2025 01:43