xe: jit: gemm: remove unnecessary register duplication #2625

Merged
rjoursler merged 2 commits into main from rjoursle/fix_out_of_registers
Feb 7, 2025

@rjoursler rjoursler commented Feb 7, 2025

Fixes an out-of-registers failure on XeLP systems for the following workload:

benchdnn --matmul --engine=gpu --dt=bf16:s4:bf16 --wtag=acb --attr-scales=wei:per_ocic:bf16:32x1 --attr-zero-points=wei:per_ocic:s4:32x1 --attr-fpmath=bf16:true 3x96x96:3x96x64

On this workload, a duplicated scalar is the only subregister allocation present in r1. Removing this duplication thereby frees a register and avoids kernel generation failure.
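To illustrate why a single stray scalar can matter, here is a minimal sketch of a toy register file. It is not oneDNN's or nGEN's actual allocator: the class `ToyRegisterFile`, the helper `build_kernel`, the 8-slot register size, and the 2-register budget are all hypothetical, chosen only to show how one subregister allocation can keep an otherwise empty register from being handed out whole.

```python
# Toy model of a GPU register file; NOT oneDNN's or nGEN's real allocator.
# Assumption: each register holds 8 subregister slots.
SLOTS_PER_REG = 8


class ToyRegisterFile:
    def __init__(self, num_regs):
        # used[r] is the set of occupied subregister slots in register r
        self.used = [set() for _ in range(num_regs)]

    def alloc_sub(self):
        """Claim the first free subregister slot; return (reg, slot)."""
        for reg, slots in enumerate(self.used):
            for slot in range(SLOTS_PER_REG):
                if slot not in slots:
                    slots.add(slot)
                    return reg, slot
        raise RuntimeError("out of registers")

    def alloc_full(self):
        """Claim a register with no occupied slots; return its index."""
        for reg, slots in enumerate(self.used):
            if not slots:
                slots.update(range(SLOTS_PER_REG))
                return reg
        raise RuntimeError("out of registers")


def build_kernel(duplicate_scalar):
    """Fill r0 with scalars, optionally duplicate one, then request a full register."""
    rf = ToyRegisterFile(num_regs=2)
    for _ in range(SLOTS_PER_REG):
        rf.alloc_sub()      # r0 is now completely occupied
    if duplicate_scalar:
        rf.alloc_sub()      # the copy lands in r1 as its only occupant
    return rf.alloc_full()  # needs a completely free register
```

In this toy setup, `build_kernel(False)` gets r1 as a full register, while `build_kernel(True)` raises because the duplicated scalar is the sole occupant of r1.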

Fixes MFDNN-13167.

@rjoursler rjoursler requested a review from a team as a code owner February 7, 2025 14:53
@github-actions github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Feb 7, 2025
@rjoursler rjoursler force-pushed the rjoursle/fix_out_of_registers branch from cf7072f to c62ec58 Compare February 7, 2025 15:00
@rjoursler
Contributor Author

make test
disable test_device_cpu
enable test_device_gpu

@rjoursler rjoursler merged commit ba83c47 into main Feb 7, 2025
8 of 10 checks passed
@rjoursler rjoursler deleted the rjoursle/fix_out_of_registers branch February 7, 2025 20:11
Labels
platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel
3 participants