xe: ocl: add inline load #2661

Merged
merged 4 commits into from
Feb 24, 2025

Conversation

rjoursler
Contributor

@rjoursler rjoursler commented Feb 11, 2025

Adds a new load function overload to remove the need for variable pre-declaration before loading. In addition, refactors a few reference kernels (ref_bnorm, ref_eltwise, ref_lrn) to use ocl_io.h to demonstrate the utility.
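To illustrate what removing pre-declaration could look like, here is a minimal sketch in plain C rather than OpenCL C. The names `load_to`, `load_inline`, `bf16_t`, and the two ReLU helpers are hypothetical and chosen for illustration only; they are not the actual `ocl_io.h` interface, which relies on OpenCL overloading of a single `load` name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-bit storage type standing in for a kernel data type. */
typedef uint16_t bf16_t;

/* Widen a bf16 bit pattern to float; memcpy avoids pointer-based type punning. */
static float bf16_to_float(bf16_t v) {
    uint32_t bits = (uint32_t)v << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Out-parameter style (assumed shape): the caller must declare the
 * destination variable before the load. */
static void load_to(float *dst, const bf16_t *src) {
    *dst = bf16_to_float(*src);
}

/* Inline style (assumed shape): the loaded value is returned, so it can
 * be used directly in an expression or an initializer. */
static float load_inline(const bf16_t *src) {
    return bf16_to_float(*src);
}

/* ReLU written with the out-parameter load: two statements to get the value. */
static float relu_predeclared(const bf16_t *src) {
    float x;
    load_to(&x, src);
    return x > 0.0f ? x : 0.0f;
}

/* The same ReLU with the inline load: declaration and load in one statement. */
static float relu_inline(const bf16_t *src) {
    const float x = load_inline(src);
    return x > 0.0f ? x : 0.0f;
}
```

Both helpers compute the same result; the inline form just lets the loaded value be declared `const` and initialized in one statement.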

@rjoursler rjoursler requested a review from a team as a code owner February 11, 2025 16:24
@github-actions github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Feb 11, 2025
@rjoursler rjoursler force-pushed the rjoursle/inline_load branch 4 times, most recently from 7ca5242 to 1768064 Compare February 11, 2025 20:22
Contributor

@Simonsays095 Simonsays095 left a comment

Nice, I didn't think we were this close to being able to get rid of type punning in our ocl kernels. What are the remaining roadblocks for the other kernels? I know there are issues with vector operations (not yet supported on the custom types), and I think the dispatcher has some constraints?

@rjoursler
Contributor Author

> What are the remaining roadblocks for the other kernels? I know there are issues with vector operations (not yet supported on the custom types), and I think the dispatcher has some constraints?

I messed around with that some as well; check out the branch rjoursle/block_load to see the results. To my knowledge, once we add a block_load() function to ocl_io.h, it is mostly a matter of refactoring kernels to use that interface.

@rjoursler rjoursler mentioned this pull request Feb 14, 2025
@rjoursler rjoursler force-pushed the rjoursle/inline_load branch from 1768064 to 55b11a2 Compare February 24, 2025 17:34
@rjoursler rjoursler force-pushed the rjoursle/inline_load branch from 55b11a2 to 4719b1b Compare February 24, 2025 17:39
@rjoursler
Contributor Author

make test
disable test_device_cpu
enable test_device_gpu

@rjoursler rjoursler merged commit 983f734 into main Feb 24, 2025
7 of 9 checks passed
@rjoursler rjoursler deleted the rjoursle/inline_load branch February 24, 2025 20:23
Labels
platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel
4 participants