xe: ocl: add inline load #2661

rjoursler · 2025-02-11T16:24:21Z

Adds a new load function overload to remove the need for variable pre-declaration before loading. In addition, refactors a few reference kernels (ref_bnorm, ref_eltwise, ref_lrn) to use ocl_io.h to demonstrate the utility.
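For readers unfamiliar with the pattern, here is a minimal host-side C++ sketch of what "inline load" means, contrasted with the pre-declaration style. It is an illustration under assumptions, not the code in ocl_io.h: the `bf16_t` struct, the `to_float` helper, the type-tag first argument, and the `relu_*` functions are all hypothetical, and the real kernels are OpenCL C rather than C++.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 16-bit storage type standing in for a custom kernel type.
struct bf16_t {
    std::uint16_t raw;
};

// bf16 holds the upper 16 bits of an IEEE-754 binary32 value.
inline float to_float(bf16_t v) {
    const std::uint32_t bits = static_cast<std::uint32_t>(v.raw) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f)); // bit copy, avoids type punning
    return f;
}

// Out-parameter style: the destination must be declared before the call.
inline void load(float *dst, const bf16_t *src) { *dst = to_float(*src); }
inline void load(float *dst, const float *src) { *dst = *src; }

// Inline style (assumed shape): the first argument is only a type tag that
// selects the destination type; its value is ignored.
inline float load(float /*tag*/, const bf16_t *src) { return to_float(*src); }
inline float load(float /*tag*/, const float *src) { return *src; }

// Before: two statements and a non-const temporary.
inline float relu_predeclared(const bf16_t *src) {
    float x;
    load(&x, src);
    return x > 0.0f ? x : 0.0f;
}

// After: the load is an expression, so the result can be const.
inline float relu_inline(const bf16_t *src) {
    const float x = load(float{}, src);
    return x > 0.0f ? x : 0.0f;
}
```

Both helpers compute the same result; the inline form just removes the separate declaration and lets the loaded value be used directly in an expression.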

Simonsays095

Nice, I didn't think we were this close to being able to get rid of type punning in our ocl kernels. What are the remaining roadblocks for the other kernels? I know there are issues with vector operations (not yet supported on the custom types), and I think the dispatcher has some constraints?

rjoursler · 2025-02-13T21:31:51Z

> What are the remaining roadblocks for the other kernels? I know there are issues with vector operations (not yet supported on the custom types), and I think the dispatcher has some constraints?

I messed around with that some as well; check out the branch rjoursle/block_load to see the results. To my knowledge, once we add a block_load() function to ocl_io.h, it is mostly a matter of refactoring kernels to use that interface.


rjoursler · 2025-02-24T17:44:37Z

make test
disable test_device_cpu
enable test_device_gpu

rjoursler requested a review from a team as a code owner February 11, 2025 16:24

github-actions bot added the platform:gpu-intel label Feb 11, 2025

rjoursler force-pushed the rjoursle/inline_load branch 4 times, most recently from 7ca5242 to 1768064 Compare February 11, 2025 20:22

Simonsays095 approved these changes Feb 13, 2025


rjoursler mentioned this pull request Feb 14, 2025

fixes for f8 concat #2695

Merged

skazakov1 approved these changes Feb 21, 2025


kealan-barbieri approved these changes Feb 21, 2025


rjoursler force-pushed the rjoursle/inline_load branch from 1768064 to 55b11a2 Compare February 24, 2025 17:34

rjoursler added 4 commits February 24, 2025 09:38

xe: ocl: add inline load

32cbf64

Adds a new load function overload to remove the need for variable pre-declaration before loading.

xe: ocl: refactor ref_eltwise to use ocl_io.h

d4cd4ba

xe: ocl: refactor ref_bnorm to use ocl_io.h

4e36430

xe: ocl: refactor ref_lrn to use ocl_io.h

4719b1b

rjoursler force-pushed the rjoursle/inline_load branch from 55b11a2 to 4719b1b Compare February 24, 2025 17:39

rjoursler merged commit 983f734 into main Feb 24, 2025
7 of 9 checks passed

rjoursler deleted the rjoursle/inline_load branch February 24, 2025 20:23
