[GPU] POC for new OCL kernels approach #26985
c437546 to 9edd03c
src/plugins/intel_gpu/src/graph/impls/ocl_new/sdpa_gen_micro.hpp
src/plugins/intel_gpu/src/graph/impls/ocl_new/primitive_ocl_base.hpp
b50626a to 81014d2
fc80b87 to d3d6913
@@ -0,0 +1,270 @@
// Copyright (C) 2023 Intel Corporation
Random spot:
It is a little unintuitive to me that kernels_cache and the header cl kernels are under impls, while the cl kernels are under the kernels_selector directory, and impls and kernels_cache sit at the same level of the hierarchy.
(There are also header cl kernels under the kernels_selector directory.)
What was the main idea behind the new directory structure? Could you add a comment describing the criteria?
The main idea is that backend-specific code should be located in the impls/${backend_name} folder. The headers for the kernel selector are still in the kernel selector (they are duplicated for now to avoid any dependencies between components). The kernel selector will be removed in the future, so the whole structure of the ocl backend will look like the ocl_v2 version.
LSTMImpl() : PrimitiveImplOCL(LSTMSeqImplementationManager::get_type_info_static()) {}
LSTMImpl(const program_node& node, const kernel_impl_params& params) : LSTMImpl() {
    add_stage(lstm_loop, params);
    add_stage(lstm_gemm, params);
The GEMM kernel should be executed before the loop kernel here.
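To illustrate the ordering concern, here is a minimal, self-contained sketch. It assumes that stages execute in the order they are registered, which this thread does not confirm for the real `add_stage`. `StagedImpl`, `StageFn`, and `make_lstm_like_impl` are hypothetical stand-ins, not the plugin's actual classes; only the stage names `lstm_gemm` and `lstm_loop` come from the diff.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a multi-stage primitive implementation.
// Assumption: stages run in the order in which they were registered.
class StagedImpl {
public:
    using StageFn = std::function<void(std::vector<std::string>&)>;

    // Registers a named stage at the end of the execution sequence.
    void add_stage(std::string name, StageFn fn) {
        stages_.emplace_back(std::move(name), std::move(fn));
    }

    // Runs every stage in registration order and returns the execution log.
    std::vector<std::string> execute() const {
        std::vector<std::string> log;
        for (const auto& stage : stages_) {
            stage.second(log);
        }
        return log;
    }

private:
    std::vector<std::pair<std::string, StageFn>> stages_;
};

// Builds an LSTM-like implementation with the GEMM stage registered first,
// so that the loop stage can consume the GEMM result.
StagedImpl make_lstm_like_impl() {
    StagedImpl impl;
    impl.add_stage("lstm_gemm", [](std::vector<std::string>& log) { log.push_back("lstm_gemm"); });
    impl.add_stage("lstm_loop", [](std::vector<std::string>& log) { log.push_back("lstm_loop"); });
    return impl;
}
```

Under that assumption, swapping the two `add_stage` calls in the constructor would be enough to make the GEMM stage run first.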
7a6f06b to 1b3f31d
Signed-off-by: Vladimir Paramuzov <[email protected]>
1b3f31d to e868dde
Details: