Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Matrix] Add support for tf32 type using the unified interface #8702

Merged
merged 12 commits into from
Apr 6, 2023

Conversation

dkhaldi
Copy link
Contributor

@dkhaldi dkhaldi commented Mar 20, 2023

No description provided.

@dkhaldi dkhaldi marked this pull request as ready for review March 20, 2023 18:19
@dkhaldi dkhaldi requested a review from a team as a code owner March 20, 2023 18:19
@dkhaldi dkhaldi requested a review from bso-intel March 20, 2023 18:19
auto wi_data_a =
sycl::ext::intel::experimental::matrix::get_wi_data(sg, sub_a);
for (int i = 0; i < wi_data_a.length(); i++) {
wi_data_a[i] = round_to_tf32(wi_data_a[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, but shouldn't this type of usage of round_to_tf32 really use joint_matrix_apply?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will use joint_matrix_apply for A and leave the loop way for B to show both usages.

@JackAKirk
Copy link
Contributor

I think this should be fine from NVPTX point of view: will wait for confirmation from the cuda llvm-test-suite run.

@@ -0,0 +1,164 @@
// RUN: %clangxx -fsycl -O2 -DSYCL_EXT_ONEAPI_MATRIX_VERSION=4 %s -o %t.out
#include <iostream>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this kind of runtime test be placed in llvm-test-suite/SYCL/Matrix instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update the header to only check compilation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use -emit-llvm and FileCheck shouldn't we add checks for the actual IR generated by the compiler?

@dkhaldi dkhaldi temporarily deployed to aws March 20, 2023 20:50 — with GitHub Actions Inactive
Copy link
Contributor

@MrSidims MrSidims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes are LGTM

@@ -0,0 +1,164 @@
// RUN: %clangxx -fsycl -O2 -DSYCL_EXT_ONEAPI_MATRIX_VERSION=4 %s -o %t.out
#include <iostream>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use -emit-llvm and FileCheck shouldn't we add checks for the actual IR generated by the compiler?

joint_matrix_apply(sg, sub_a, [=](float x) { x *= 2; });
for (int i = 0; i < wi_slice_a.length(); i++) {
float elem = wi_slice_a[i];
wi_slice_a[i] *= 2;
Copy link
Contributor Author

@dkhaldi dkhaldi Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this ewo line to ewo test in llvm-test-suite

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests are added here: intel/llvm-test-suite#1677

@dkhaldi
Copy link
Contributor Author

dkhaldi commented Mar 22, 2023

@MrSidims, I am changing the name to __spirv_RoundFToTF32INTEL
Is this what you will be using?

@dkhaldi dkhaldi temporarily deployed to aws March 22, 2023 19:25 — with GitHub Actions Inactive
@dkhaldi dkhaldi temporarily deployed to aws March 23, 2023 18:12 — with GitHub Actions Inactive
Copy link
Contributor

@yubingex007-a11y yubingex007-a11y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dkhaldi dkhaldi temporarily deployed to aws March 24, 2023 14:12 — with GitHub Actions Inactive
@dkhaldi dkhaldi temporarily deployed to aws March 28, 2023 13:13 — with GitHub Actions Inactive
@yubingex007-a11y yubingex007-a11y self-assigned this Apr 3, 2023
@yubingex007-a11y
Copy link
Contributor

e2e testcases got approval in intel/llvm-test-suite#1677

@yubingex007-a11y
Copy link
Contributor

ping?

@yubingex007-a11y yubingex007-a11y temporarily deployed to aws April 3, 2023 12:20 — with GitHub Actions Inactive
@dkhaldi
Copy link
Contributor Author

dkhaldi commented Apr 3, 2023

@yubingex007-a11y the build is failing here because the round function is not available yet in intel/llvm.
Will #8846 solve the build issue?

@yubingex007-a11y yubingex007-a11y temporarily deployed to aws April 4, 2023 06:51 — with GitHub Actions Inactive
@yubingex007-a11y yubingex007-a11y temporarily deployed to aws April 4, 2023 07:24 — with GitHub Actions Inactive
@yubingex007-a11y
Copy link
Contributor

the testcases fail are not related with this pr i think.

@yubingex007-a11y
Copy link
Contributor

ping?

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small naming change request, but otherwise LGTM.

template <typename T> struct helper_traits {
using element_type = T;
using storage_element_type = T;
using fill_argument_type = T;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove fill_argument_type as it is currently unused

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@JackAKirk JackAKirk self-requested a review April 5, 2023 13:45
@yubingex007-a11y yubingex007-a11y temporarily deployed to aws April 5, 2023 14:08 — with GitHub Actions Inactive
@yubingex007-a11y yubingex007-a11y temporarily deployed to aws April 5, 2023 14:40 — with GitHub Actions Inactive
@yubingex007-a11y
Copy link
Contributor

@steffenlarsen Hi, can you help us merge this patch

@steffenlarsen
Copy link
Contributor

@yubingex007-a11y - Could you please push another merge commit to rerun pre-commit tests? CI issues should hopefully have been addressed.

@yubingex007-a11y yubingex007-a11y temporarily deployed to aws April 6, 2023 10:39 — with GitHub Actions Inactive
@yubingex007-a11y yubingex007-a11y temporarily deployed to aws April 6, 2023 11:11 — with GitHub Actions Inactive
@steffenlarsen steffenlarsen merged commit aba6d85 into intel:sycl Apr 6, 2023
//===----------------------------------------------------------------------===//
// REQUIRES: matrix

// RUN: %clangxx -fsycl %s -o %t.out -Dsycl/test-e2e_EXT_ONEAPI_MATRIX_VERSION=4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be -DSYCL_EXT_ONEAPI_MATRIX_VERSION=4

//===----------------------------------------------------------------------===//
// REQUIRES: matrix

// RUN: %clangxx -fsycl %s -o %t.out -Dsycl/test-e2e_EXT_ONEAPI_MATRIX_VERSION=4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be -DSYCL_EXT_ONEAPI_MATRIX_VERSION=4

auto wi_data_b =
sycl::ext::intel::experimental::matrix::get_wi_data(sg, sub_b);
for (int i = 0; i < wi_data_b.length(); i++) {
wi_data_b[i] = round_to_tf32(wi_data_b[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No known conversion from wi_element<> to float &

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants