
Please support AVX512_FP16 #2822

Open
Elijah-777 opened this issue Mar 5, 2025 · 5 comments
Assignees: vpirogov
Labels: enhancement (a feature or an optimization request), help wanted

Comments

@Elijah-777 (Author) commented Mar 5, 2025

Chips supporting AVX512_FP16 have been available for more than a year. Why does Intel's open-source compute library still not support AVX512_FP16? AVX512_FP16 is the instruction set extension I want to use.

Elijah-777 added the enhancement label Mar 5, 2025
@vpirogov (Contributor) commented Mar 6, 2025

oneDNN uses instructions from the AVX512_FP16 ISA extension on processors that support the Intel AVX 10.1/512 instruction set (4th and 5th generation Intel Xeon Scalable processors and Intel Xeon 6 processors).

The default numerical behavior of oneDNN functions requires fp32 accumulation, which the FMA instructions in the AVX512_FP16 extension do not support. An implementation could be added under relaxed accumulation mode, but it is not a priority for the core engineering team at the moment.

vpirogov self-assigned this Mar 6, 2025
@shu1chen (Contributor) commented Mar 7, 2025

Hello @DaiShaoJie77, the configuration of the oneDNN functions might prevent the test from utilizing the AVX512_FP16 ISA in some use cases. To help us pinpoint the issue, could you please provide the oneDNN verbose log by setting ONEDNN_VERBOSE=dispatch in your test environment?
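For reference, a minimal sketch of how that variable is typically set for a single run (the binary name `./your_app` below is a placeholder, not something named in this thread):

```shell
# ONEDNN_VERBOSE=dispatch asks oneDNN to log, for each primitive, why a
# candidate implementation was skipped -- which reveals whether the fp16
# kernels were considered at all.
export ONEDNN_VERBOSE=dispatch
# Then run your test binary in the same shell and capture the log, e.g.:
#   ./your_app 2>&1 | tee onednn_verbose.log
echo "ONEDNN_VERBOSE is set to: $ONEDNN_VERBOSE"
```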

Elijah-777 reopened this Mar 7, 2025
@Elijah-777 (Author) commented Mar 7, 2025

> oneDNN uses instructions from AVX512_FP16 ISA extension on processors with Intel AVX 10.1/512 instruction set support (4th and 5th generation Intel Xeon Scalable Processors and Intel Xeon 6 processors).
>
> Default numerical behavior for oneDNN functions requires fp32 accumulation, which is not supported by FMA instructions in AVX512_FP16 extension. This implementation can be added in relaxed accumulation mode, but it's not a priority for the core engineering team at the moment.

Hi, I don't understand what you mean. Do you mean adding an option somewhere to enable fp16? Which file would that be, and where? @vpirogov

@Elijah-777 (Author) commented

> Hello @DaiShaoJie77, the configuration of oneDNN functions might prevent the test from utilizing the AVX512_FP16 ISA in some use cases. For more details of the issue, could you please provide the oneDNN verbose log by setting ONEDNN_VERBOSE=dispatch in the test environment?

I tried setting this parameter as an environment variable, but it only printed some data types, and the instruction set I wanted was not used. @shu1chen

@shu1chen (Contributor) commented Mar 7, 2025

> I tried to set this parameter in the environment variable, but it just printed some data types and did not use the instruction set I wanted.

Please send us the oneDNN verbose log so we can identify the exact issue you're experiencing.

Since we haven't received the verbose log, we're unable to determine how you're using oneDNN. If, for example, your input data is fp32 and you wish to use the AVX512_FP16 ISA, you'll need to set the fpmath mode to f16 and the accumulation mode to relaxed in dnnl::primitive_attr:

    dnnl::primitive_attr attr;
    // Allow implicit down-conversion of f32 data to f16 for computation;
    // the second argument also applies this mode to integer primitives.
    attr.set_fpmath_mode(dnnl::fpmath_mode::f16, true);
    // Accept lower-precision (f16) accumulation instead of the default f32.
    attr.set_accumulation_mode(dnnl::accumulation_mode::relaxed);

You can find more details in the Primitive Attributes section of the oneDNN documentation.
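For context, the attribute object is then passed when a primitive descriptor is created. A minimal sketch, assuming oneDNN 3.x; the matmul primitive, engine, and memory descriptors here are illustrative and not taken from this thread:

    // Sketch only: assumes an already-created engine and memory
    // descriptors (src_md, weights_md, dst_md) for the matmul operands.
    auto matmul_pd = dnnl::matmul::primitive_desc(
            engine, src_md, weights_md, dst_md, attr);
    auto matmul_prim = dnnl::matmul(matmul_pd);

With the attributes applied, the verbose log (ONEDNN_VERBOSE=dispatch) shows which implementation was actually selected.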
