Add SVE, SVE2, SVEBITPERM, and FP16 feature detection on Windows/ARM64 #55
Draft
raneashay wants to merge 1 commit into microsoft:main from
This patch adds detection of SVE, SVE2, SVEBITPERM, FPHP, and ASIMDHP on supported ARM64 processors running Windows. Because the MSVC compiler does not support inline assembly on ARM64 processors, this patch introduces a separate assembly file that reads VL using the `rdvl` instruction. Windows does not expose a mechanism for writing VL, so this patch makes `set_and_get_current_sve_vector_length()` simply return the existing VL.

This patch introduces a new test (TestSVEFeatureDetection.java) that validates the SVE level and VL determined by the CPU feature detection code, and it modifies two existing tests to decouple support for SVE/SVE2 from support for FPHP and ASIMDHP. Specifically, in TestFloat16VectorOperations.java, SVE alone is insufficient to expect half-precision vector operations; instead, FPHP and ASIMDHP support (which is already exercised by the test case) suffices. Along similar lines, in TestReductions.java, we should expect to see a non-zero number of vector operations when SVE is available, and we should fail on vector operations when SVE is unavailable.

Finally, this patch updates TestFloat16ScalarOperations.java to check for constant-folding of FMA operations only on non-Windows platforms. We do this because `FmaDNode::Value()`, `FmaFNode::Value()`, and `FMAHFNode::Value()` fold FMA nodes only when `__STDC_IEC_559__` is defined, which is not the case on Windows with either GCC or MSVC.

We may not have discovered this discrepancy until now because FMA support on Windows/ARM64 was disabled before this patch, so the tests predicated on FPHP/ASIMDHP support never ran on Windows. This does not explain why we never caught the problem on Windows/x86 machines that support FMAs, but that could be because processors that support `avx512_fp16` are new and we have not run CI on such machines.
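For reviewers who want to see the `__STDC_IEC_559__` gate in isolation, here is a minimal C++ sketch of the pattern. It is not the HotSpot code: `try_fold_fma` and `kFoldsFma` are hypothetical names used only for illustration, and the real `Value()` methods operate on compiler IR types rather than raw doubles.

```cpp
#include <cmath>

// Hypothetical helper (not HotSpot code). Folds a * b + c at "compile time"
// only when the toolchain defines __STDC_IEC_559__; otherwise it declines
// to fold and leaves *out untouched.
inline bool try_fold_fma(double a, double b, double c, double* out) {
#ifdef __STDC_IEC_559__
  *out = std::fma(a, b, c);  // fused multiply-add with a single rounding
  return true;
#else
  (void)a; (void)b; (void)c; (void)out;
  return false;              // caller keeps the FMA node as-is
#endif
}

// Hypothetical flag: true when this translation unit saw the macro.
constexpr bool kFoldsFma =
#ifdef __STDC_IEC_559__
    true;
#else
    false;
#endif
```

On a toolchain that does not define the macro, `try_fold_fma` always returns `false`, which is why a test that expects a folded constant would fail there even though the hardware supports FMA.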