Commit 2109547

weidu-tpvision, hariharans29, and edgchen1 authored
MLAS: use vmlaq_f32 for ARMv7 targets (microsoft#26161)
Use vmlaq_f32 on MLAS_TARGET_ARM (armv7) so that Linux and Android arm32 builds don't attempt to use the armv8 (aarch64) vfmaq_f32 intrinsic. Fixes build failures referencing vfmaq_f32 in arm_neon.h.

### Description

On ARMv7 targets (32-bit ARM with NEON) the vfmaq_f32 intrinsic is unavailable and causes build failures like:

    target specific option mismatch
    1728 | vfmaq_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c)

The previous code fell back to vmlaq_f32 only when __ANDROID__ was defined, which fixes Android builds but not Linux ARM32 builds.

### Motivation and Context

The proposal is to change:

    #if defined(__ANDROID__) && defined(MLAS_TARGET_ARM)

to:

    #if defined(MLAS_TARGET_ARM)

This change ensures that 32-bit ARM builds (Linux and Android) use vmlaq_f32 while 64-bit ARM builds use vfmaq_f32.

This PR is related to issue microsoft#25949, which I also opened.

---------

Signed-off-by: Wei Du <[email protected]>
Co-authored-by: Hariharan Seshadri <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
1 parent 4de3152 commit 2109547

1 file changed

+3
-3
lines changed

onnxruntime/core/mlas/lib/mlasi.h

Lines changed: 3 additions & 3 deletions
@@ -2305,8 +2305,8 @@ MLAS_FLOAT32X4
 MlasMultiplyAddFloat32x4(MLAS_FLOAT32X4 Vector1, MLAS_FLOAT32X4 Vector2, MLAS_FLOAT32X4 Vector3)
 {
 #if defined(MLAS_NEON_INTRINSICS)
-#if defined(__ANDROID__) && defined(MLAS_TARGET_ARM)
-    // Android armeabi-v7a ABI doesn't have vfmaq_f32()
+#if defined(MLAS_TARGET_ARM)
+    // ARMv7 NEON doesn't have vfmaq_f32()
     return vmlaq_f32(Vector3, Vector1, Vector2);
 #else
     return vfmaq_f32(Vector3, Vector1, Vector2);
@@ -2863,4 +2863,4 @@ MlasPackInt4Elements(uint8_t* Output, UnpackedType ValueLow, UnpackedType ValueH
 {
     static_assert(std::is_same_v<UnpackedType, uint8_t> || std::is_same_v<UnpackedType, int8_t>);
     *Output = static_cast<uint8_t>(((ValueHigh & 0xF) << 4) | (ValueLow & 0xF));
-}
+}
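
The selection logic in the first hunk can be sketched as a standalone helper. This is only an illustration under stated assumptions, not the MLAS implementation: the `SKETCH_*` macros, the `Float4` struct, and the `MultiplyAdd` function are hypothetical names invented here, and the compiler-defined macros used for target detection (`__arm__`, `__aarch64__`, `__ARM_NEON`) are assumed to stand in for whatever MLAS uses to define `MLAS_TARGET_ARM` and `MLAS_NEON_INTRINSICS`. A scalar fallback is included so that the sketch also compiles on non-ARM hosts.

```cpp
#include <cassert>

// Hypothetical target macros for this sketch only; MLAS defines its own
// MLAS_TARGET_ARM and MLAS_NEON_INTRINSICS elsewhere.
#if defined(__aarch64__) || defined(_M_ARM64)
#define SKETCH_TARGET_ARM64
#elif defined(__arm__) || defined(_M_ARM)
#define SKETCH_TARGET_ARM32
#endif

#if (defined(SKETCH_TARGET_ARM64) || defined(SKETCH_TARGET_ARM32)) && \
    (defined(__ARM_NEON) || defined(__ARM_NEON__))
#include <arm_neon.h>
#define SKETCH_NEON
#endif

// Hypothetical four-lane float container used instead of MLAS_FLOAT32X4.
struct Float4 {
    float v[4];
};

// Computes a[i] * b[i] + c[i] for each of the four lanes.
inline Float4 MultiplyAdd(const Float4& a, const Float4& b, const Float4& c)
{
    Float4 r;
#if defined(SKETCH_NEON)
    float32x4_t va = vld1q_f32(a.v);
    float32x4_t vb = vld1q_f32(b.v);
    float32x4_t vc = vld1q_f32(c.v);
#if defined(SKETCH_TARGET_ARM32)
    // 32-bit ARM (Linux or Android): use the multiply-accumulate intrinsic.
    float32x4_t vr = vmlaq_f32(vc, va, vb);
#else
    // 64-bit ARM: the fused multiply-add intrinsic is available.
    float32x4_t vr = vfmaq_f32(vc, va, vb);
#endif
    vst1q_f32(r.v, vr);
#else
    // Scalar fallback so the sketch builds on hosts without NEON.
    for (int i = 0; i < 4; ++i) {
        r.v[i] = a.v[i] * b.v[i] + c.v[i];
    }
#endif
    return r;
}
```

The point of keying the branch on the 32-bit ARM target alone, rather than on the operating system as well, is that the availability of the intrinsic depends on the architecture, which is the reasoning given in the commit message.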
