use -march=native rather than -xHost for Intel oneAPI compilers >= 2025.0 #4782

Open
wants to merge 1 commit into
base: develop
from

boegel
Member

@boegel boegel commented Mar 3, 2025

fixes #4744

@boegel boegel added change EasyBuild-5.0 EasyBuild 5.0 labels Mar 3, 2025
@boegel boegel added this to the 5.0 milestone Mar 3, 2025
@boegel boegel changed the title use -march=native rather than -xHost for Intel oneAPI compilers >= 2025.0 use -march=native rather than -xHost for Intel oneAPI compilers >= 2025.0 Mar 3, 2025
@boegel boegel marked this pull request as ready for review March 3, 2025 15:14
@boegel boegel moved this to Changed default in EasyBuild v5.0 Mar 3, 2025
@boegel boegel requested a review from Micket March 3, 2025 16:49
Micket
Micket previously approved these changes Mar 3, 2025
Contributor

@Micket Micket left a comment

I have been using march=native for intel-2024a, but mostly tested on zen4, where things seem fine. I guess we should build some random crap on icelake nodes and compare the two flags.

I'll ask my colleagues to help try it out.

@Micket
Contributor

Micket commented Mar 7, 2025

I tried some miscellaneous code from the Debian language shootout game.

fannkuchredux fortran - no difference
spectralnorm c - no difference
pidigits fortran - no difference
nbody c++ - no difference
spectralnorm fortran - 3x difference. I don't know what's happening with that one case. https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/spectralnorm-ifx-3.html

That test actually becomes better when I don't even specify -march at all (but -xHost is better)

[ohmanm@alvis2 shootout]$ ifx -O3 -ipo -qopenmp spectralnorm.ifx-3.f90  -o spectralnorm.ifx-3.ifx_run
[ohmanm@alvis2 shootout]$ time ./spectralnorm.ifx-3.ifx_run2 10000
1.274224153

real    0m0.701s
user    0m6.277s
sys     0m0.205s
[ohmanm@alvis2 shootout]$ ifx -O3 -march=native -ipo -qopenmp spectralnorm.ifx-3.f90  -o spectralnorm.ifx-3.ifx_run
[ohmanm@alvis2 shootout]$ time ./spectralnorm.ifx-3.ifx_run 10000
1.274224153

real    0m1.793s
user    0m17.510s
sys     0m0.220s

Ugh, why couldn't this just be simple.

edit: This was with intel/2024a on Intel Ice Lake + Skylake. On AMD Zen4, binaries built with -xHOST didn't work at all (it does run when built with -march=native)
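
To put a number on the slowdown in the two `time` runs above, here is a minimal Python sketch that parses the pasted output and divides the timings. The helper name `parse_time_output` is made up for illustration; the numbers are copied from the log as-is, where the first build has no -march flag and the second uses -march=native.

```python
import re


def parse_time_output(text):
    """Parse bash `time` lines such as 'real 0m0.701s' into seconds."""
    times = {}
    for line in text.splitlines():
        m = re.match(r"^(real|user|sys)\s+(\d+)m([\d.]+)s$", line.strip())
        if m:
            times[m.group(1)] = int(m.group(2)) * 60 + float(m.group(3))
    return times


# Timings copied from the log above (a single run each).
no_march = parse_time_output("""
real    0m0.701s
user    0m6.277s
sys     0m0.205s
""")
march_native = parse_time_output("""
real    0m1.793s
user    0m17.510s
sys     0m0.220s
""")

# Ratio > 1 means the -march=native build is slower.
slowdown = {k: march_native[k] / no_march[k] for k in ("real", "user")}
for key, ratio in slowdown.items():
    print(f"{key}: {ratio:.2f}x")
```

On these figures the -march=native build is roughly 2.6x slower in wall-clock time and 2.8x in CPU time. That is from one run per build, so it is indicative only.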

@boegel
Member Author

boegel commented Mar 8, 2025

This result is reason enough for me to put this on hold, and keep using -xHOST for now, until we learn more...

@Micket
Contributor

Micket commented Mar 8, 2025

Just did some more testing; I think the test examples I ran from the shootout aren't that great for testing this. Many of them aren't really affected by advanced instructions at all, and when they are, it seems to me that -xHost wins.

Maybe the -march=... flags are just using LLVM's optimization passes, while -xHost does the "Intel-specific magic"? They are definitely not doing the same code gen.
But the counterexample where it fails hard is basically just some trivial dot-product loops. And -march=native seems to do such a crappy job there that it's worse than -march=generic, so something is going horribly wrong with the spectralnorm example when using Fortran.

But to be clear, -xHost is not a viable option on non-Intel CPUs (the binaries won't run).
Those with mixed systems should probably consider -march=... -ax=.... as that would make a fat binary that dispatches on Intel hardware.

I wonder if we could just set this conditionally? I looked for a nice clean way to detect whether it's an AMD system, but I found nothing useful.
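
One way the conditional choice could look, as a rough sketch only: read the `vendor_id` field from `/proc/cpuinfo` (Linux on x86 only) and fall back to -march=native for anything that is not Intel. The function names `cpu_vendor` and `host_arch_flag` are hypothetical and not part of EasyBuild, and the policy of keeping -xHost on Intel CPUs is an assumption based on the results in this thread.

```python
import os


def cpu_vendor(cpuinfo_text):
    """Return the first vendor_id value found in /proc/cpuinfo-style text, or None."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "vendor_id":
            return value.strip()
    return None


def host_arch_flag(vendor):
    """Pick a host-optimization flag for the Intel compilers (assumed policy).

    -xHost only for Intel CPUs, since binaries built with it were reported
    not to run on AMD Zen4; -march=native for everything else.
    """
    if vendor == "GenuineIntel":
        return "-xHost"
    return "-march=native"


# Only meaningful on Linux; silently skipped where /proc/cpuinfo is absent.
if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as fh:
        vendor = cpu_vendor(fh.read())
    print(vendor, "->", host_arch_flag(vendor))
```

This only looks at the build host, so it would not help on mixed clusters where binaries built on an Intel node later run on AMD nodes.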

@boegel boegel modified the milestones: 5.0.0, 5.x Mar 17, 2025
@boegel boegel changed the base branch from 5.0.x to develop March 19, 2025 10:28
@boegel boegel dismissed Micket’s stale review March 19, 2025 10:28

The base branch was changed.

@boegel boegel modified the milestones: 4.x, 5.x Mar 19, 2025
@boegel boegel modified the milestones: 5.x.x, 5.x Apr 9, 2025
Status: Changed default
Development

Successfully merging this pull request may close these issues.

With new clang based intel compilers (ifx, icx, icpx) we should use -march=native
2 participants