Updated CMAKE_MODULE_PATH for HIP #516
ROCm 6.x has cmake modules installed under ${ROCM_PATH}/lib/cmake. This fixes the cmake configure error:

-- Found HIP: 6.2.41134
-- HIP PATH: /opt/COE_modules/rocm/rocm-6.2.2
CMake Error at CMakeLists.txt:380 (hip_add_library):
  Unknown CMake command "hip_add_library".
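As a rough way to check this layout on a particular machine, the sketch below lists the directories under `<prefix>/lib/cmake` that contain `FindHIP.cmake`, the module that defines `hip_add_library`. The helper name `hip_module_dirs` is hypothetical and is not part of QuEST or ROCm; the prefix is whatever `ROCM_PATH` points to on your system.

```shell
#!/bin/bash
# Hypothetical helper (not part of QuEST or ROCm): print every directory
# under <prefix>/lib/cmake that contains FindHIP.cmake.
# Returns non-zero if <prefix>/lib/cmake does not exist at all.
hip_module_dirs() {
    local prefix="$1"
    [ -d "${prefix}/lib/cmake" ] || return 1
    # dirname turns each FindHIP.cmake hit into its containing directory
    find "${prefix}/lib/cmake" -name FindHIP.cmake -type f -exec dirname {} \; | sort
}

# Example call (assumes ROCM_PATH is set by your environment or modules):
#   hip_module_dirs "${ROCM_PATH:-/opt/rocm}"
```

Any directory it prints is a candidate value for CMAKE_MODULE_PATH; empty output suggests the module lives elsewhere in that install.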
Hi Martin,
My own attempts to reinstate HIP support are currently bashing up against incompatibilities in the complex datatypes, possibly related to this issue 😅
Ah bother! I could try to avoid the arithmetic overloads unsupported by HIP, or add additional compiler guards for HIP as necessary. I am arranging to get access to an AMD machine - I'll report back after trying your current build!
Sounds good -- I was working on it here!
To my dismay, I have totally failed in gaining access to a HIP-compatible AMD GPU - both DiRAC and ARC have only NVIDIA GPUs, and there's nobody in Oxford I can reach with an AMD machine! 😤 Alas I cannot test HIP compilation. If indeed these I'll fork your |
Made this PR to try to compile on AMD - although it was definitely heartbreaking 😢
@TysonRayJones Bad news first: still doesn't compile 😢 I've attached the rather lengthy compiler error report. Looks like a mix of being unable to disambiguate the right constructor, and still finding some inline operator overloads. Might just be a case of needing to check

The good news: You can access ARCHER2's AMD GPUs! If you take the ARCHER2 driving test you can get access for a year. If you want to go that route and don't have a current UK academic affiliation, just let me know, and I can chat to the training manager to make sure your account still gets approved. I can also send you my modules and cmake to get you up and running faster! If you don't have time, I'm happy to keep jumping on and recompiling myself!
Oh, also I made a small modification to my cmake-amd branch to prevent the annoying redefinition of
Drat! Some of those errors I can fix, but others I need to experiment with. Getting access to ARCHER2 would be great, I'll happily take the test! I believe I have "academic visitor" status at Oxford (which Simon Benjamin can corroborate), but I no longer have access to an Oxford institutional email address (only my EPFL one,
Have at it! Clair knows to look out for your email address 😁 GitHub (probably wisely) won't let me upload bash scripts here, but it will let me include them as code blocks.

modules.sh:

#!/bin/bash
module load PrgEnv-gnu
module load rocm
module load craype-accel-amd-gfx90a
module load craype-x86-milan
module load cmake

cmake.sh, run this to build QuEST for AMD GPU on ARCHER2 (or try to anyway):

#!/bin/bash
VERBOSE_LIB_NAME=OFF
ENABLE_MULTITHREADING=ON
ENABLE_DISTRIBUTION=OFF
ENABLE_TESTING=OFF
ENABLE_EXAMPLES=OFF
ENABLE_HIP=ON
HIP_ARCH=gfx90a
cmake -B build -G "Unix Makefiles" \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=CC \
-DCMAKE_C_COMPILER=cc \
-DVERBOSE_LIB_NAME=${VERBOSE_LIB_NAME} \
-DENABLE_TESTING=${ENABLE_TESTING} \
-DBUILD_EXAMPLES=${ENABLE_EXAMPLES} \
-DENABLE_MULTITHREADING=${ENABLE_MULTITHREADING} \
-DENABLE_DISTRIBUTION=${ENABLE_DISTRIBUTION} \
-DENABLE_HIP=${ENABLE_HIP} \
-DCMAKE_HIP_ARCHITECTURES=${HIP_ARCH}

If you get as far as trying to run it, you can find example job submission scripts here!
@mhilgema-amd I've been experimenting with your nominated change in v4 and I've become very confused by it. My testing seems to indicate that ROCm 6.3.3 places

I am using GitHub Actions to install

sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
sudo apt install python3-setuptools python3-wheel
sudo usermod -a -G render,video $USER
wget https://repo.radeon.com/amdgpu-install/6.3.3/ubuntu/noble/amdgpu-install_6.3.60303-1_all.deb
sudo apt install ./amdgpu-install_6.3.60303-1_all.deb
sudo apt update
sudo apt install amdgpu-dkms rocm

This creates

/opt/rocm/lib/libamdhip64.so
/opt/rocm/cmake

but does not create

CMAKE_MODULE_PATH = "/opt/rocm/cmake"

but fails to compile (

CMAKE_MODULE_PATH = "/opt/rocm/lib/cmake/hip"
CMAKE_MODULE_PATH = "/opt/rocm/hip/lib/cmake/hip" # doesn't exist

Have I missed something? My testing below indicates the cmake files in

I've been testing on ARCHER2 with

/opt/rocm/lib/libamdhip64.so
/opt/rocm/hip/lib/libamdhip64.so

and cmake files (to which we must set

I can only ever get HIP to compile with the first and last cmake files. This seems consistent with what I'm seeing in my GitHub Action which does not create the

Is there some overarching understanding I'm missing? Getting things working was an incredible chore; it seems like the location of the relevant files is a bit of a mess which has just cost me an entire night of sleep 😅
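To compare installs such as the GitHub Actions runner and ARCHER2 side by side, a small diagnostic along these lines may help. It walks the three CMAKE_MODULE_PATH candidates mentioned in this thread and reports whether each directory exists and whether it holds `FindHIP.cmake`. The helper name `report_hip_candidates` is hypothetical, and checking for `FindHIP.cmake` is only a heuristic: a directory that contains it is not guaranteed to give a working build.

```shell
#!/bin/bash
# Hypothetical helper: for each CMAKE_MODULE_PATH candidate discussed in this
# thread, report whether it exists under <prefix> and whether it contains
# FindHIP.cmake. This is a heuristic check only; it does not run cmake.
report_hip_candidates() {
    local prefix="$1" rel dir
    for rel in cmake lib/cmake/hip hip/lib/cmake/hip; do
        dir="${prefix}/${rel}"
        if [ ! -d "$dir" ]; then
            echo "$dir: missing"
        elif [ -f "$dir/FindHIP.cmake" ]; then
            echo "$dir: has FindHIP.cmake"
        else
            echo "$dir: exists, no FindHIP.cmake"
        fi
    done
}

# Example call (the /opt/rocm prefix is taken from the paths quoted above):
#   report_hip_candidates /opt/rocm
```

Running it on both machines and diffing the output gives a quick view of which layout each ROCm install actually uses.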