@fwyzard
Contributor

@fwyzard fwyzard commented May 10, 2025

MPICH is not selected by default, as it would conflict with Open MPI.

Fix the Open MPI version to 4.1.9a1.

Add a generic MPI tool that uses Open MPI by default. CMSSW packages can declare <use name="mpi"/> to use the generic tool. Updating the tool to use MPICH instead of Open MPI then automatically updates all of those packages.

@fwyzard
Contributor Author

fwyzard commented May 10, 2025

please test

@cmsbuild
Contributor

A new Pull Request was created by @fwyzard for branch IB/CMSSW_15_1_X/master.

@iarspider, @smuzaffar can you please review it and eventually sign? Thanks.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented May 10, 2025

cms-bot internal usage

@fwyzard
Contributor Author

fwyzard commented May 10, 2025

The idea is that alternative MPI implementations, like Open MPI 5.x or MPICH, can be added in parallel to Open MPI 4.x as "available" tools.

Changing the MPI implementation in CMSSW then requires only updating the mpi.xml tool and rebuilding the affected packages.

It is a known issue that this does not address the externals.
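
A minimal sketch of the idea, in Python, assuming the generic mpi.xml toolfile does nothing more than forward to one implementation through a single `<use>` element. The toolfile content and the helper `switch_mpi_implementation` are hypothetical and only illustrate why a single edit is enough; the real toolfile in this PR may look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical content of the generic toolfile; the real mpi.xml may differ.
MPI_TOOLFILE = """<tool name="mpi" version="1.0">
  <use name="openmpi"/>
</tool>"""

# Tool names taken from the toolfiles mentioned in this thread.
KNOWN_IMPLEMENTATIONS = {"openmpi", "mpich"}


def switch_mpi_implementation(toolfile_xml: str, new_impl: str) -> str:
    """Return the toolfile text with its MPI dependency replaced by new_impl."""
    if new_impl not in KNOWN_IMPLEMENTATIONS:
        raise ValueError(f"unknown MPI implementation: {new_impl}")
    root = ET.fromstring(toolfile_xml)
    # Only <use> elements that point at an MPI implementation are candidates.
    uses = [u for u in root.findall("use") if u.get("name") in KNOWN_IMPLEMENTATIONS]
    if len(uses) != 1:
        raise ValueError("expected exactly one MPI <use> element")
    uses[0].set("name", new_impl)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(switch_mpi_implementation(MPI_TOOLFILE, "mpich"))
```

Packages that declare the generic tool never name an implementation themselves, so under this assumption only the one `<use>` line changes and the packages are rebuilt.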

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46020/summary.html
COMMIT: aa35319
CMSSW: CMSSW_15_1_X_2025-05-09-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9845/46020/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46020/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46020/git-merge-result

Comparison Summary

Summary:

  • You potentially added 17 lines to the logs
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 4038131
  • DQMHistoTests: Total failures: 44
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4038067
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 215 log files, 184 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@cmsbuild
Contributor

Pull request #9845 was updated.

@fwyzard fwyzard changed the title from "Add a generic MPI tool, with default to Open MPI 4.1" to "Add MPICH v4.3 and a generic MPI tool" May 11, 2025
@fwyzard
Contributor Author

fwyzard commented May 11, 2025

please test

@cmsbuild
Contributor

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46022/summary.html
COMMIT: f9c2aa8
CMSSW: CMSSW_15_1_X_2025-05-11-0000/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9845/46022/install.sh to create a dev area with all the needed externals and cmssw changes.

Build

I found compilation error when building:

>> Leaving Package Utilities/RelMon
>> Package Utilities/RelMon built
Copying tmp/el8_amd64_gcc12/src/DataFormats/SoATemplate/test/testRocmSoALayoutAndView_t/libtestRocmSoALayoutAndView_t_rocm.a to productstore area:
cp: cannot stat 'tmp/el8_amd64_gcc12/src/DataFormats/SoATemplate/test/testRocmSoALayoutAndView_t/libtestRocmSoALayoutAndView_t_rocm.a': No such file or directory
>> Deleted: tmp/el8_amd64_gcc12/src/DataFormats/SoATemplate/test/testRocmSoALayoutAndView_t/libtestRocmSoALayoutAndView_t_rocm.a
gmake: *** [config/SCRAM/GMake/Makefile.rules:1886: tmp/el8_amd64_gcc12/src/DataFormats/SoATemplate/test/testRocmSoALayoutAndView_t/libtestRocmSoALayoutAndView_t_rocm.a] Error 1
Copying tmp/el8_amd64_gcc12/src/HeterogeneousTest/CUDAWrapper/src/HeterogeneousTestCUDAWrapper/libHeterogeneousTestCUDAWrapper_nv.a to productstore area:
Copying tmp/el8_amd64_gcc12/src/HeterogeneousTest/ROCmWrapper/src/HeterogeneousTestROCmWrapper/libHeterogeneousTestROCmWrapper_rocm.a to productstore area:
cp: cannot stat 'tmp/el8_amd64_gcc12/src/HeterogeneousTest/ROCmWrapper/src/HeterogeneousTestROCmWrapper/libHeterogeneousTestROCmWrapper_rocm.a': No such file or directory
>> Deleted: tmp/el8_amd64_gcc12/src/HeterogeneousTest/ROCmWrapper/src/HeterogeneousTestROCmWrapper/libHeterogeneousTestROCmWrapper_rocm.a
gmake: *** [config/SCRAM/GMake/Makefile.rules:1886: tmp/el8_amd64_gcc12/src/HeterogeneousTest/ROCmWrapper/src/HeterogeneousTestROCmWrapper/libHeterogeneousTestROCmWrapper_rocm.a] Error 1


@fwyzard
Contributor Author

fwyzard commented May 12, 2025

please test

@cmsbuild
Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46027/summary.html
COMMIT: f9c2aa8
CMSSW: CMSSW_15_1_X_2025-05-11-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9845/46027/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation warning when building: See details on the summary page.

@fwyzard fwyzard force-pushed the IB/CMSSW_15_1_X/master_generic_mpi branch from f9c2aa8 to d901128 Compare May 12, 2025 05:44
@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46194/summary.html
COMMIT: 12a2ec5
CMSSW: CMSSW_15_1_X_2025-05-15-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9845/46194/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 2 lines to the logs
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 4038193
  • DQMHistoTests: Total failures: 24
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4038149
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 215 log files, 184 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@fwyzard
Contributor Author

fwyzard commented May 20, 2025

@smuzaffar is this OK to be merged ?

@smuzaffar
Contributor

please test for el9_amd64_gcc12

@smuzaffar
Contributor

please test for el8_amd64_gcc14

@smuzaffar
Contributor

please test for el8_aarch64_gcc12

@smuzaffar
Contributor

> @smuzaffar is this OK to be merged ?

@fwyzard, overall this looks good, but I see that the following files are common to openmpi and mpich:

  • shared libs: libmpi.so
  • headers: mpif.h mpi.h

So how do you foresee both tools being available at the same time in CMSSW? I know the mpich tool is currently disabled, so merging this PR should be fine, but what are the future plans for these? With the existing distribution we cannot have both tools available in CMSSW, especially if we link or load libmpi.so at runtime, or if something includes one of these common header files.
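
The overlap described above can be checked mechanically. The following is a rough Python sketch that compares two install trees and lists the relative paths present in both. The helper names and the fake trees are hypothetical: only libmpi.so, mpi.h and mpif.h come from this thread, and the other file names are placeholders, not real package contents.

```python
import os
import tempfile


def relative_files(root: str) -> set:
    """Collect every file under root as a path relative to root."""
    found = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.add(rel.replace(os.sep, "/"))
    return found


def colliding_files(root_a: str, root_b: str) -> list:
    """Return the sorted relative paths that exist in both install trees."""
    return sorted(relative_files(root_a) & relative_files(root_b))


def make_fake_install(root: str, files: list) -> None:
    """Create empty placeholder files to stand in for an installed package."""
    for rel in files:
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        # Placeholder trees: only the three shared names come from the thread.
        make_fake_install(a, ["lib/libmpi.so", "lib/only_in_a.so",
                              "include/mpi.h", "include/mpif.h"])
        make_fake_install(b, ["lib/libmpi.so", "lib/only_in_b.so",
                              "include/mpi.h", "include/mpif.h"])
        for rel in colliding_files(a, b):
            print(rel)
```

Pointing `colliding_files` at the two real install prefixes would list every file that a shared lib or include directory could not hold for both tools at once.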

@fwyzard
Contributor Author

fwyzard commented May 21, 2025

The plan is to make it possible to test with either Open MPI or MPICH, disabling one and enabling the other, and updating the mpi.xml tool accordingly.

@smuzaffar
Contributor

> The plan is to make it possible to test with either Open MPI or MPICH, disabling one and enabling

If we want to do this at the CMSSW dev area level, i.e. a developer can update mpi.xml to select one or the other, then we should add <flags SKIP_TOOL_SYMLINKS="1"/> to the mpich.xml and openmpi.xml toolfiles, so that scram does not create symlinks for libmpi.so under cmssw/external/arch/lib.
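
As an illustration only, the flag could be added to a toolfile with a small Python helper like the one below. The sample toolfile content and the function `add_skip_symlinks_flag` are hypothetical; the only part taken from this comment is the `<flags SKIP_TOOL_SYMLINKS="1"/>` element itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical toolfile content; the real mpich.xml may contain more elements.
MPICH_TOOLFILE = """<tool name="mpich" version="4.3">
  <lib name="mpi"/>
</tool>"""


def add_skip_symlinks_flag(toolfile_xml: str) -> str:
    """Return the toolfile text with <flags SKIP_TOOL_SYMLINKS="1"/> present once."""
    root = ET.fromstring(toolfile_xml)
    already_set = any(
        flags.get("SKIP_TOOL_SYMLINKS") is not None for flags in root.findall("flags")
    )
    if not already_set:
        # Append the flag as a direct child of <tool>.
        ET.SubElement(root, "flags", {"SKIP_TOOL_SYMLINKS": "1"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_skip_symlinks_flag(MPICH_TOOLFILE))
```

The helper is idempotent, so running it over both mpich.xml and openmpi.xml more than once would not duplicate the element.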

@fwyzard
Contributor Author

fwyzard commented May 21, 2025

OK, I can do that.
But then, how will the CMSSW modules find libmpi.so ?

@cmsbuild
Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46289/summary.html
COMMIT: 12a2ec5
CMSSW: CMSSW_15_1_X_2025-05-20-2300/el8_aarch64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9845/46289/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

[1614/1741] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/bin/nvcc -forward-unknown-to-host-compiler -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DAT_PER_OPERATOR_HEADERS -DCAFFE2_PERF_WITH_SVE=1 -DFLASHATTENTION_DISABLE_ALIBI -DFMT_HEADER_ONLY=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DPROTOBUF_USE_DLLS -DTORCH_CUDA_BUILD_MAIN_LIB -DTORCH_CUDA_USE_NVTX3 -DUSE_CUDA -DUSE_EXTERNAL_MZCRC -DUSE_FLASH_ATTENTION -DUSE_MEM_EFF_ATTENTION -D_FILE_OFFSET_BITS=64 -Dtorch_cuda_EXPORTS -DTORCH_ASSERT_NO_OPERATORS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0 -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/third_party/onnx -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/third_party/onnx -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/nlohmann 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/THC -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/cuda -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/../../../third_party/cutlass/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/../../../third_party/cutlass/tools/util/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/caffe2/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/.. -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/c10/cuda/../.. -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/c10/.. 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/torch/csrc/api -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/torch/csrc/api/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/protobuf/3.21.9-1126508a53768c90e66f6bf1821ac03a/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/OpenBLAS/0.3.27-70a9dd2c9f309171934f13e3003b0540/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/eigen/3bb6a48d8c171cf20b5f8e48bfb4e424fbd4f79e-5d91c922e771c0dc4f6bc00f61f3e2c5/include/eigen3 -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/INTERFACE -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/third_party/nlohmann/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/fmt/10.2.1-e35fd1db5eb3abc8ac0452e8ee427196/include -isystem /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02890/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cudnn/9.9.0.52-3661a1e101a749b5e2758400a2e4b202/include -isystem 
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/cmake/../third_party/cudnn_frontend/include -DLIBCUDACXX_ENABLE_SIMPLIFIED_COMPLEX_OPERATIONS -D_GLIBCXX_USE_CXX11_ABI=1 -Xfatbin -compress-all -DONNX_NAMESPACE=onnx_torch -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda  -Wno-deprecated-gpu-targets --expt-extended-lambda -DCUB_WRAPPED_NAMESPACE=at_cuda_detail -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -O3 -DNDEBUG -std=c++17 -Xcompiler=-fPIC -D__NEON__ -Xcompiler -Wall -Wextra -Wdeprecated -Wno-unused-parameter -Wno-missing-field-initializers -Wno-array-bounds -Wno-unknown-pragmas -Wno-strict-overflow -Wno-strict-aliasing -Wunused-function -Wunused-variable -Wunused-but-set-variable -Wno-maybe-uninitialized -MD -MT caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_bf16_sm80.cu.o -MF caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_bf16_sm80.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_bf16_sm80.cu -o 
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_bf16_sm80.cu.o
[1615/1741] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/bin/nvcc -forward-unknown-to-host-compiler -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DAT_PER_OPERATOR_HEADERS -DCAFFE2_PERF_WITH_SVE=1 -DFLASHATTENTION_DISABLE_ALIBI -DFMT_HEADER_ONLY=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DPROTOBUF_USE_DLLS -DTORCH_CUDA_BUILD_MAIN_LIB -DTORCH_CUDA_USE_NVTX3 -DUSE_CUDA -DUSE_EXTERNAL_MZCRC -DUSE_FLASH_ATTENTION -DUSE_MEM_EFF_ATTENTION -D_FILE_OFFSET_BITS=64 -Dtorch_cuda_EXPORTS -DTORCH_ASSERT_NO_OPERATORS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0 -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/third_party/onnx -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/third_party/onnx -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/nlohmann 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/THC -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/cuda -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/../../../third_party/cutlass/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/../../../third_party/cutlass/tools/util/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/caffe2/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/.. -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/c10/cuda/../.. -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/c10/.. 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/torch/csrc/api -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/torch/csrc/api/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/protobuf/3.21.9-1126508a53768c90e66f6bf1821ac03a/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/OpenBLAS/0.3.27-70a9dd2c9f309171934f13e3003b0540/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/eigen/3bb6a48d8c171cf20b5f8e48bfb4e424fbd4f79e-5d91c922e771c0dc4f6bc00f61f3e2c5/include/eigen3 -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/INTERFACE -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/third_party/nlohmann/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/fmt/10.2.1-e35fd1db5eb3abc8ac0452e8ee427196/include -isystem /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02890/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cudnn/9.9.0.52-3661a1e101a749b5e2758400a2e4b202/include -isystem 
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/cmake/../third_party/cudnn_frontend/include -DLIBCUDACXX_ENABLE_SIMPLIFIED_COMPLEX_OPERATIONS -D_GLIBCXX_USE_CXX11_ABI=1 -Xfatbin -compress-all -DONNX_NAMESPACE=onnx_torch -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda  -Wno-deprecated-gpu-targets --expt-extended-lambda -DCUB_WRAPPED_NAMESPACE=at_cuda_detail -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -O3 -DNDEBUG -std=c++17 -Xcompiler=-fPIC -D__NEON__ -Xcompiler -Wall -Wextra -Wdeprecated -Wno-unused-parameter -Wno-missing-field-initializers -Wno-array-bounds -Wno-unknown-pragmas -Wno-strict-overflow -Wno-strict-aliasing -Wunused-function -Wunused-variable -Wunused-but-set-variable -Wno-maybe-uninitialized -MD -MT caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_bf16_sm80.cu.o -MF caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_bf16_sm80.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_bf16_sm80.cu -o 
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_bf16_sm80.cu.o
[1616/1741] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/bin/nvcc -forward-unknown-to-host-compiler -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DAT_PER_OPERATOR_HEADERS -DCAFFE2_PERF_WITH_SVE=1 -DFLASHATTENTION_DISABLE_ALIBI -DFMT_HEADER_ONLY=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DPROTOBUF_USE_DLLS -DTORCH_CUDA_BUILD_MAIN_LIB -DTORCH_CUDA_USE_NVTX3 -DUSE_CUDA -DUSE_EXTERNAL_MZCRC -DUSE_FLASH_ATTENTION -DUSE_MEM_EFF_ATTENTION -D_FILE_OFFSET_BITS=64 -Dtorch_cuda_EXPORTS -DTORCH_ASSERT_NO_OPERATORS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0 -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/third_party/onnx -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/third_party/onnx -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/nlohmann 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/THC -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/cuda -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/../../../third_party/cutlass/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/../../../third_party/cutlass/tools/util/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/caffe2/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/.. -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/c10/cuda/../.. -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/c10/.. 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/torch/csrc/api -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/torch/csrc/api/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/protobuf/3.21.9-1126508a53768c90e66f6bf1821ac03a/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/OpenBLAS/0.3.27-70a9dd2c9f309171934f13e3003b0540/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/eigen/3bb6a48d8c171cf20b5f8e48bfb4e424fbd4f79e-5d91c922e771c0dc4f6bc00f61f3e2c5/include/eigen3 -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/INTERFACE -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/third_party/nlohmann/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/fmt/10.2.1-e35fd1db5eb3abc8ac0452e8ee427196/include -isystem /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02890/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cudnn/9.9.0.52-3661a1e101a749b5e2758400a2e4b202/include -isystem 
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/cmake/../third_party/cudnn_frontend/include -DLIBCUDACXX_ENABLE_SIMPLIFIED_COMPLEX_OPERATIONS -D_GLIBCXX_USE_CXX11_ABI=1 -Xfatbin -compress-all -DONNX_NAMESPACE=onnx_torch -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda  -Wno-deprecated-gpu-targets --expt-extended-lambda -DCUB_WRAPPED_NAMESPACE=at_cuda_detail -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -O3 -DNDEBUG -std=c++17 -Xcompiler=-fPIC -D__NEON__ -Xcompiler -Wall -Wextra -Wdeprecated -Wno-unused-parameter -Wno-missing-field-initializers -Wno-array-bounds -Wno-unknown-pragmas -Wno-strict-overflow -Wno-strict-aliasing -Wunused-function -Wunused-variable -Wunused-but-set-variable -Wno-maybe-uninitialized -MD -MT caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_fp16_sm80.cu.o -MF caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_fp16_sm80.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_fp16_sm80.cu -o 
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_fp16_sm80.cu.o
[1617/1741] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/bin/nvcc -forward-unknown-to-host-compiler -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DAT_PER_OPERATOR_HEADERS -DCAFFE2_PERF_WITH_SVE=1 -DFLASHATTENTION_DISABLE_ALIBI -DFMT_HEADER_ONLY=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DPROTOBUF_USE_DLLS -DTORCH_CUDA_BUILD_MAIN_LIB -DTORCH_CUDA_USE_NVTX3 -DUSE_CUDA -DUSE_EXTERNAL_MZCRC -DUSE_FLASH_ATTENTION -DUSE_MEM_EFF_ATTENTION -D_FILE_OFFSET_BITS=64 -Dtorch_cuda_EXPORTS -DTORCH_ASSERT_NO_OPERATORS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0 -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/third_party/onnx -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/third_party/onnx -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/nlohmann 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/THC -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/cuda -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/../../../third_party/cutlass/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/../../../third_party/cutlass/tools/util/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/build/caffe2/aten/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/.. -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/c10/cuda/../.. -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/c10/.. 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/torch/csrc/api -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/torch/csrc/api/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/protobuf/3.21.9-1126508a53768c90e66f6bf1821ac03a/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/OpenBLAS/0.3.27-70a9dd2c9f309171934f13e3003b0540/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/eigen/3bb6a48d8c171cf20b5f8e48bfb4e424fbd4f79e-5d91c922e771c0dc4f6bc00f61f3e2c5/include/eigen3 -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/INTERFACE -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/third_party/nlohmann/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/fmt/10.2.1-e35fd1db5eb3abc8ac0452e8ee427196/include -isystem /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02890/el8_aarch64_gcc12/external/cuda/12.9.0-cd4613335deff6e5576167f7dbef4cb0/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc12/external/cudnn/9.9.0.52-3661a1e101a749b5e2758400a2e4b202/include -isystem 
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/cmake/../third_party/cudnn_frontend/include -DLIBCUDACXX_ENABLE_SIMPLIFIED_COMPLEX_OPERATIONS -D_GLIBCXX_USE_CXX11_ABI=1 -Xfatbin -compress-all -DONNX_NAMESPACE=onnx_torch -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda  -Wno-deprecated-gpu-targets --expt-extended-lambda -DCUB_WRAPPED_NAMESPACE=at_cuda_detail -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -O3 -DNDEBUG -std=c++17 -Xcompiler=-fPIC -D__NEON__ -Xcompiler -Wall -Wextra -Wdeprecated -Wno-unused-parameter -Wno-missing-field-initializers -Wno-array-bounds -Wno-unknown-pragmas -Wno-strict-overflow -Wno-strict-aliasing -Wunused-function -Wunused-variable -Wunused-but-set-variable -Wno-maybe-uninitialized -MD -MT caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_fp16_sm80.cu.o -MF caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_fp16_sm80.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc12/external/pytorch/2.6.0-2d439982640966c9ad6d416aff8e1d8c/pytorch-2.6.0/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_fp16_sm80.cu -o 
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_fp16_sm80.cu.o
ninja: build stopped: subcommand failed.
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.HeC2sl (%build)


RPM build errors:
Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.HeC2sl (%build)



@cmsbuild
Contributor

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46287/summary.html
COMMIT: 12a2ec5
CMSSW: CMSSW_15_1_X_2025-05-20-2300/el9_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9845/46287/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46287/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46287/git-merge-result

Unit Tests

I found 1 error in the following unit tests:

---> test testFWCoreConcurrency had ERRORS

Comparison Summary

Summary:

  • You potentially added 352 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 78927 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 4038193
  • DQMHistoTests: Total failures: 541922
  • DQMHistoTests: Total nulls: 433
  • DQMHistoTests: Total successes: 3495818
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -4.628 KiB( 49 files compared)
  • DQMHistoSizes: changed ( 10224.0 ): -0.117 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 13034.0 ): 1.245 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 141.042,... ): -0.012 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 145.104 ): -0.004 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 145.5 ): -0.023 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 145.604 ): 0.055 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 17034.0 ): -1.588 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 250202.181 ): 0.426 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 25202.0 ): 0.904 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 7.3 ): -5.502 KiB SiStrip/MechanicalView
  • Checked 215 log files, 184 edm output root files, 50 DQM output files
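
As a quick sanity check on the DQMHistoTests tallies above, the sketch below confirms that the four outcome counts add up to the total number of compared histograms and derives the failure fraction. It assumes the four categories are mutually exclusive, which the report does not state explicitly; the variable names are purely illustrative.

```python
# Counts copied from the DQMHistoTests lines of the summary above.
total = 4038193
counts = {
    "failures": 541922,
    "nulls": 433,
    "successes": 3495818,
    "skipped": 20,
}

# Assumption: every compared histogram lands in exactly one category,
# so the four counts should sum to the total.
assert sum(counts.values()) == total

# Fraction of compared histograms reported as failures.
failure_fraction = counts["failures"] / total
print(f"failed: {failure_fraction:.1%}")
```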

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46288/summary.html
COMMIT: 12a2ec5
CMSSW: CMSSW_15_1_X_2025-05-20-1100/el8_amd64_gcc14
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9845/46288/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46288/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46288/git-merge-result

Comparison Summary

Summary:

  • You potentially added 1050 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 100557 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 4038193
  • DQMHistoTests: Total failures: 486208
  • DQMHistoTests: Total nulls: 451
  • DQMHistoTests: Total successes: 3551514
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 7.104 KiB( 49 files compared)
  • DQMHistoSizes: changed ( 10224.0 ): -0.054 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 13034.0 ): -0.596 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 140.045,... ): -0.004 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 141.042 ): 0.043 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 145.014 ): 0.004 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 145.408 ): -0.016 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 145.5 ): 0.008 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 145.604 ): 0.090 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 145.713 ): -0.008 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 17034.0 ): 3.286 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 250202.181 ): ...
  • Checked 215 log files, 184 edm output root files, 50 DQM output files
  • TriggerResults: found differences in 22 / 48 workflows

@smuzaffar
Contributor

smuzaffar commented May 22, 2025

> OK, I can do that. But then, how will the CMSSW modules find libmpi.so?

At link time, scram will add the -L${LIBDIR} of whichever tool is selected via the mpi tool. For run time, we can add a hook that adds $MPICH_ROOT/lib or $OPENMPI_ROOT/lib (depending on which tool is selected via mpi.xml) to LD_LIBRARY_PATH.
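
A rough sketch of what such a runtime hook would need to do, written in Python purely for illustration. The function name `mpi_runtime_env` and the `ROOT_VARS` mapping are hypothetical and do not reflect how scram actually implements runtime hooks; only `MPICH_ROOT`, `OPENMPI_ROOT` and `LD_LIBRARY_PATH` come from the comment above.

```python
# Hypothetical mapping from the tool selected in mpi.xml to the environment
# variable that holds its installation root.
ROOT_VARS = {"openmpi": "OPENMPI_ROOT", "mpich": "MPICH_ROOT"}


def mpi_runtime_env(selected_tool, env):
    """Return a copy of env with the selected MPI lib dir first in LD_LIBRARY_PATH."""
    root = env[ROOT_VARS[selected_tool]]
    lib_dir = root.rstrip("/") + "/lib"
    # Drop empty entries and any existing copy of lib_dir before prepending,
    # so the selected implementation is found first and listed only once.
    old = [p for p in env.get("LD_LIBRARY_PATH", "").split(":")
           if p and p != lib_dir]
    new_env = dict(env)
    new_env["LD_LIBRARY_PATH"] = ":".join([lib_dir] + old)
    return new_env
```

Calling `mpi_runtime_env("mpich", dict(os.environ))` (after `import os`) would then give an environment in which `libmpi.so` resolves from the MPICH installation. Note that the library directory of the non-selected implementation is not removed if it is already on the path; whether that is desirable is left open here.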

@smuzaffar
Contributor

please test for el8_aarch64_gcc12

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-51b441/46341/summary.html
COMMIT: 12a2ec5
CMSSW: CMSSW_15_1_X_2025-05-22-2300/el8_aarch64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9845/46341/install.sh to create a dev area with all the needed externals and cmssw changes.

@smuzaffar
Contributor

@fwyzard, this is ready to go in. I can update the toolfile and build rules for #9845 (comment) and #9845 (comment) in a separate PR.

Do you want to run any local tests before we merge it?

@fwyzard
Contributor Author

fwyzard commented May 26, 2025

> I can update the toolfile and build rules for #9845 (comment) and #9845 (comment) in a separate PR.

Ah, then yes, let's merge this. Thanks!

@smuzaffar
Contributor

+externals

Let's get this into the IBs.

@smuzaffar smuzaffar merged commit 55a2a47 into cms-sw:IB/CMSSW_15_1_X/master May 26, 2025
26 of 27 checks passed
@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_15_1_X/master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @mandrenguyen, @rappoccio, @sextonkennedy, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

@fwyzard fwyzard deleted the IB/CMSSW_15_1_X/master_generic_mpi branch September 14, 2025 22:20