PyTorch: fix configuration of CUDA and cuDNN [15.0.x] #9859
|
A new Pull Request was created by @fwyzard for branch IB/CMSSW_15_0_X/master. @cmsbuild, @iarspider, @smuzaffar can you please review it and eventually sign? Thanks.
|
|
cms-bot internal usage |
|
backport #9860 |
|
please test |
|
Pull request has been put on hold by @fwyzard |
force-pushed from 7014228 to 5950ff0
|
Pull request #9859 was updated. |
|
please test |
|
-1
Failed Tests: Build
Build
I found compilation error when building:
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.3.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_15_0_X_2025-05-13-2300/external/el8_amd64_gcc12/lib/libtorch_cuda.so: undefined reference to `cudnnConvolutionBiasActivationForward'
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.3.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_15_0_X_2025-05-13-2300/external/el8_amd64_gcc12/lib/libtorch_cuda.so: undefined reference to `cudnnSpatialTfGridGeneratorBackward'
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.3.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_15_0_X_2025-05-13-2300/external/el8_amd64_gcc12/lib/libtorch_cuda.so: undefined reference to `cudnnBackendSetAttribute'
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.3.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_15_0_X_2025-05-13-2300/external/el8_amd64_gcc12/lib/libtorch_cuda.so: undefined reference to `cudnnSetConvolutionMathType'
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.3.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_15_0_X_2025-05-13-2300/external/el8_amd64_gcc12/lib/libtorch_cuda.so: undefined reference to `cudnnDestroySpatialTransformerDescriptor'
collect2: error: ld returned 1 exit status
>> Deleted: tmp/el8_amd64_gcc12/src/PhysicsTools/PyTorch/test/testTorch/testTorch
gmake: *** [tmp/el8_amd64_gcc12/src/PhysicsTools/PyTorch/test/testTorch/testTorch] Error 1
>> Compiling src/PhysicsTools/PyTorch/test/testRunner.cc
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/c++ -c -DCMS_MICRO_ARCH='x86-64-v3' -DGNU_GCC -D_GNU_SOURCE -DBOOST_SPIRIT_THREADSAFE -DPHOENIX_THREADSAFE -DBOOST_MATH_DISABLE_STD_FPCLASSIFY -DBOOST_UUID_RANDOM_PROVIDER_FORCE_POSIX -DCMSSW_GIT_HASH='CMSSW_15_0_X_2025-05-13-2300' -DPROJECT_NAME='CMSSW' -DPROJECT_VERSION='CMSSW_15_0_X_2025-05-13-2300' -Isrc -Ipoison -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02889/el8_amd64_gcc12/cms/cmssw/CMSSW_15_0_X_2025-05-13-2300/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/pytorch/2.4.0-cac6549d087ed8f720bd68fc07f673de/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/pytorch/2.4.0-cac6549d087ed8f720bd68fc07f673de/include/torch/csrc/api/include -isystem/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/boost/1.80.0-f9d55d46c162407ba3033b506563d0e2/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cppunit/1.15.x-fb84a4bbf5a436317d208e3ef0864e91/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-3b4dc3892e5c8da1829fdf0bfe570820/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/zlib/1.2.13-d217cdbdd8d586e845e05946de2796be/include -O3 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++20 -ftree-vectorize -Werror=array-bounds -Werror=format-contains-nul -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -Wno-error=array-bounds -Warray-bounds -fuse-ld=bfd -march=x86-64-v3 -felide-constructors -fmessage-length=0 -Wall -Wno-non-template-friend -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-unused-parameter -Wunused -Wparentheses -Werror=return-type -Werror=missing-braces -Werror=unused-value -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare -Werror=write-strings -Werror=delete-non-virtual-dtor -Werror=strict-aliasing -Werror=narrowing -Werror=unused-but-set-variable -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Werror=return-local-addr -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -Wno-error=unused-variable -DBOOST_DISABLE_ASSERTS -flto=auto -fipa-icf -flto-odr-type-merging -fno-fat-lto-objects -Wodr -fPIC -MMD -MF tmp/el8_amd64_gcc12/src/PhysicsTools/PyTorch/test/testTorchSimpleDnn/testRunner.cc.d src/PhysicsTools/PyTorch/test/testRunner.cc -o tmp/el8_amd64_gcc12/src/PhysicsTools/PyTorch/test/testTorchSimpleDnn/testRunner.cc.o
>> Compiling src/PhysicsTools/PyTorch/test/testTorchSimpleDnn.cc |
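Every linker error in the log above is an undefined cuDNN symbol reported while linking the test against libtorch_cuda.so. As a reading aid, here is a minimal sketch of how such `ld` diagnostics could be grouped by library to see which symbols are missing. It is not part of cms-bot or of this pull request. The helper name `missing_symbols` and the shortened `/path/to/lib` prefix are hypothetical; the two symbol names are taken from the log.

```python
import re

# Matches GNU ld diagnostics of the form:
#   <object or library>: undefined reference to `symbol'
UNDEFINED_REF = re.compile(r"(\S+): undefined reference to [`'](\w+)'")


def missing_symbols(log_text):
    """Return {object or library path: sorted list of undefined symbols}."""
    found = {}
    for obj, symbol in UNDEFINED_REF.findall(log_text):
        found.setdefault(obj, set()).add(symbol)
    return {obj: sorted(symbols) for obj, symbols in found.items()}


# Shortened sample; "/path/to/lib" is a placeholder, not the real build path.
sample_log = (
    "ld.bfd: /path/to/lib/libtorch_cuda.so: undefined reference to `cudnnBackendSetAttribute'\n"
    "ld.bfd: /path/to/lib/libtorch_cuda.so: undefined reference to `cudnnSetConvolutionMathType'\n"
    "collect2: error: ld returned 1 exit status\n"
)

for library, symbols in missing_symbols(sample_log).items():
    print(library)
    for name in symbols:
        print("  ", name)
```

All the missing names carry the `cudnn` prefix, which is consistent with the cuDNN library not being on the link line for this test. That reading is an inference from the log, not something stated in the thread.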
force-pushed from 5950ff0 to dfdacc2
|
Pull request #9859 was updated. |
|
please test |
|
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a16fd0/46135/summary.html
Comparison Summary
Summary:
|
|
type bugfix |
|
please test
let's refresh tests |
|
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a16fd0/46255/summary.html
Comparison Summary
Summary:
|
|
+externals |
|
This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_15_0_X/master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @rappoccio, @antoniovilela, @sextonkennedy, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2) |
|
+1 |
Fix the configuration of CUDA and cuDNN in PyTorch and related tools.