More Eigen GPU work, with fix for CUDA 11.4 #7
CUDA 11.3.0 and later are stricter about accessing non-__device__ static constants from __device__ functions. Explicitly marking these constants with EIGEN_DEVICE_CONST makes them available to both host and device code.
A new Pull Request was created by @fwyzard (Andrea Bocci) for branch cms/master/f612df273689a19d25b45ca4f8269463207c4fee. @cmsbuild, @smuzaffar, @mrodozov, @iarspider can you please review it and eventually sign? Thanks.
@cmsbuild, please test for CMSSW_12_1_X/slc7_amd64_gcc11
@cmsbuild, please test
Mark Eigen::fix<N> as EIGEN_DEVICE_CONST so it is available in both host and device code. Mark additional methods as EIGEN_CONSTEXPR EIGEN_DEVICE_FUNC. Some of these require C++14 for relaxed constexpr.
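The C++14 requirement comes from relaxed constexpr: C++11 limits a constexpr function body to essentially a single return statement, while C++14 also allows local variables, loops, and assignments. The following is a minimal sketch of that difference, not code from Eigen. `SKETCH_DEVICE_FUNC` and `sum_to` are hypothetical names, and the macro only stands in for the kind of host/device annotation that EIGEN_DEVICE_FUNC provides; Eigen's actual definitions may differ.

```cpp
// Hypothetical stand-in for a host/device annotation macro (not Eigen's definition).
// With nvcc it adds the execution-space qualifiers; with a plain C++ compiler it is empty.
#if defined(__CUDACC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

// Needs C++14: a mutated local variable and a loop are not allowed
// in a C++11 constexpr function.
SKETCH_DEVICE_FUNC constexpr int sum_to(int n) {
  int total = 0;
  for (int i = 1; i <= n; ++i) {
    total += i;
  }
  return total;
}

// Evaluated at compile time, so the result can be used as a constant expression.
static_assert(sum_to(4) == 10, "relaxed constexpr evaluation");
```

Built with `-std=c++11`, a conforming compiler is expected to reject the body of `sum_to`; with `-std=c++14` or later the `static_assert` passes and the same function remains callable at run time.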
df554ae to 2916568
please test
Pull request #7 was updated.
please test for CMSSW_12_1_X/slc7_amd64_gcc11
-1
Failed Tests: Build
Build: I found compilation error when building:
>> Cuda Device Link tmp/slc7_amd64_gcc11/src/HeterogeneousCore/CUDAUtilities/test/gpuPrefixScan_t/gpuPrefixScan_t_cudadlink.o
>> Building binary gpuPrefixScan_t
Copying tmp/slc7_amd64_gcc11/src/HeterogeneousCore/CUDAUtilities/test/gpuPrefixScan_t/gpuPrefixScan_t to productstore area:
>> Compiling /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/HeterogeneousCore/CUDAUtilities/test/radixSort_t.cu
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/HeterogeneousCore/CUDAUtilities/test/radixSort_t.cu: In function 'void go(bool) [with T = float; int NS = 2; U = float; LL = double]':
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/HeterogeneousCore/CUDAUtilities/test/radixSort_t.cu:145:15: error: array subscript 'long unsigned int[0]' is partly outside array bounds of 'float [1]' [-Werror=array-bounds]
145 | *k = (*k >> sh) << sh;
| ~~~~^~~~~~~~
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_1_X_2021-08-16-1100/src/HeterogeneousCore/CUDAUtilities/test/radixSort_t.cu:139:6: note: while referencing 'k1'
139 | auto k1 = a[ind[j]];
| ^
Looks like cms-sw/cmssw#34918 is fixed.
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ddb31b/17842/summary.html
please test with cms-sw/cmssw#34929 for CMSSW_12_1_X/slc7_amd64_gcc11
Looks like cms-sw/cmssw#34918 and cms-sw/cmssw#34917 are both fixed.
please test
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ddb31b/17861/summary.html
+externals
2b04365 into cms-externals:cms/master/f612df273689a19d25b45ca4f8269463207c4fee
This pull request is fully signed and it will be integrated in one of the next cms/master/f612df273689a19d25b45ca4f8269463207c4fee IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)