Conversation

@makortel makortel commented Dec 4, 2018

The CachingHostAllocator uses an associated CUDA stream and device for the "asynchronous free", in order to support the creation and transfer of the "me" pointer in the data formats, e.g.

auto view = cs->make_host_unique<DeviceConstView>(stream);
view->xx_ = xx_d.get();
view->yy_ = yy_d.get();
view->adc_ = adc_d.get();
view->moduleInd_ = moduleInd_d.get();
view_d = cs->make_device_unique<DeviceConstView>(stream);
cudaCheck(cudaMemcpyAsync(view_d.get(), view.get(), sizeof(DeviceConstView), cudaMemcpyDefault, stream.id()));

The implementation, however, missed one detail regarding multiple GPUs: when a previously-cached memory block is claimed, the current device may differ from the device of the previous allocation, and in that case the CUDA event used for the "asynchronous free" must be re-created for the new device.

This PR fixes that behavior, and should fix the crashes reported in #208 (comment).
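Below is a minimal sketch of the event re-creation step described above, not the actual CachingHostAllocator code; CachedBlock, reuseBlock() and cudaCheckSketch() are hypothetical names used only for illustration. The point is that a CUDA event is tied to the device that was current when it was created, so when a cached block is reused while a different device is current, the old event has to be destroyed and a new one created on the now-current device.

#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

// Minimal error-checking helper, standing in for cudaCheck().
inline void cudaCheckSketch(cudaError_t result) {
  if (result != cudaSuccess) {
    std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(result));
    std::abort();
  }
}

// Hypothetical cache entry: a cached allocation plus the device and event
// associated with its previous use.
struct CachedBlock {
  void* ptr = nullptr;
  int device = -1;                   // device that was current at the previous allocation
  cudaEvent_t readyEvent = nullptr;  // recorded on "free", queried before reuse
};

// Claim a previously-cached block for use with the currently-set device.
inline void reuseBlock(CachedBlock& block) {
  int currentDevice = -1;
  cudaCheckSketch(cudaGetDevice(&currentDevice));

  if (block.device != currentDevice) {
    // The event belongs to the device of the previous allocation: destroy it
    // there, then re-create it on the device that is current now.
    cudaCheckSketch(cudaSetDevice(block.device));
    cudaCheckSketch(cudaEventDestroy(block.readyEvent));

    cudaCheckSketch(cudaSetDevice(currentDevice));
    cudaCheckSketch(cudaEventCreateWithFlags(&block.readyEvent, cudaEventDisableTiming));

    block.device = currentDevice;
  }
}

The real allocator of course also tracks the stream and the queue state of each block; the sketch isolates only the device/event bookkeeping that this PR addresses.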

@fwyzard fwyzard merged commit 4684349 into cms-patatrack:CMSSW_10_4_X_Patatrack Dec 7, 2018
@fwyzard fwyzard added this to the CMSSW_10_4_0_pre3_Patatrack milestone Dec 7, 2018