Conversation

@fbusato
Contributor

@fbusato fbusato commented Nov 22, 2025

Description

Replace old code in host/device/managed mdspan with https://nvidia.github.io/cccl/libcudacxx/extended_api/memory/is_pointer_accessible.html

@fbusato fbusato self-assigned this Nov 22, 2025
@fbusato fbusato added the 3.2.0 Targeted for 3.2.0 release label Nov 22, 2025
@fbusato fbusato requested a review from a team as a code owner November 22, 2025 02:03
@fbusato fbusato added this to CCCL Nov 22, 2025
@fbusato fbusato requested a review from wmaxey November 22, 2025 02:03
@github-project-automation github-project-automation bot moved this to Todo in CCCL Nov 22, 2025
@fbusato fbusato changed the title from "Applies `is_pointer_accessible to mdspan" to "Applies is_pointer_accessible to mdspan" Nov 22, 2025
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Nov 22, 2025
@github-project-automation github-project-automation bot moved this from In Review to In Progress in CCCL Nov 22, 2025
@fbusato fbusato moved this from In Progress to In Review in CCCL Nov 24, 2025
@fbusato fbusato requested a review from davebayer November 24, 2025 19:23
@fbusato
Contributor Author

fbusato commented Nov 24, 2025

@davebayer @miscco I'm starting to think that we are doing the wrong thing for host/device/managed.
There are two problems:

  1. We cannot check pointer validity with respect to the memory space on EVERY access. That involves multiple driver calls, which is too expensive even in debug mode.
  2. We cannot check device accessibility at access time on the host, because accessing through a device accessor is always wrong in host code.

Ideally, mdspan itself should check the pointer at construction, not on every access. Two potential solutions:

  • cuda::std::mdspan checks __detectably_invalid.
  • Modify the host/device/managed_mdspan constructors to check the memory space.
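
A minimal sketch of the construction-time approach, written in plain C++ so it runs without a GPU. `checked_view`, `mock_is_host_accessible`, and `g_query_count` are hypothetical names, not CCCL APIs. The mock only counts calls to stand in for the expensive driver query, and the exception stands in for whatever assertion mechanism the library would actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the expensive driver query; it only counts calls.
int g_query_count = 0;

bool mock_is_host_accessible(const void* p)
{
  ++g_query_count;
  return p != nullptr; // assumption: a null pointer models "not accessible"
}

// Hypothetical view type (not a CCCL API): validates once, at construction.
template <class T>
class checked_view
{
public:
  checked_view(T* data, std::size_t size)
      : data_(data)
      , size_(size)
  {
    if (!mock_is_host_accessible(data_))
    {
      throw std::invalid_argument("pointer is not host accessible");
    }
  }

  // Element access issues no accessibility query.
  T& operator[](std::size_t i) const
  {
    return data_[i];
  }

  std::size_t size() const
  {
    return size_;
  }

private:
  T* data_;
  std::size_t size_;
};
```

With this shape, the query cost is paid once per view rather than once per element access, which is the trade-off argued for above.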

Comment on lines +244 to +245
static const auto __dev_id = static_cast<int>(::cuda::__driver::__ctxGetDevice());
return ::cuda::is_device_accessible(::cuda::std::to_address(__p), ::cuda::device_ref{__dev_id});
Contributor

This is not correct: we don't want to check against the current device. We should just check whether the pointer is accessible from any device here.

Contributor Author

You suggested adding device_ref at the time. It makes sense, but it also makes sense to have a relaxed version without it, as I initially proposed.

Contributor Author

created #6918

@github-project-automation github-project-automation bot moved this from In Review to In Progress in CCCL Dec 8, 2025
@github-actions
Contributor

github-actions bot commented Dec 8, 2025

😬 CI Workflow Results

🟥 Finished in 2h 01m: Pass: 92%/90 | Total: 2d 02h | Max: 2h 01m | Hits: 84%/197383

See results here.

Comment on lines +199 to 200
NV_IF_TARGET(NV_IS_DEVICE, (_CCCL_VERIFY(false, "cuda::__host_accessor cannot be used in DEVICE code");))
return _Accessor::access(__p, __i);
Contributor

Why are we dropping the check here? That seems like a regression.

Contributor Author

Yes, this is on purpose. We don't want to call a driver API on every mdspan access; that would be overkill.

Contributor

Suggestion: there are different ways to store that information during construction. We could either make it a boolean that is set during construction or turn the pointer into a tagged pointer, although the latter would be much more involved.

Contributor Author

I don't think we need to store any extra information. Checking the pointer validity during mdspan (not accessor) construction is enough.

{
if constexpr (::cuda::std::__has_detect_invalidity<accessor_type>)
{
const auto __tmp = mapping(); // workaround for clang with nodiscard
Contributor

Above you are declaring this [[maybe_unused]]. I believe we should do that in all places.

Contributor Author

Yes, [[maybe_unused]] is correct because the variable is not used in release mode. The code is terrible for such a trivial check, but I don't have a better workaround for the clang warning.
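
For context, a small sketch of the pattern in plain C++. `required_size` and `check_pointer` are hypothetical stand-ins (the real code calls `mapping()`), and `assert` stands in for the library's debug-only check. The `[[nodiscard]]` result has to be stored in a named variable, and that variable is only read inside a check that disappears when `NDEBUG` is defined, so it needs the attribute to avoid an unused-variable warning.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical [[nodiscard]] getter standing in for mdspan::mapping().
[[nodiscard]] std::size_t required_size()
{
  return 4;
}

// Hypothetical helper: the debug-only check is the sole reader of `size`.
bool check_pointer(const int* p)
{
  // With NDEBUG defined, assert() expands to nothing and `size` is never read;
  // [[maybe_unused]] keeps that configuration free of unused-variable warnings.
  [[maybe_unused]] const auto size = required_size();
  assert(size > 0);
  return p != nullptr;
}
```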

Labels

3.2.0 Targeted for 3.2.0 release

Projects

Status: In Progress

4 participants