Hi, I was reviewing the CVE-2025-23359 security bulletin and noticed that the vulnerability does not affect CDI mode. While this is reassuring, I'd like to ask for clarification on how CUDA Forward Compatibility is handled in CDI mode, particularly for containers built with newer CUDA Toolkits running on nodes with older NVIDIA Linux GPU drivers.
After inspecting /etc/cdi/nvidia.yaml, I see that nvidia-cdi-hook injects the host directory containing libcuda (e.g., /usr/lib64) into the container's /etc/ld.so.conf.d/00-nvcr-<RANDOM_STRING>.conf. However, I'm uncertain how this ensures compatibility for applications that rely on CUDA Forward Compatibility (e.g., binding the /usr/local/cuda/compat libraries). For example, if a container built with CUDA 12.2 (which requires driver ≥ 535) runs on a host with driver 525, I don't see any mechanism in the CDI spec that automatically includes the compatibility libraries.
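To make the mechanism concrete, the generated spec contains entries roughly like the following (a trimmed, illustrative sketch; the exact mounts, hook arguments, driver version, and the random suffix of the ld.so.conf.d file are host-specific):

```yaml
# Trimmed, illustrative sketch of a generated /etc/cdi/nvidia.yaml.
# The update-ldcache hook adds the host library folder to the container's
# ld.so.conf.d (as 00-nvcr-<RANDOM_STRING>.conf) and refreshes the linker cache.
cdiVersion: "0.5.0"
kind: nvidia.com/gpu
containerEdits:
  mounts:
    - hostPath: /usr/lib64/libcuda.so.525.147.05      # hypothetical host driver version
      containerPath: /usr/lib64/libcuda.so.525.147.05
      options: ["ro", "nosuid", "nodev", "bind"]
  hooks:
    - hookName: createContainer
      path: /usr/bin/nvidia-cdi-hook
      args:
        - nvidia-cdi-hook
        - update-ldcache
        - --folder
        - /usr/lib64                                   # host directory containing libcuda
```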
I also came across PR #906, which introduced nvidia-cdi-hook compat-libs --driver-version 999.88.77 to address Forward Compatibility. This makes me wonder:
Before #906: Was CDI mode inherently unable to support CUDA Forward Compatibility due to missing library bindings?
After #906: Does enabling compatibility now require manual configuration (e.g., specifying --driver-version), or is this handled automatically in CDI spec generation?
Without the changes in #906, CDI mode does not support forward compatibility. Once #906 has been merged and is generally available, no user input should be required; the CDI spec generation should take care of injecting the correct --driver-version for the hook.
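For illustration, the generated spec would then carry a hook entry roughly along these lines (a sketch based on the hook invocation quoted above from the thread; the placeholder version is what spec generation replaces with the detected host driver version, and the final hook name/arguments are whatever ships in the release):

```yaml
# Illustrative sketch only; hook name and flags as described for #906 in this thread.
containerEdits:
  hooks:
    - hookName: createContainer
      path: /usr/bin/nvidia-cdi-hook
      args:
        - nvidia-cdi-hook
        - compat-libs
        - --driver-version
        - "999.88.77"   # placeholder; replaced by the detected host driver version at spec generation
```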
Thanks for clarifying! We're observing CUDA Forward Compatibility issues on v1.17.4 and plan to adopt the release containing the #906 changes. Could you share the expected timeline for the next release? This will help us schedule the upgrade smoothly.