
nvidia-ctk doesn't respect spec-dir override #939

Open
antverpp opened this issue Feb 25, 2025 · 1 comment

Comments

@antverpp

$ nvidia-ctk --version
NVIDIA Container Toolkit CLI version 1.17.4
commit: 9b69590

$ nvidia-ctk config
disable-require = false
supported-driver-capabilities = "compat32,compute,display,graphics,ngx,utility,video"
[nvidia-container-cli]
environment = []
ldconfig = "@/sbin/ldconfig"
load-kmods = true
no-cgroups = true
[nvidia-container-runtime]
debug = "/app/home/podman/.local/nvidia-container-runtime.log"
log-level = "info"
mode = "auto"
runtimes = ["docker-runc", "runc", "crun"]
[nvidia-container-runtime.modes]
[nvidia-container-runtime.modes.cdi]
annotation-prefixes = ["cdi.k8s.io/"]
default-kind = "nvidia.com/gpu"
spec-dirs = ["/etc/cdi", "/var/run/cdi", "/app/home/podman/cdi"]
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
[nvidia-container-runtime-hook]
path = "nvidia-container-runtime-hook"
skip-mode-detection = false
[nvidia-ctk]
path = "nvidia-ctk"
$ nvidia-ctk cdi list --spec-dir "/app/home/podman/cdi"
INFO[0000] Found 5 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=1
nvidia.com/gpu=GPU-a69be7f2-776f-034f-d202-ad700be58eac
nvidia.com/gpu=GPU-b6705994-caa0-dad7-095d-554686d20f12
nvidia.com/gpu=all
$ nvidia-ctk cdi list
INFO[0000] Found 0 CDI devices

Problem: running as a non-privileged user, I run nvidia-ctk cdi generate and put the resulting nvidia.yml into a custom directory.
I point the XDG_CONFIG_HOME environment variable at my home directory and create an nvidia-container-runtime directory there, containing a config.toml that redefines spec-dirs with custom values.
When I run nvidia-ctk config, I can see that the configuration is read correctly from my custom config file.
So I put nvidia.yml into /app/home/podman/cdi.
Yet nvidia-ctk cdi list shows nothing.
It does show the devices when I pass the --spec-dir flag, so the spec directory itself definitely works.
I expect nvidia-ctk cdi list to show the devices without the extra flag, since I redefine spec-dirs in the config.
What else am I missing here?
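For reference, the override described above amounts to roughly the following config fragment, assuming the file lives at $XDG_CONFIG_HOME/nvidia-container-runtime/config.toml (paths and values mirror the nvidia-ctk config output shown earlier):

```toml
# $XDG_CONFIG_HOME/nvidia-container-runtime/config.toml
# Only the relevant section is shown; the custom directory
# /app/home/podman/cdi is appended after the default spec dirs.
[nvidia-container-runtime.modes.cdi]
spec-dirs = ["/etc/cdi", "/var/run/cdi", "/app/home/podman/cdi"]
```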

@elezar
Member

elezar commented Feb 25, 2025

Most nvidia-ctk commands do not currently load settings from the config.toml file. The one exception is nvidia-ctk config which processes the same config file that the nvidia-container-runtime would.

We can look into whether it makes sense to change this behaviour.
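In the meantime, the custom directory has to be passed explicitly on each invocation. A minimal sketch of the workaround, using the paths from the report above (the output filename is illustrative; adjust as needed):

```shell
# Generate the CDI spec into the custom directory as a non-privileged user.
nvidia-ctk cdi generate --output=/app/home/podman/cdi/nvidia.yml

# nvidia-ctk cdi list does not read spec-dirs from config.toml,
# so the directory must be supplied via the --spec-dir flag:
nvidia-ctk cdi list --spec-dir /app/home/podman/cdi
```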
