Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow custom NVIDIA CTK path #72

Merged
merged 1 commit into from
Feb 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions cmd/nvidia-dra-plugin/cdi.go
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,7 @@ func NewCDIHandler(opts ...cdiOption) (*CDIHandler, error) {
nvcdi.WithMode("nvml"),
nvcdi.WithVendor(h.vendor),
nvcdi.WithClass(h.class),
nvcdi.WithNVIDIACTKPath(h.nvidiaCTKPath),
)
if err != nil {
return nil, fmt.Errorf("unable to create CDI library: %w", err)
Expand Down
2 changes: 2 additions & 0 deletions deployments/helm/k8s-dra-driver/templates/kubeletplugin.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,8 @@ spec:
resources:
{{- toYaml .Values.kubeletPlugin.containers.plugin.resources | nindent 10 }}
env:
- name: NVIDIA_CTK_PATH
value: "{{ .Values.nvidiaCtkPath }}"
Comment on lines +79 to +80
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also require us to set a default in values.yaml?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a reason that this should be set at the root level of the values.yaml rather than specific to the kubelet plugin (same goes for .Values.nvidiaDriverRoot for that matter)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On:

Does this also require us to set a default in values.yaml?
The nvcdi package will default to /usr/bin/nvidia-ctk if this is empty, but I would assume that an explicit entry in values.yaml would be better.

Also, is there a reason that this should be set at the root level of the values.yaml rather than specific to the kubelet plugin (same goes for .Values.nvidiaDriverRoot for that matter)?

I think this depends on the API that the values file represents. I suppose from my perspective, the current options in the kubeletplugin section seem related to that pod specifically, whereas the driver root and this path are not specific to the plugin. Yes, they're currently only used there, but if another component is added here that requires these values where would be specify them then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also require us to set a default in values.yaml?

Good point. Fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a reason that this should be set at the root level of the values.yaml rather than specific to the kubelet plugin (same goes for .Values.nvidiaDriverRoot for that matter)?

Hmm, I can fix that. Where do they should go, in your opinion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@empovit I think this is probably more of a question for us internally than a serious issue with the PR.

@klueska do you want to block this PR on defining some other top-level structure to define these kinds of properties (driver, host?) or are you ok with leaving this as a top-level value for the time being?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think it's fine. Just an observation (but one that has implications on the user-facing API, so I want us to be conscious of what we are doing).

- name: NVIDIA_DRIVER_ROOT
value: "{{ .Values.nvidiaDriverRoot }}"
- name: NVIDIA_VISIBLE_DEVICES
Expand Down
5 changes: 5 additions & 0 deletions deployments/helm/k8s-dra-driver/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,11 @@
# For driver installed directly on a host, a value of `/` is used.
nvidiaDriverRoot: /

# Specify the path of CTK binary (nvidia-ctk) on the host,
# as it should appear in the the generated CDI specification.
# The path depends on the system that runs on the node.
nvidiaCtkPath: /usr/bin/nvidia-ctk

nameOverride: ""
fullnameOverride: ""
namespaceOverride: ""
Expand Down