Update and clean up the quickstart README file #112

Open
wants to merge 1 commit into
base: main
61 changes: 61 additions & 0 deletions demo/specs/mig/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
#### Show the current MIG configuration of the machine
```console
nvidia-smi --query-gpu=index,name,uuid,mig.mode.current --format=csv
nvidia-smi -L
```
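
To check the result in a script rather than by eye, the CSV output can be filtered for GPUs whose MIG mode is enabled. This is a minimal sketch, assuming the usual `nvidia-smi` CSV layout (one header row, then comma-separated fields in the order requested above); `mig_enabled_gpus` is a hypothetical helper name and not part of the demo.

```shell
# Hypothetical helper: reads `nvidia-smi --format=csv` output on stdin and
# prints the index of every GPU whose mig.mode.current field is "Enabled".
# Assumes the column order index,name,uuid,mig.mode.current used above.
mig_enabled_gpus() {
  # NR > 1 skips the CSV header row; fields are separated by a comma
  # followed by optional spaces.
  awk -F', *' 'NR > 1 && $4 == "Enabled" { print $1 }'
}

# Usage on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=index,name,uuid,mig.mode.current --format=csv | mig_enabled_gpus
```

An empty result would suggest that no GPU is in MIG mode, in which case the MIG examples below are not expected to schedule.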

#### Show current state of the cluster
```console
kubectl get pod -A
```

#### Show the yaml files for MIG example apps:
```console
vim -O gpu-test4.yaml gpu-test5.yaml gpu-test6.yaml
```

#### Deploy the 3 MIG example apps above:
```console
kubectl apply --filename=gpu-test{4,5,6}.yaml
```
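
The `{4,5,6}` syntax relies on brace expansion, which shells such as bash and zsh perform before `kubectl` runs, so the command receives three separate `--filename` arguments. In a shell without brace expansion (a strictly POSIX `sh`, for example) an explicit loop is one possible equivalent. The sketch below assumes the three file names used above; `mig_example_args` is a hypothetical helper name.

```shell
# Hypothetical helper: prints one --filename argument per MIG example,
# matching what bash expands --filename=gpu-test{4,5,6}.yaml into.
mig_example_args() {
  for n in 4 5 6; do
    printf -- '--filename=gpu-test%s.yaml\n' "$n"
  done
}

# Usage (word splitting is intentional; the file names contain no spaces):
#   kubectl apply $(mig_example_args)
```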

#### Show all the pods starting up:
```console
kubectl get pod -A -l app=pod
```

#### Show the output of nvidia-smi:
```console
nvidia-smi -L
```

#### Show the MIG devices allocated to each pod in gpu-test4
```console
for pod in \
    $(kubectl get pod \
        -n gpu-test4 \
        --output=jsonpath='{.items[*].metadata.name}'); \
do \
    echo "${pod}:"
    kubectl logs -n gpu-test4 ${pod} -c ctr0
    kubectl logs -n gpu-test4 ${pod} -c ctr1
    kubectl logs -n gpu-test4 ${pod} -c ctr2
    kubectl logs -n gpu-test4 ${pod} -c ctr3
    echo ""
done
```
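
The loop above hardcodes the container names `ctr0` through `ctr3`. If the pod spec changes, the container names can instead be read from each pod. The following is a sketch under stated assumptions, not the demo's own script: `show_pod_logs` and the `KUBECTL` override variable are hypothetical names, and only regular containers (`.spec.containers`) are listed.

```shell
# KUBECTL can be overridden (e.g. for testing); defaults to the real CLI.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: print the logs of every container of every pod in
# the given namespace. The body runs in a subshell, "( ... )", so the loop
# variables do not leak into the caller's shell.
show_pod_logs() (
  ns="$1"
  for pod in $("$KUBECTL" get pod -n "$ns" \
      --output=jsonpath='{.items[*].metadata.name}'); do
    echo "${pod}:"
    # Discover the container names instead of assuming ctr0..ctr3.
    for ctr in $("$KUBECTL" get pod "$pod" -n "$ns" \
        --output=jsonpath='{.spec.containers[*].name}'); do
      "$KUBECTL" logs -n "$ns" "$pod" -c "$ctr"
    done
    echo ""
  done
)

# Usage:
#   show_pod_logs gpu-test4
```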

#### Delete these MIG examples:
```console
kubectl delete --filename=gpu-test{4,5,6}.yaml
```

#### Show the pods terminating:
```console
kubectl get pods -A -l app=pod
```

#### Show the output of nvidia-smi
```console
nvidia-smi -L
```
File renamed without changes.
File renamed without changes.
File renamed without changes.
56 changes: 10 additions & 46 deletions demo/specs/quickstart/README.md
Original file line number Diff line number Diff line change
@@ -3,76 +3,40 @@
kubectl get pod -A
```

#### Show the current MIG configuration of the machine
#### Show the yaml files for the first 3 example apps discussed in the [KubeCon presentation](https://sched.co/1R2oG)
```console
nvidia-smi --query-gpu=index,name,uuid,mig.mode.current --format=csv
nvidia-smi -L
vim -O gpu-test1.yaml gpu-test2.yaml gpu-test3.yaml
```

#### Deploy the 4 example apps discussed in the slides
#### Deploy the 3 example apps above
```console
kubectl apply --filename=gpu-test{1,2,3,4}.yaml
kubectl apply --filename=gpu-test{1,2,3}.yaml
Comment on lines -14 to +13
Collaborator

Interesting -- I thought I had purposefully removed demo 4 from this README for exactly this reason. Maybe I only did it here, but not in our actual driver repo: https://github.com/kubernetes-sigs/dra-example-driver

Collaborator

Ah -- I see now. This is the file down in the quickstart folder -- yeah, this was just copied from my demo script when I presented this.

Collaborator Author

It seems to me that test-4 (MIG demo) is different from the other three. We also describe test-4 separately in detail later. I think the descriptions would be clearer if we separated them.

Collaborator

Makes sense to update this to only include examples that can be run on any GPU and separate out the MIG use cases to another folder.

Collaborator Author

Agreed, test-5/6 are MIG examples too. We didn't even describe them in the README. Let me update the PR.

Collaborator Author

@yuanchen8911 May 7, 2024

Created a new folder, mig, moved the MIG examples gpu-test4/5/6 into it, and updated the README files accordingly.

```

#### Show all the pods starting up
```console
kubectl get pod -A
```

#### Show the yaml files for the first 3 example apps
```console
vim -O gpu-test1.yaml gpu-test2.yaml gpu-test3.yaml
```

#### Show the GPUs allocated to each
```console
kubectl logs -n gpu-test1 -l app=pod
kubectl logs -n gpu-test2 pod --all-containers
kubectl logs -n gpu-test3 -l app=pod
```

#### Show the yaml file for the complicated example with MIG devices
```console
vim -O gpu-test4.yaml
```

#### Show the pods running
```console
kubectl get pod -A
```

#### Show the output of nvidia-smi
```console
nvidia-smi -L
```
#### Show the MPS (Multi-Process Service) example

#### Show the MIG devices allocated to each pod
```console
for pod in \
$(kubectl get pod \
-n gpu-test4 \
--output=jsonpath='{.items[*].metadata.name}'); \
do \
echo "${pod}:"
kubectl logs -n gpu-test4 ${pod} -c ctr0
kubectl logs -n gpu-test4 ${pod} -c ctr1
kubectl logs -n gpu-test4 ${pod} -c ctr2
kubectl logs -n gpu-test4 ${pod} -c ctr3
echo ""
done
vim -O gpu-test-mps.yaml
```

#### Delete this example
#### Deploy the MPS example
```console
kubectl delete -f gpu-test4.yaml
```

#### Show the pods terminating
```console
kubectl get pod -A
kubectl apply -f gpu-test-mps.yaml
```

#### Show the output of nvidia-smi
#### Show the pod running
```console
nvidia-smi -L
kubectl get pod -n sharing-demo -l app=pod
```
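
Instead of re-running `kubectl get pod` until the pod shows as running, `kubectl wait` can block until the pod reports Ready. This is a hedged sketch: `wait_for_mps_pod` and the `KUBECTL` override are hypothetical names, the namespace and label come from the command above, and the 120s default timeout is an assumption rather than a value from the demo.

```shell
# KUBECTL can be overridden (e.g. for testing); defaults to the real CLI.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: block until every pod labelled app=pod in the
# sharing-demo namespace reports Ready, or give up after the timeout.
# The optional first argument overrides the assumed 120s default.
wait_for_mps_pod() {
  "$KUBECTL" wait --for=condition=Ready pod \
    -n sharing-demo -l app=pod --timeout="${1:-120s}"
}

# Usage:
#   wait_for_mps_pod        # wait up to 120s
#   wait_for_mps_pod 30s    # wait up to 30s
```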