This repository has been archived by the owner on Nov 2, 2021. It is now read-only.
Hi, I'm trying to set up GPU monitoring via Grafana/Prometheus. I have a standalone server with two GPUs and use dcgm-exporter in a Docker container as the metrics exporter. I start the container in privileged mode with:

docker run -d -e --priveleged -v /home/dockeradm/nvidia-smi-exporter/default-counters.csv:/etc/dcgm-exporter/default-counters.csv -p 9400:9400 nvcr.io/nvidia/k8s/dcgm-exporter:2.0.13-2.1.2-ubuntu18.04

The exporter sees my GPUs, but it can't detect GPU processes or GPU memory usage.
Here is the output of the nvidia-smi utility on the host:
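A note on that command: because `--priveleged` (misspelled) immediately follows `-e`, Docker most likely parses it as an environment-variable name rather than as the `--privileged` flag, so the container probably never actually runs privileged. A hedged sketch of a corrected invocation, per the dcgm-exporter documentation (which calls for `--cap-add SYS_ADMIN` for profiling metrics; `--gpus all` assumes the NVIDIA Container Toolkit is installed):

```shell
# Sketch only: --cap-add SYS_ADMIN grants the capability dcgm-exporter
# needs for profiling metrics; --gpus all exposes the GPUs (assumes
# the NVIDIA Container Toolkit is set up on the host).
docker run -d --gpus all --cap-add SYS_ADMIN \
  -v /home/dockeradm/nvidia-smi-exporter/default-counters.csv:/etc/dcgm-exporter/default-counters.csv \
  -p 9400:9400 \
  nvcr.io/nvidia/k8s/dcgm-exporter:2.0.13-2.1.2-ubuntu18.04
```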
]$ nvidia-smi
Mon Aug 23 23:03:29 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00 Driver Version: 455.32.00 CUDA Version: 11.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-PCIE... Off | 00000000:37:00.0 Off | 0 |
| N/A 60C P0 42W / 250W | 1393MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-PCIE... Off | 00000000:86:00.0 Off | 0 |
| N/A 64C P0 47W / 250W | 10095MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 17748 C ...189c/arasov/bin/python3.7 0MiB |
| 0 N/A N/A 53799 C ...189c/arasov/bin/python3.7 1389MiB |
| 1 N/A N/A 17748 C ...189c/arasov/bin/python3.7 10091MiB |
| 1 N/A N/A 53799 C ...189c/arasov/bin/python3.7 0MiB |
+-----------------------------------------------------------------------------+
And here is the output of nvidia-smi inside the container:
root@ccdc999ac0bd:/# nvidia-smi
Mon Aug 23 19:25:22 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00 Driver Version: 455.32.00 CUDA Version: 11.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-PCIE... Off | 00000000:37:00.0 Off | 0 |
| N/A 59C P0 41W / 250W | 1393MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-PCIE... Off | 00000000:86:00.0 Off | 0 |
| N/A 62C P0 46W / 250W | 10095MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
Am I missing something or doing something wrong? How should I configure the container so that it detects GPU processes and GPU memory usage?
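For context, one way to check what the exporter is actually emitting is to query its metrics endpoint directly. `DCGM_FI_DEV_FB_USED` is the standard DCGM field for framebuffer memory used; if it doesn't appear, it either isn't listed in the mounted counters CSV or the exporter can't read it:

```shell
# Assumes the exporter is published on localhost:9400 as in the
# docker run command above; greps for the framebuffer-used counter.
curl -s localhost:9400/metrics | grep DCGM_FI_DEV_FB_USED
```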