Commit dcd175f

jdeamicis authored and enrico-usai committed
Add Changelog entry for NVIDIA GPU scontrol reboot bug
Signed-off-by: Jacopo De Amicis <[email protected]>
1 parent 941259d commit dcd175f

1 file changed, +2 -1 lines changed
CHANGELOG.md

Lines changed: 2 additions & 1 deletion
@@ -17,7 +17,7 @@ This file is used to list changes made in each version of the AWS ParallelCluste
 - Enforce the DCV Authenticator Server to use at least `TLS-1.2` protocol when creating the SSL Socket.
 - Load kernel module [nvidia-uvm](https://developer.nvidia.com/blog/unified-memory-cuda-beginners/) by default to provide Unified Virtual Memory (UVM) functionality to the CUDA driver.
 - Install [NVIDIA Persistence Daemon](https://docs.nvidia.com/deploy/driver-persistence/index.html) as a system service.
-- Install [NVIDIA Data Center GPU Manager (DCGM)](https://developer.nvidia.com/dcgm) package on all supported OSes except for aarch64 `centos7` and `alinux2`.
+- Install [NVIDIA Data Center GPU Manager (DCGM)](https://developer.nvidia.com/dcgm) package on all supported OSes except for aarch64 `centos7` and `alinux2`.
 
 **CHANGES**
 - Upgrade Slurm to version 23.02.2.
@@ -51,6 +51,7 @@ This file is used to list changes made in each version of the AWS ParallelCluste
 - Fix an issue that was causing misalignment of compute nodes IP on instances with multiple network interfaces.
 - Fix replacement of `StoragePass` in `slurm_parallelcluster_slurmdbd.conf` when a queue parameter update is performed and the Slurm accounting configurations are not updated.
 - Fix issue causing `cfn-hup` daemon to fail when it gets restarted.
+- Fix issue causing NVIDIA GPU compute nodes not to resume correctly after executing an `scontrol reboot` command.
 
 3.5.1
 ------