Hi,

Right now, we haven't finished the implementation for mapping one rank to multiple GPUs. A PR in the near future will likely fix this issue.

OpenSn relies on SLURM's GPU assignment.

For multi-GPU usage, you must run with multi-threading. In your case, I presume that there are 128 CPU cores and 8 GPUs per node, and any CPU in the node can access any GPU in the same node. This averages to 16 (= 128/8) CPU cores per GPU. You can try running the code with:

srun -N 1 -c 16 --gpus-per-task=1 -n 8 python3 .... # for 1 node
srun -N 2 -c 16 --gpus-per-task=1 -n 16 python3 .... # for 2 nodes
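To check what binding SLURM actually applied, each task can print its rank and the GPUs it sees. A minimal sketch, assuming SLURM exports the standard `SLURM_PROCID` and `CUDA_VISIBLE_DEVICES` variables (outside a SLURM job they are unset, so placeholders are printed instead):

```shell
# Print this task's SLURM rank and the GPUs visible to it.
# Falls back to "unknown"/"none" when run outside a SLURM allocation.
echo "rank ${SLURM_PROCID:-unknown}: CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-none}"
```

Launched under `srun -n 8 --gpus-per-task=1`, each of the 8 tasks should report a distinct GPU.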

Or with sbatch:

#SBATCH --nodes=1
#SBATCH --gpus-per-task=1
#SBATCH --exclusive

srun -n 8 -c 16 python ...
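Putting those directives together, a complete submission script might look like the following sketch (the task counts come from the discussion above; the shebang, `--ntasks-per-node`, `--cpus-per-task` directives, and the script name `run_opensn.py` are assumptions added for illustration):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8     # one task per GPU (assumed 8 GPUs/node)
#SBATCH --cpus-per-task=16      # 128 cores / 8 GPUs
#SBATCH --gpus-per-task=1
#SBATCH --exclusive

# SLURM binds each of the 8 tasks to 16 cores and 1 GPU.
srun -n 8 -c 16 python3 run_opensn.py   # run_opensn.py is a placeholder name
```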

Plea…

Answer selected by quocdang1998