Add new machine perlmutter
#464
Merged
Summary
This pull request adds support for running GAMER on the NERSC Perlmutter supercomputer: a configuration file tailored to Perlmutter's environment and two SLURM job submission scripts, one for CPU nodes and one for GPU nodes.
Changes
Configuration file:
configs/perlmutter.config: Added a new configuration file tailored for the NERSC Perlmutter system, including paths for the CUDA, FFTW3, MPI, and HDF5 libraries, compiler settings, and GPU-specific flags for NVIDIA A100 GPUs.
Job submission scripts:
example/queue/submit_perlmutter_cpu.job: Added a new SLURM job submission script for running jobs on Perlmutter's CPU nodes.
example/queue/submit_perlmutter_gpu.job: Added a new SLURM job submission script for running jobs on Perlmutter's GPU nodes.
Usage
Before doing anything, please:
Tests
For the regression test:
python regression_test.py --machine=perlmutter -e level2
Note that the regression test itself currently has problems running on other machines, so this command is given for reference only.
Performance
3D blastwave
256x256x256 + MAX_LEVEL=3:
4 GPU nodes: 682.3 s.
2 GPU nodes: 700.6 s.
1 GPU node: 1029.6 s.
4 CPU nodes: 1551.0 s.
2 CPU nodes: 2585.1 s.
1 CPU node: 4444.3 s.
128x128x128 + MAX_LEVEL=3:
4 GPU nodes: 354.8 s.
2 GPU nodes: 265.6 s.
1 GPU node: 329.3 s.
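
To make the scaling behaviour in these timings easier to read, the numbers can be converted into speedup and parallel efficiency relative to the single-node run. The following is a minimal sketch and not part of GAMER or this pull request; the names `timings` and `scaling` are hypothetical, and the values are copied from the list above.

```python
# Wall-clock times in seconds, copied from the timings listed above.
# Keys of the inner dicts are node counts. The labels are illustrative only.
timings = {
    "256^3 GPU": {1: 1029.6, 2: 700.6, 4: 682.3},
    "256^3 CPU": {1: 4444.3, 2: 2585.1, 4: 1551.0},
    "128^3 GPU": {1: 329.3, 2: 265.6, 4: 354.8},
}


def scaling(times):
    """Return {nodes: (speedup, efficiency)} relative to the 1-node run.

    speedup    = T(1 node) / T(N nodes)
    efficiency = speedup / N
    """
    base = times[1]
    return {n: (base / t, base / t / n) for n, t in sorted(times.items())}


for label, times in timings.items():
    for nodes, (speedup, eff) in scaling(times).items():
        print(f"{label}  {nodes} node(s): "
              f"speedup {speedup:.2f}, efficiency {eff:.0%}")
```

Each figure appears to come from a single run, so small differences should be read with caution. One observation follows directly from the raw numbers: for the 128x128x128 case, the 4-node GPU run (354.8 s) is slower than the 1-node run (329.3 s), which suggests that problem is too small to benefit from four GPU nodes.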