Follow these instructions to set up and test a GPU environment for the ICS GPU tutorial on CS01.
Navigate to your scratch directory on CS01:
cd /scratch/<solis-id>
Replace <solis-id> with your actual SOLIS ID.
Clone the ICS GPU tutorial repository from GitHub:
git clone https://github.com/janvaneck1994/ICS-GPU-tutorial.git
Make the cloned repository your working directory:
cd ICS-GPU-tutorial
Load the Python module on CS01:
module load python
Check that the correct version of Python is loaded:
which python
You should see output similar to:
alias python='python3.9'
/usr/bin/python3.9
Create a new Python virtual environment for the tutorial:
python -m venv ics_gpu_tutorial
Activate the virtual environment:
source ics_gpu_tutorial/bin/activate
Check that the virtual environment's Python is now being used:
which python
You should see output similar to:
alias python='python3.9'
/storage/scratch/<solis-id>/ICS-GPU-tutorial/ics_gpu_tutorial/bin/python3.9
In this output, <solis-id> stands for your actual SOLIS ID.
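The interpreter check above can also be done from inside Python, which is handy when a job script runs in a shell where the alias output of which is hard to read. The following is a minimal sketch using only the standard library; the function name describe_interpreter is hypothetical and not part of the tutorial repository.

```python
import sys


def describe_interpreter():
    """Return (version, in_venv, executable) for the running interpreter.

    Hypothetical helper for illustration; not part of the tutorial repository.
    """
    version = "{}.{}".format(sys.version_info.major, sys.version_info.minor)
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    in_venv = sys.prefix != sys.base_prefix
    return version, in_venv, sys.executable


if __name__ == "__main__":
    version, in_venv, executable = describe_interpreter()
    print("Python version:", version)
    print("Inside a virtual environment:", in_venv)
    print("Interpreter path:", executable)
```

Run with the environment activated, this should report that it is inside a virtual environment and print an interpreter path under the ics_gpu_tutorial directory.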
Install PyTorch and torchvision in your newly created venv using pip:
pip install torch torchvision
Submit the test job to SLURM. This job checks whether the GPU is available from PyTorch:
sbatch test_gpu.sh
After the job has succeeded, check the output file test_gpu.out. You should see:
GPU is available.
This confirms that the GPU is properly configured and available for use.
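For orientation, a check of this kind can be written in a few lines of PyTorch. The sketch below is an assumption about what such a check might look like, not the actual script run by test_gpu.sh; the function name gpu_status and the two fallback messages are hypothetical, and only the "GPU is available." message comes from the output shown above.

```python
def gpu_status():
    """Return a one-line description of GPU availability as seen by PyTorch.

    Hypothetical helper; the tutorial's own test script may differ.
    """
    try:
        import torch
    except ImportError:
        # Assumed fallback message for an environment without PyTorch.
        return "PyTorch is not installed."
    # torch.cuda.is_available() reports whether PyTorch can use a CUDA device.
    if torch.cuda.is_available():
        return "GPU is available."
    # Assumed message for the case where no CUDA device is visible.
    return "GPU is not available."


if __name__ == "__main__":
    print(gpu_status())
```

Run on a login node without a GPU, this sketch would likely report that no GPU is available, which is why the tutorial submits the check as a SLURM job instead.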
You can now run the MNIST example:
Make mnist_example your working directory:
cd mnist_example
Submit a job to train and test an FNN on the MNIST dataset:
sbatch test_mnist.sh
After the job is complete, you can inspect the output in test_mnist.out:
GPU is available.
...
...
...
Test set: Average loss: 0.0001, Accuracy: 9626/10000 (96%)
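If you want to compare several runs, the final summary line can be pulled out of the output file programmatically. The sketch below assumes the summary line always has the format shown above; the names SUMMARY_PATTERN and parse_test_summary are hypothetical and not part of the tutorial repository.

```python
import re

# Matches the summary line format shown above (assumed to be stable):
# "Test set: Average loss: 0.0001, Accuracy: 9626/10000 (96%)"
SUMMARY_PATTERN = re.compile(
    r"Test set: Average loss: ([\d.]+), Accuracy: (\d+)/(\d+)"
)


def parse_test_summary(text):
    """Return loss and accuracy from the last summary line in text, or None."""
    matches = SUMMARY_PATTERN.findall(text)
    if not matches:
        return None
    # Use the last match, since the output may contain one summary per epoch.
    loss, correct, total = matches[-1]
    correct, total = int(correct), int(total)
    return {
        "loss": float(loss),
        "correct": correct,
        "total": total,
        "accuracy": correct / total,
    }


if __name__ == "__main__":
    sample = "Test set: Average loss: 0.0001, Accuracy: 9626/10000 (96%)"
    print(parse_test_summary(sample))
```

To use it on a real run, read the contents of test_mnist.out and pass them to parse_test_summary. The computed accuracy keeps the full precision (9626/10000) rather than the rounded percentage printed in the log.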