
Commit 6aa5b7e

Brought presentation in line with work
1 parent cf54996 commit 6aa5b7e


docs/IntroductionToBaskerville.qmd

Lines changed: 29 additions & 28 deletions
@@ -186,12 +186,12 @@ job_id = $(sbatch --parsable <slurm file>)
 Job name is used throughout slurm, so change it to something more readable than the script name:
 
 ````{.bash }
-#SBATCH --job-name "AMoreReadableName"
+#SBATCH --job-name "A_More_Readable_Name"
 ````
 
 <br/>
 
-````{.default filename='#SBATCH --job-name "AMoreReadableName"' code-line-numbers="|7"}
+````{.default filename='#SBATCH --job-name "A_More_Readable_Name"' code-line-numbers="|7"}
 [allsoppj@bask-pg0310u18a BasicSlurmFile]$ cat slurm-474832.out
 
 This script is running on bask-pg0308u24a.cluster.baskerville.ac.uk
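
For reference, the renamed `--job-name` directive above sits inside a full submission script. The sketch below shows roughly how it might be used together with the `job_id=$(sbatch --parsable <slurm file>)` capture from the hunk header; the file name `basic.slurm` and the ten-minute walltime are placeholders invented for this sketch, not values from the presentation.

````{.bash}
#!/bin/bash
# basic.slurm -- hypothetical file name, used only for this sketch
# A readable job name makes the job easier to spot in squeue and sacct output
#SBATCH --job-name "A_More_Readable_Name"
# Placeholder walltime of ten minutes
#SBATCH --time 0-0:10:0

# Matches the output shown above: report which compute node the job landed on
echo "This script is running on $(hostname)"
````

Submitting with `--parsable` prints only the numeric job ID, which can then be reused in later commands:

````{.bash}
job_id=$(sbatch --parsable basic.slurm)
echo "Submitted job ${job_id}; output will appear in slurm-${job_id}.out"
````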
@@ -232,6 +232,33 @@ Don't try <br>
 
 Use a template slurm file and substitute values into a new script per job with the **sed** bash command.
 
+
+## Change the hardware you want your job to run on
+<br/>
+Baskerville has two types of GPU,
+
++ **A100-40** (default)
++ **A100-80**
+````{.bash}
+#SBATCH --constraint=a100_80
+````
+
+## Change the number of nodes or GPUs
+<br/>
+````{.bash}
+#SBATCH --gpus 3
+#SBATCH --nodes 1
+````
+Documented in more detail at [docs.baskerville.ac.uk](docs.baskerville.ac.uk){.external target="_blank"}
+
++ Leave **\-\-cpus-per-gpu=36**. Resources such as CPUs and memory are allocated per GPU.
++ All compute nodes have 4 GPUs.
++ So requesting more than 4 GPUs on a single node will fail.
+
+See **6-MoreResources** for loading PyTorch and the difference between selecting 1 GPU and more than 1.
+
++ For an estimated job start time (worst case), use **squeue -j <job ID> --start**
+
 ## Slurm Arrays - Run many jobs from one Slurm file
 
 <br>
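
Two of the additions above are easier to see with short sketches. First, the **sed** templating mentioned in the context line ("Use a template slurm file and substitute values into a new script per job"): the file name `template.slurm` and the `@INDEX@` placeholder token are assumptions made for illustration, not names from the presentation.

````{.bash}
# template.slurm is assumed to contain a placeholder token, e.g.
#   #SBATCH --job-name "run_@INDEX@"
# Substitute the token to generate and submit one concrete script per job
for i in 1 2 3; do
    sed "s/@INDEX@/${i}/g" template.slurm > "job_${i}.slurm"
    sbatch "job_${i}.slurm"
done
````

Second, the hardware and sizing directives introduced in the new sections could be combined in a single request block; the choice of one node with three A100-80 GPUs is purely illustrative.

````{.bash}
# Request the A100-80 GPUs rather than the default A100-40
#SBATCH --constraint=a100_80
# Compute nodes hold 4 GPUs each, so at most 4 GPUs fit on one node
#SBATCH --nodes 1
#SBATCH --gpus 3
# Left at 36, as advised above: CPUs and memory are allocated per GPU
#SBATCH --cpus-per-gpu=36
````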
@@ -267,29 +294,3 @@ See **2-arrayJobConfig.sh ** for information on loading a different config in ea
 
 
 Used this approach to run nearly 700,000 jobs in blocks of 4000, 500 at a time.
-
-## Change the hardware you want your job to run on
-<br/>
-Baskerville has two types of GPU,
-
-+ **A100-40** (default)
-+ **A100-80**
-````{.bash}
-#SBATCH --constraint=a100_80
-````
-
-## Change the number of nodes or GPUs
-<br/>
-````{.bash}
-#SBATCH --gpus 3
-#SBATCH --nodes 1
-````
-Documented in more detail at [docs.baskerville.ac.uk](docs.baskerville.ac.uk){.external target="_blank"}
-
-+ Leave **\-\-cpus-per-gpu=36**. Resources such as CPUs and memory are allocated per GPU.
-+ All compute nodes have 4 GPUs.
-+ So requesting more than 4 GPUs on a single node will fail.
-
-See **6-MoreResources** for loading PyTorch and the difference between selecting 1 GPU and more than 1.
-
-+ For an estimated job start time (worst case), use **squeue -j <job ID> --start**
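
The array workflow referred to in the context lines above ("nearly 700,000 jobs in blocks of 4000, 500 at a time") is driven by the `--array` directive; a hedged sketch follows, where the range and throttle simply echo those figures and the per-task echo stands in for the config loading that **2-arrayJobConfig.sh** handles in the presentation.

````{.bash}
# One block of 4000 array tasks, with at most 500 running at the same time (%500)
#SBATCH --array 1-4000%500

# Each task receives its own index in SLURM_ARRAY_TASK_ID,
# which can be used to select a per-task configuration
echo "Running array task ${SLURM_ARRAY_TASK_ID}"
````

Once a job is submitted, its worst-case start estimate can be checked with `squeue -j <job ID> --start`.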
