@@ -186,12 +186,12 @@ job_id=$(sbatch --parsable <slurm file>)
The job name is used throughout Slurm, so change it to something more readable than the script name:
```` {.bash}
- #SBATCH --job-name "AMoreReadableName"
+ #SBATCH --job-name "A_More_Readable_Name"
````
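
You can then refer to the job by name elsewhere in Slurm; for example, filtering the queue with squeue's standard **\-\-name** option:

```` {.bash}
# Show only jobs with this name in the queue
squeue --name "A_More_Readable_Name"
````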
<br />
- ```` {.default filename='#SBATCH --job-name "AMoreReadableName"' code-line-numbers="|7"}
+ ```` {.default filename='#SBATCH --job-name "A_More_Readable_Name"' code-line-numbers="|7"}
[allsoppj@bask-pg0310u18a BasicSlurmFile]$ cat slurm-474832.out
This script is running on bask-pg0308u24a.cluster.baskerville.ac.uk
@@ -232,6 +232,33 @@ Don't try <br>
Use a template Slurm file and substitute values into a new script per job with the **sed** bash command, as sketched below.
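
One possible shape for this (the template name `template.slurm` and the `@CONFIG@` placeholder are illustrative, not from the course material):

```` {.bash}
# Stamp out and submit one Slurm script per config file by
# substituting the @CONFIG@ placeholder in the template.
for config in configs/*.yaml; do
    name=$(basename "$config" .yaml)
    sed "s|@CONFIG@|$config|g" template.slurm > "run_$name.slurm"
    sbatch "run_$name.slurm"
done
````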
+
+ ## Change the hardware you want your job to run on
+ <br />
+ Baskerville has two types of GPU:
+
+ + **A100-40** (default)
+ + **A100-80**
+ ```` {.bash}
+ #SBATCH --constraint=a100_80
+ ````
+
+ ## Change the number of nodes or GPUs
+ <br />
+ ```` {.bash}
+ #SBATCH --gpus 3
+ #SBATCH --nodes 1
+ ````
+ These options are documented in more detail at [docs.baskerville.ac.uk](docs.baskerville.ac.uk){.external target="_blank"}.
+
+ + Leave **\-\-cpus-per-gpu=36** as it is: resources such as CPUs and memory are allocated per GPU.
+ + All compute nodes have 4 GPUs.
+ + So requesting more than 4 GPUs on a single node will fail; see the sketch after this list.
+
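+ A hypothetical multi-node request (the numbers are illustrative, not from the course material):
+
+ ```` {.bash}
+ # 8 GPUs cannot fit on one 4-GPU node, so request two nodes
+ #SBATCH --nodes 2
+ #SBATCH --gpus 8
+ ````
+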
+ See **6-MoreResources** for loading PyTorch and for the difference between selecting one GPU and several.
+
+ + For an estimated (worst-case) job start time, use **squeue -j <job ID> --start**
+
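+ For example, using the job ID captured with **sbatch \-\-parsable** earlier:
+
+ ```` {.bash}
+ # Ask the scheduler for its current estimate of when the job will start
+ squeue -j "$job_id" --start
+ ````
+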
## Slurm Arrays - Run many jobs from one Slurm file

<br />
@@ -267,29 +294,3 @@ See **2-arrayJobConfig.sh** for information on loading a different config in ea
We used this approach to run nearly 700,000 jobs in blocks of 4,000, 500 at a time.
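
In Slurm array syntax, one such block might look like this (indices illustrative):

```` {.bash}
# 4000 array tasks, with at most 500 running at once (the %N throttle)
#SBATCH --array=0-3999%500
````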
-
- ## Change the hardware you want your job to run on
- <br />
- Baskerville has two types of GPU:
-
- + **A100-40** (default)
- + **A100-80**
- ```` {.bash}
- #SBATCH --constraint=a100_80
- ````
-
- ## Change the number of nodes or GPUs
- <br />
- ```` {.bash}
- #SBATCH --gpus 3
- #SBATCH --nodes 1
- ````
- These options are documented in more detail at [docs.baskerville.ac.uk](docs.baskerville.ac.uk){.external target="_blank"}.
-
- + Leave **\-\-cpus-per-gpu=36** as it is: resources such as CPUs and memory are allocated per GPU.
- + All compute nodes have 4 GPUs.
- + So requesting more than 4 GPUs on a single node will fail.
-
- See **6-MoreResources** for loading PyTorch and for the difference between selecting one GPU and several.
-
- + For an estimated (worst-case) job start time, use **squeue -j <job ID> --start**