"with the resolved kinetic energy per unit mass $\\mf K \\equiv \\frac{1}{2}\\left|\\textbf{u}\\right|^2$. the Bernoulli intefral $H$ obeys the following Poisson equation\n",
141
149
"\n",
142
-
"with the resolved kinetic energy per unit mass $K \\equiv \\frac{1}{2}\\left|\\textbf{u}\\right|^2$. the Bernoulli intefral $H$ obeys the following Poisson equation\n",
"The FDS guides and documentation for version 6.7.5 can be found in [this release](https://github.com/firemodels/fds/releases/tag/FDS6.7.5).\n",
203
216
"\n",
204
217
"### User's Guide \n",
205
-
"An about 300 pages thick manual to introduce users to FDS {cite}`FDS-UG-6.7.5`. It covers the user aspects of \n",
218
+
"An about 400 pages thick manual to introduce users to FDS {cite}`FDS-UG-6.7.5`. It covers the user aspects of \n",
206
219
"* basics of FDS, getting started\n",
207
220
"* structure of FDS input files\n",
208
221
"* building geometric models\n",
@@ -226,7 +239,7 @@
226
239
"* fire detection devices and HVAC\n",
227
240
"\n",
228
241
"### Verification and validation \n",
To demonstrate the applicability of the FDS model, there exist two documents (over 1000 pages in total) on model verification {cite}`FDS-VE-6.7.5` and validation {cite}`FDS-VA-6.7.5`.
All tests are run every night to check the impact of source code changes.

## Installation
"### Source Code and Binary Download\n",
269
-
"The full source code (both FDS and Smokeview) is available at [GitHub](https://github.com/firemodels/fds-smv). This page also includes references to: \n",
282
+
"The full source code (both FDS and Smokeview) is available at [GitHub](https://github.com/firemodels/fds). This page also includes references to: \n",
# Parallel Execution of FDS
## Accessing JURECA
### Accounts
You should have received an invitation email, which asks you to register in the account management system. Once registered, you will receive an individual username and be asked to upload your login keys.
### SSH
To reach the computer cluster JURECA, you need to log in via the [secure shell protocol (SSH)](https://en.wikipedia.org/wiki/Secure_Shell_Protocol). It is recommended to read the user documentation, which can be found here: [access JURECA using SSH](https://apps.fz-juelich.de/jsc/hps/jureca/access.html).
Most Linux and macOS systems have an SSH client installed. On Windows, you can use tools like [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html).
Depending on your SSH client, there are various ways to generate an SSH key pair (public and private). In any case, the key should be protected with a passphrase.
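With OpenSSH, for example, a key pair can be generated as in the following sketch; the key type, file name and comment are only examples and not prescribed by JURECA.

```
# Generate an ed25519 key pair; you will be prompted for a passphrase.
> ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_jureca -C "JURECA access"
```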
One of the safety measures on JURECA is that you need to specify the IP range from which you will access the system, see [key restrictions](https://apps.fz-juelich.de/jsc/hps/jureca/access.html#key-upload-key-restriction). If you use a VPN, e.g. the one provided by the University of Wuppertal, your `from` statement could include `*.uni-wuppertal.de`.
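As a sketch, a public key uploaded with such a restriction could look like the following line (the key body is abbreviated and the comment is only an example):

```
from="*.uni-wuppertal.de" ssh-ed25519 AAAAC3NzaC1lZDI1NTE5... username1@laptop
```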
### Login
In order to log in to JURECA on a Linux or macOS system, you just need to execute `ssh username1@jureca.fz-juelich.de` (with your own username).

Alternatively, you can add a configuration to `~/.ssh/config` like
```
Host jureca
User username1
Hostname jureca.fz-juelich.de
```
which allows the shorter command
```
> ssh jureca
```
````{admonition} Task
Log in to JURECA and check your username, the server you have logged in to, and the path to your home directory. The result should look similar to
```
> whoami
username1
> hostname
jrlogin06.jureca
> echo $HOME
/p/home/jusers/username1/jureca
```
````
## FDS Module on JURECA
Modules offer a flexible environment to manage multiple versions of software. This system is also used on JURECA: [Module usage on JURECA](https://apps.fz-juelich.de/jsc/hps/jureca/software-modules.html).
As the FDS (and some other) modules are not globally installed, they need to be added to the user's environment. This can be done with
```
> module use -a ~arnold1/modules_fire/
```
Adding this line to your batch scripts and to your startup script (`~/.bashrc`) will automatically add the related modules to the module environment.
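An interactive session could then look like the following sketch; the module name `FDS/6.7.5` is an assumption, the actually available names are listed by `module avail`.

```
> module use -a ~arnold1/modules_fire/
> module load FDS/6.7.5
> fds
```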
````{admonition} Task

* Use the `module avail` command to list all available modules. Make sure that you also see the FDS modules.
* Load an FDS module and invoke FDS with `fds`. Does the loaded version of FDS correspond to the one you expected?
````
## Job Submission
A computer cluster is usually shared by many users. Therefore, executing a program that needs a lot of CPU power could disturb other users by slowing down their software or even the operating system. The solution is a queueing system, which organizes the execution of many programs and manages the resource distribution among them. JURECA uses the software Slurm for queueing and for distributing jobs to the compute nodes. More information is provided in [JURECA's batch system documentation](https://apps.fz-juelich.de/jsc/hps/jureca/batchsystem.html).
Instead of running the simulation directly, we submit a job script to the queueing system. This script loads the modules needed for FDS and sets the number of processes, threads and other important quantities.
### Single Job
A Slurm job script is basically a shell script. This shell script will be executed by the batch system on the requested resources. The resources are defined in comment lines (`#SBATCH`), which act as instructions for Slurm.
A simple example of a Slurm job script ({download}`fds-slurm-job-single.sh`) is given below. It executes FDS for a single FDS input file.
```{literalinclude} ./fds-slurm-job-single.sh
```
The individual lines have the following functions:
**Naming**
```#SBATCH --job-name=my_FDS_simulation```
You can name your job in order to find it more quickly in the job lists. The name has no other function.
**Accounting**
```#SBATCH --account=ias-7```
On JURECA you need a computing time project, and your account needs to be assigned to it. This budget is used to "buy" a normal or high priority in the queueing system. With `account` you specify which computing time budget will be debited for the job. Here, `ias-7` indicates the project we will use for this lecture; it is the budget of the IAS-7 at Forschungszentrum Jülich.
**Partition**
```#SBATCH --partition=dc-cpu```
JURECA's batch system is divided into multiple partitions, which represent different computing architectures. In our case we want to execute the simulation on common CPU cores and therefore use the partition `dc-cpu`; more information can be found in [JURECA's partitions](https://apps.fz-juelich.de/jsc/hps/jureca/batchsystem.html#slurm-partitions).
**MPI Tasks**
```#SBATCH --ntasks=128```
```#SBATCH --cpus-per-task=1```
There are different ways to define the number of requested cores. It is possible to state how many MPI tasks (`ntasks`) are to be started and how many cores each of them (`cpus-per-task`) will be assigned. The product of these determines the number of physical cores and thus the number of nodes to be allocated. In the current configuration of the `dc-cpu` partition, each node has 128 cores. An alternative is to specify the number of nodes, from which the number of MPI tasks to be started follows.
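As an illustration, an equivalent request for the same 128 cores, expressed via the number of nodes, could look like the following sketch (not taken from the course scripts):

```
# Request one full dc-cpu node and start 128 MPI tasks with one core each.
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=128
#SBATCH --cpus-per-task=1
```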
**Terminal Output**
```#SBATCH --output=stdout.%j```
```#SBATCH --error=stderr.%j```
Here the file names for the standard output and the error log are defined. `%j` will be replaced by the job ID generated by Slurm.
**Wall Clock Time**
```#SBATCH --time=00:30:00```
This line specifies the maximum time the job can run on the requested resources. The maximum wall clock time is stated in the documentation for the individual partitions. The `dc-cpu` partition has a limit of 24 hours.

As FDS can utilise OpenMP, the corresponding environment variable (`OMP_NUM_THREADS`) needs to be set. In the job script it is set automatically to the number given by `cpus-per-task` in the Slurm section.
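A typical line to accomplish this, sketched here with the environment variable provided by Slurm (the exact line in the job script above may differ), is:

```
# Set the number of OpenMP threads to the value of --cpus-per-task.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
```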
In the job script, first the module environment containing the FDS module is included (via `module use`), then the specified FDS module is loaded.
**Execute FDS**
```srun fds ./*.fds```
An MPI-parallel application is started with `srun` on a Slurm system. If not explicitly stated, as in the above line, the number of MPI tasks is taken from the Slurm environment.
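If the number of tasks should be stated explicitly instead, `srun` accepts it as an option; a sketch, with the value just being an example:

```
srun --ntasks=128 fds ./*.fds
```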
```{note}
It is important to keep in mind that JURECA's usage concept is to assign compute nodes **exclusively** to a single job. Thus, the resources used are given by the number of nodes and the wall clock time. In the current setup, the `dc-cpu` partition has nodes with 128 cores, so even if a job uses just a few cores, the account is charged for 128 cores (a full node).
```
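Roughly speaking, a job that occupies one `dc-cpu` node for two hours is therefore charged 128 cores × 2 h = 256 core-hours, independent of how many of the cores FDS actually uses.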
A Slurm job can be submitted via
```
> sbatch fds-slurm-job-single.sh
```
The current status of a user's queue can be listed with
```
> squeue -u $USER
```
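A queued or running job can be cancelled with `scancel` and the job ID reported by `squeue`; the ID below is only a placeholder.

```
> scancel 1234567
```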
### Chain Jobs
As there is a wall clock time limit of 24 hours on JURECA, longer jobs have to be restarted after this time. The following scripts automate this process for FDS on JURECA. It is important that the FDS input file has the `RESTART` parameter (a `MISC` parameter) defined, typically initially set to `.FALSE.`.
The main idea is to submit multiple jobs ({download}`fds-slurm-chain-starter.sh`) with dependencies, i.e. a chain is created so that the jobs are executed consecutively.
```{literalinclude} ./fds-slurm-chain-starter.sh
```
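The underlying Slurm mechanism for such a chain is a job dependency. The options used below are standard Slurm; the script above may implement the details differently.

```
# Submit the first chain element and keep its job ID ...
JOBID=$(sbatch --parsable fds-slurm-chain-job.sh)
# ... then let the next element wait until the previous one has finished.
sbatch --dependency=afterany:${JOBID} fds-slurm-chain-job.sh
```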
The individual jobs ({download}`fds-slurm-chain-job.sh`) are similar to the simple setting above. However, they add the functionality to create a `STOP` file, which stops FDS before the requested wall clock time is reached. This way, restart files are written out by FDS and can be used for the next chain element. The value of `RESTART` is automatically set to `.TRUE.` after the first execution of FDS.
```{literalinclude} ./fds-slurm-chain-job.sh
```
## Mesh Decomposition
A Python script to automate the decomposition of a single `MESH` statement: {download}`decompose_fds_mesh.py`.