Add HPC docs page. (#278)

jatkinson1000 · jwallwork23 · web-flow · commit a22d08e140cb · 2025-02-11T10:40:26.000Z
* Add HPC docs page with placeholder titles.
* Consolidate linker information from other pages into one location.
* Typo fixes courtesy of @jwallwork23
* Add info on linking and compiling files as suggested by @TomMelt.

Co-authored-by: Joe Wallwork <22053413+jwallwork23@users.noreply.github.com>
diff --git a/pages/cmake.md b/pages/cmake.md
@@ -130,25 +130,12 @@ when running CMake.
 
 ## Building other projects with make
 
-To build a project with make you need to include the FTorch library when compiling
+To build a project with `make` you need to include the FTorch library when compiling
 and link the executable against it.
 
-To compile with make add the following compiler flag when compiling files that
-use ftorch:
-```
-FCFLAGS += -I<path/to/install/location>/include/ftorch
-```
+For full details of the flags to set and the linking process see the
+[HPC build pages](page/hpc.html).
 
-When compiling the final executable add the following link flag:
-```
-LDFLAGS += -L<path/to/install/location>/lib64 -lftorch
-```
-
-You may also need to add the location of the `.so` files to your `LD_LIBRARY_PATH`
-unless installing in a default location:
-```
-export LD_LIBRARY_PATH = $LD_LIBRARY_PATH:<path/to/installation>/lib64
-```
 
 ## Conda Support
 
diff --git a/pages/examples.md b/pages/examples.md
@@ -116,26 +116,11 @@ and using the `-DCMAKE_PREFIX_PATH=</path/to/install/location>` flag when runnin
 > then you should use the same path for `</path/to/install/location>`._
 
 ##### Make
-To build with make we need to include the library when compiling and link the executable
-against it.
+To build with `make` we need to _include_ the library and _link_ the
+executable against it when compiling.
 
-To compile with make we need add the following compiler flag when compiling files that
-use FTorch:
-```
-FCFLAGS += -I<path/to/install/location>/include/ftorch
-```
-
-When compiling the final executable add the following link flag:
-```
-LDFLAGS += -L<path/to/install/location>/lib -lftorch
-```
-
-You may also need to add the location of the `.so` files to your `LD_LIBRARY_PATH`
-unless installing in a default location:
-```
-export LD_LIBRARY_PATH = $LD_LIBRARY_PATH:<path/to/install/location>/lib
-```
-> Note: _Depending on your system and architecture `lib` may be `lib64` or something similar._
+For full details of the flags to set and the linking process see the
+[HPC build pages](page/hpc.html#building-projects-and-linking-to-ftorch).
 
 ### Running on GPUs
 
diff --git a/pages/hpc.md b/pages/hpc.md
@@ -0,0 +1,154 @@
+title: Guidance for use in High Performance Computing (HPC)
+
+[TOC]
+
+A common application of FTorch (indeed, the driving one for development) is the
+coupling of machine learning components to models running on HPC systems.
+
+Here we provide some guidance/hints to help with deployment in these settings.
+
+## Installation
+
+### Building for basic use
+
+The basic installation procedure is the same as described in the
+[main documentation](page/cmake.html) and README, cloning from
+[GitHub](https://github.com/Cambridge-ICCS/FTorch) and building using CMake.
+
+### Obtaining LibTorch
+
+For use on an HPC system we advise linking to an installation of LibTorch rather than
+installing full PyTorch.
+This will reduce the dependencies and remove any requirement of Python.
+LibTorch can be obtained from the
+[PyTorch website](https://pytorch.org/get-started/locally/).
+The assumption here is that any Python/PyTorch development is done elsewhere with a
+model being saved to TorchScript for use by FTorch.
+
+Once you have successfully tested and deployed FTorch in your code we recommend speaking
+to your administrator/software stack manager to make your chosen version of LibTorch
+loadable as a `module`.
+This will improve reproducibility and simplify the process for future users on your
+system.
+See the [information below](#libtorch-as-a-module) for further details.
+
+### Environment management
+
+It is important that FTorch is built using the same environment and compilers as the
+software to which it will be linked.
+
+Before starting the build you should therefore ensure that your environment matches
+the one in which your code will be built.
+This will usually be done by using the same `module` commands as you would use to build
+the model:
+```sh
+module purge
+module load ...
+```
+
+Alternatively you may be provided with a shell script that runs these commands and sets
+environment variables etc. that can be sourced:
+```sh
+source model_environment.sh
+```
+
+Complex models with custom build systems may obfuscate this process, and you might need
+to probe the build system/scripts for this information.
+If in doubt speak to the maintainer of the software for your system, or the manager of
+the software stack on the machine.
+
+Because of the need to match compilers it is strongly recommended to specify the
+`CMAKE_Fortran_COMPILER`, `CMAKE_C_COMPILER`, and `CMAKE_CXX_COMPILER` when building
+with CMake to enforce this.
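+
+As a minimal sketch, the compilers can be passed explicitly when configuring.
+The compiler names (`gfortran`, `gcc`, `g++`) and the build directory layout are
+assumptions for illustration; substitute the compilers used to build your model:
+```sh
+# Run from a build directory inside the FTorch source tree (assumed layout).
+# gfortran/gcc/g++ are placeholders for the compilers used to build your model.
+cmake .. \
+    -DCMAKE_Fortran_COMPILER=gfortran \
+    -DCMAKE_C_COMPILER=gcc \
+    -DCMAKE_CXX_COMPILER=g++
+```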
+
+### Building Projects and Linking to FTorch
+
+Whilst we describe how to link to FTorch using CMake to build a project on our main
+page, many HPC models do not use CMake and rely on `make` or more elaborate build
+systems.
+To build a project with `make` or similar you need to _include_ FTorch's
+header (`.h`) and module (`.mod`) files and _link_ the executable
+to the FTorch library (e.g., `.so`, `.dll`, `.dylib` depending on your system) when
+compiling.
+
+To compile with `make`, add the following compiler flag when compiling files that
+use FTorch to _include_ the library:
+```sh
+-I<path/to/FTorch/install/location>/include/ftorch
+```
+This is often done by appending to an `FCFLAGS` compiler flags variable or similar:
+```sh
+FCFLAGS += -I<path/to/FTorch/install/location>/include/ftorch
+```
+
+When compiling the final executable add the following _link_ flag:
+```sh
+-L<path/to/FTorch/install/location>/lib64 -lftorch
+```
+This is often done by appending to an `LDFLAGS` linker flags variable or similar:
+```sh
+LDFLAGS += -L<path/to/FTorch/install/location>/lib64 -lftorch
+```
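+
+As an illustration of where these flags end up, the two steps might look like the
+following when written out by hand.
+This is a sketch only: the compiler (`gfortran`), the `FTORCH_ROOT` variable, and the
+file name `my_program.f90` are hypothetical and should be replaced with your own:
+```sh
+# Hypothetical install prefix, compiler, and source file name.
+FTORCH_ROOT=/path/to/FTorch/install/location
+
+# Compile: -I lets the compiler find FTorch's .mod files.
+gfortran -I"${FTORCH_ROOT}/include/ftorch" -c my_program.f90 -o my_program.o
+
+# Link: -L adds the library search directory and -lftorch selects the library.
+gfortran my_program.o -L"${FTORCH_ROOT}/lib64" -lftorch -o my_program
+```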
+
+You may also need to add the location of the dynamic library `.so` files to your
+`LD_LIBRARY_PATH` environment variable unless installing in a default location:
+```sh
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path/to/FTorch/installation>/lib64
+```
+
+> Note: _Depending on your system and architecture `lib64` may be `lib` or something similar._
+
+> Note: _On macOS devices you will need to set `DYLD_LIBRARY_PATH` rather than `LD_LIBRARY_PATH`._
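+
+If `LD_LIBRARY_PATH` may be unset when the line is run, a plain append leaves an empty
+entry at the start of the path, which the dynamic loader treats as the current directory.
+A possible guarded form is sketched below; `FTORCH_ROOT` is a hypothetical variable
+holding your install location, and `lib64` may need to be `lib` on your system:
+```sh
+# Hypothetical install prefix; replace with your actual FTorch install location.
+FTORCH_ROOT="${FTORCH_ROOT:-/path/to/FTorch/install/location}"
+
+# Append the library directory, adding a ':' separator only if the variable
+# is already set and non-empty.
+export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${FTORCH_ROOT}/lib64"
+```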
+
+Whilst experimenting it may be useful to build FTorch using the `CMAKE_BUILD_TYPE=Debug`
+CMake flag to allow useful error messages and investigation with debugging tools.
+
+
+### Module systems
+
+Most HPC systems are managed using [Environment Modules](https://modules.sourceforge.net/).
+To build FTorch it is important you
+[match the environment in which you build FTorch to that of the executable](#environment-management)
+by loading the same modules as when building the main code.
+
+As a minimal requirement you will need to load modules for compilers and CMake.
+Further functionalities may require loading of additional modules such as an
+MPI installation and CUDA.
+Some systems may also have pFUnit available as a loadable module, which saves you
+building it from scratch per the documentation if you are running FTorch's test suite.
+
+#### LibTorch as a module
+
+Once you have a working build of FTorch it is advisable to pin the version of LibTorch
+and make it a loadable module to improve reproducibility and simplify the build process
+for subsequent users on the system.
+
+This can be done by the software manager after which you can use
+```sh
+module load libtorch
+```
+or similar instead of downloading the binary from the PyTorch website.
+
+Note that the module name on your system may include additional information about the
+version, compilers used, and a hash code.
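+
+To find the exact name on your system, the standard Environment Modules query commands
+can help.
+The module name `libtorch` here is an assumption and may differ on your machine:
+```sh
+# List the available modules whose names start with "libtorch".
+module avail libtorch
+
+# Show what loading the module would change (paths, environment variables).
+module show libtorch
+```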
+
+#### FTorch as a module
+
+If there are many users who want to use FTorch on a system it may be worth building
+it and making it loadable as a module itself.
+The module should be labelled with the compilers it was built with (see the
+[importance of environment matching](#environment-management)) and automatically load
+any subdependencies (e.g. CUDA).
+
+The build should use `CMAKE_BUILD_TYPE=RELEASE`, and the unit tests should be run to
+check successful installation.
+
+Once complete it should be possible to:
+```sh
+module load ftorch
+```
+or similar.
+
+This process should also add FTorch to the `LD_LIBRARY_PATH` and `CMAKE_PREFIX_PATH`
+rather than requiring the user to specify them manually as suggested elsewhere in this
+documentation.