Venado optimizations #297

Draft
wants to merge 69 commits into
base: master
Commits
9d12c58
Add gpmdk for the Venado hackathon
mewall Jul 2, 2024
f2a3c08
Add build script for hackathon
mewall Jul 2, 2024
3efdf37
Update bml submodule
mewall Jul 2, 2024
8c1bcab
Update bml submodule
mewall Jul 3, 2024
48b8022
Add electrons.dat to latteTBparams
mewall Jul 3, 2024
05314a5
Add MPI barrier to pin down performance issue
mewall Jul 5, 2024
ff1bd08
Add venado build, env, and run scripts
mewall Jul 5, 2024
7525e35
Bug fix
mewall Jul 5, 2024
17ad747
Add TrpCage example for gpmdk
mewall Jul 5, 2024
639473f
Fixed line truncations
Jul 8, 2024
61a5be8
Added sedacs partition and field induced forces
Jul 17, 2024
a61c7a6
Added main fro field
Jul 17, 2024
f62b952
NVTX tags
mewall Jul 15, 2024
c1aaa69
Introducing nvtx tags and some optimizations
mewall Jul 17, 2024
e3f9a44
Debug resizing and resize two more arrays
mewall Jul 19, 2024
2c63bed
Resize zqt array and use new bml_transpose_inplace to avoid allocation
mewall Jul 19, 2024
a044282
Move response into gpmdk in preparation for GPU kernel
mewall Jul 19, 2024
dfeb3a1
Add some useful build scripts
mewall Jul 19, 2024
1aebd59
Update bml submodule
mewall Jul 19, 2024
15594cf
Reduce allocations in hcsf method
mewall Jul 19, 2024
11695ec
Update build script
mewall Jul 19, 2024
f1d152c
Modifications to support the new bml_transpose Fortran API
mewall Jul 22, 2024
6195172
Add nvtx tags for charges and thread charge calculation
mewall Jul 24, 2024
2c72a40
Add nvtx tag methods to gpmdk
mewall Jul 24, 2024
2c7afd7
Debug cray build and work on omp offload
mewall Jul 25, 2024
0ae84e1
Working omp offload for gpmdcov_response
mewall Jul 29, 2024
468bca1
Bug fix
mewall Jul 30, 2024
07f9dcb
Allocate smaller array for work in kernel
mewall Jul 30, 2024
5edf99b
Code decorations to investigate MPI imbalance
mewall Aug 6, 2024
b941d17
Build updates
mewall Aug 7, 2024
77f9065
Beginning update of graph MPI
mewall Aug 7, 2024
ee06263
Preliminary low-communication graph update introduced. Not tested.
mewall Aug 8, 2024
df07623
Working subgraph graph update method with less MPI communication
mewall Aug 8, 2024
a4a61d9
Added new script
Aug 8, 2024
6f08dcd
added bch dslp script
Aug 8, 2024
769d607
Attempt to fix the hackathon branch
mewall Aug 12, 2024
6a2cb5f
Eliminate debug output of FORCESS
mewall Aug 12, 2024
6722bda
Update bml submodule
mewall Aug 13, 2024
87cf0c8
Update bml submodule and move build scripts to scripts/ dir
mewall Aug 13, 2024
63828d5
breaking lines
Aug 22, 2024
52e7fd6
fixing gpmd.py
Aug 22, 2024
23c3af7
Add more nvtx tags
mewall Aug 23, 2024
69d937a
Added err_var
Aug 26, 2024
a4dc3a2
Workaround for Cray matmul bug in gpmdk
mewall Sep 6, 2024
26f34d8
Report max time and rank for dH+dS
mewall Sep 9, 2024
e2b232c
Fix kernel bug. Better matmul fix.
mewall Sep 9, 2024
fde467f
Clean up comments and begin working on offload of get_dH_or_dS_vect
mewall Sep 10, 2024
59a45c5
Added protections against double alloc
Sep 11, 2024
49154ca
Prepare for OMP offload optimization
mewall Sep 12, 2024
9fff9ab
Fix bug in graph update
mewall Sep 13, 2024
0315f8c
Use MAGMA pointer in offload response kernel
mewall Sep 17, 2024
9b5b4bd
Working on openACC
mewall Oct 29, 2024
829eb87
Working offload of response using magma pointer
mewall Oct 31, 2024
578dec1
fixing compilation bug gfortran
Nov 6, 2024
bdc7776
Added voltage option
Nov 21, 2024
7779af8
Added dos
Nov 22, 2024
c5dc023
changes for coarse MD
Dec 4, 2024
7b2ece1
Update bml submodule
mewall Dec 15, 2024
bfd58e4
Added formatting statements to prg_system_mod XYZ trajectory writing …
rcorrigan Jan 8, 2025
8a35116
Opts for nvidia build
mewall Nov 18, 2024
89f9eaa
NVIDIA opt get_hsmat
mewall Nov 19, 2024
305dbc9
NVIDIA opt hsderivative
mewall Nov 20, 2024
cb20d3d
OpenACC accelerated nonorthocoul
mewall Nov 23, 2024
33cf90a
Add variables to select MD steps for nsys profiling
mewall Jan 14, 2025
0933417
Disable bisection in gpmd. Nonorthocoul opts.
mewall Jan 14, 2025
03ef662
Fix merge error
mewall Jan 23, 2025
aa9f795
More NTVX tags
mewall Jan 31, 2025
06f07f5
dH optimization
mewall Jan 31, 2025
dc22aa2
Bug fix
mewall Jan 31, 2025
4 changes: 4 additions & 0 deletions .gitmodules
@@ -0,0 +1,4 @@
[submodule "bml"]
path = bml
url = https://github.com/lanl/bml
branch = hackathon
1 change: 1 addition & 0 deletions bml
Submodule bml added at 2e6f60
2 changes: 1 addition & 1 deletion examples/gploop/gploop.F90
@@ -268,7 +268,7 @@ program gploop
!> Symmetrize and Threshold the Matrix
call bml_zero_matrix(lt%bml_type,bml_element_real,dp,norb,norb,copy_g_bml)
call bml_threshold(g_bml, gsp2%gthreshold)
-call bml_transpose(g_bml, copy_g_bml)
+call bml_transpose_new(g_bml, copy_g_bml)
call bml_add_deprecated(0.5_dp,g_bml,0.5_dp,copy_g_bml,0.0_dp)
call bml_threshold(g_bml, gsp2%gthreshold)
call bml_deallocate(copy_g_bml)
2 changes: 1 addition & 1 deletion examples/gpmd/gpmd.F90
@@ -1089,7 +1089,7 @@ subroutine gpmd_graphpart
!> Symmetrize and Threshold the Matrix
call bml_zero_matrix(lt%bml_type,bml_element_real,dp,norb,norb,copy_g_bml)
call bml_threshold(g_bml, gsp2%gthreshold)
-call bml_transpose(g_bml, copy_g_bml)
+call bml_transpose_new(g_bml, copy_g_bml)
call bml_add_deprecated(0.5_dp,g_bml,0.5_dp,copy_g_bml,0.0_dp)
call bml_threshold(g_bml, gsp2%gthreshold)
call bml_deallocate(copy_g_bml)
2 changes: 1 addition & 1 deletion examples/gpmdcov/gpmdcov.F90
@@ -1295,7 +1295,7 @@ subroutine gpmd_graphpart
! call bml_write_matrix(g_bml,"g_bml_bef")
! call bml_zero_matrix(gsp2%bml_type,bml_element_real,kind(1.0),sy%nats,mdim,copy_g_bml)
! call bml_threshold(g_bml, gsp2%gthreshold)
-! call bml_transpose(g_bml, copy_g_bml)
+! call bml_transpose_new(g_bml, copy_g_bml)
! call bml_add_deprecated(0.5_dp,g_bml,0.5_dp,copy_g_bml,0.0_dp)
! call bml_threshold(g_bml, gsp2%gthreshold)
! call bml_deallocate(copy_g_bml)
97 changes: 97 additions & 0 deletions examples/gpmdk/README.rst
@@ -0,0 +1,97 @@
Graph-partition Quantum based Molecular Dynamics with Kernel (GPMDK)
======================================================================

About
********
The following research code is based on techniques that were recently published in `J. Chem. Phys. 158, 074108 (2023) <https://pubs.aip.org/aip/jcp/article/158/7/074108/2877017/Graph-based-quantum-response-theory-and-shadow>`_.
Briefly, graph-based electronic structure theory is applied using the atomic graph, and quantum molecular dynamics is performed in a fully distributed-memory fashion. This code makes use of two LANL-developed libraries, `PROGRESS <https://qmd-progress.readthedocs.io/en/latest/>`_ and `BML <https://basic-matrix-library.readthedocs.io/en/stable/>`_, which must be properly installed in order to build the GPMD code. More about these libraries can be found in the following references: `arXiv:2401.13772 <https://arxiv.org/abs/2401.13772>`_ and `J. Supercomput. 74 (11): 6201–19 <https://link.springer.com/article/10.1007/s11227-018-2533-0>`_.


Getting started
******************
We assume that you have access to LANL Institutional Computing (IC) as well as to Darwin. To get a Darwin account, please visit the following
`link <https://int.lanl.gov/org/ddste/aldsc/ccs/applied-computer-science-ccs-7/access-request-process.shtml>`_.
For Institutional Computing and access to Chicoma, please ask the project manager (whoever obtained
time on IC machines by submitting a proposal) to include you in the specific
project. Then verify that you have been added `here <https://hpcaccounts.lanl.gov/projects/request/xd_g_turq>`_.

This material can be cloned or downloaded from the PROGRESS github repository (`github link <https://github.com/lanl/qmd-progress.git>`_).
The repository can be cloned as follows:

.. code-block:: bash

    git clone https://github.com/lanl/qmd-progress.git

Look in the examples/gpmdk subdirectory.

Further build instructions for gpmdk are still to be written.

**Building required packages METIS and Magma**

Download METIS (latest tested version: 5.1.0) from
http://glaros.dtc.umn.edu/gkhome/metis/metis/download

Copy the tarball to Chicoma and unpack it (replace ``username`` with your user name):

.. code-block:: bash

    scp metis-5.1.0.tar.gz username@ch-fe1:.
    tar -xvf metis-5.1.0.tar.gz

Build METIS. First move into the METIS directory:

.. code-block:: bash

    cd metis-5.1.0/

Then configure and install. ``cc`` is the C compiler to use; ``prefix`` is where the METIS libraries are installed and can be changed:

.. code-block:: bash

    make config shared=1 cc=cc prefix=~/local
    make install

Download Magma (currently using version 2.5.4) from the MAGMA website:
https://icl.utk.edu/magma/downloads/

Secure copy the Magma tarball to Chicoma (replace ``username`` with your user name):

.. code-block:: bash

    scp magma-2.5.4.tar.gz username@ch-fe1:.

Unpack the tarball and rename the directory:

.. code-block:: bash

    tar -xvf magma-2.5.4.tar.gz
    mv magma-2.5.4/ magma

Build Magma using the ``build_magma.sh`` script provided in gpmd/chicomaGPU. Start outside of the magma directory, then run:

.. code-block:: bash

    bash build_magma.sh

**Building BML and QMD-PROGRESS**

Clone the BML repository:

.. code-block:: bash

    git clone https://github.com/lanl/bml.git

Build BML by running the ``build_bml.sh`` script provided in gpmd/chicomaGPU from the directory immediately above ``bml/``:

.. code-block:: bash

    bash build_bml.sh

BML requires a BLAS (Basic Linear Algebra Subprograms) library, which should be available via the intel-mkl module. If BLAS is not found, check that the intel-mkl module is loaded correctly (this should have been done by the ``setup-envs.sh`` script). You can list the loaded modules with:

.. code-block:: bash

    module list

Clone the QMD-PROGRESS repository:

.. code-block:: bash

    git clone https://github.com/lanl/qmd-progress.git

Build QMD-PROGRESS by running the ``build_progress.sh`` script provided in gpmd/chicomaGPU from the directory immediately above ``qmd-progress/``. Note that QMD-PROGRESS requires BML, so be sure to build BML first:

.. code-block:: bash

    bash build_progress.sh

**Building GPMDK**

Build GPMDK after building all other dependencies (METIS, Magma, BML, and QMD-PROGRESS) by running the ``build_cmake.sh`` script provided with GPMDK:

.. code-block:: bash

    bash build_cmake.sh
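
Because GPMDK is built last, it can help to confirm that every dependency install prefix exists before running the final build. The following is a minimal sketch and is not part of the repository: the function name ``check_prefixes`` is hypothetical, and the semicolon-separated argument simply mirrors the ``CMAKE_PREFIX_PATH`` format used by the build scripts.

```shell
# Hypothetical helper (not shipped with GPMDK): report which entries of a
# semicolon-separated prefix list (CMAKE_PREFIX_PATH format) are missing.
check_prefixes() (
    # Subshell body: the IFS and 'set -f' changes stay local to this function.
    IFS=';'
    set -f                       # no glob expansion while splitting the list
    missing=0
    for prefix in $1; do
        # Skip empty entries, e.g. from a doubled ';'.
        [ -z "$prefix" ] && continue
        if [ -d "$prefix" ]; then
            echo "found:   $prefix"
        else
            echo "missing: $prefix"
            missing=$((missing + 1))
        fi
    done
    # Exit status is non-zero when at least one prefix is missing.
    [ "$missing" -eq 0 ]
)
```

A possible use, assuming your install locations are collected in a variable such as ``PREFIXES``: run ``check_prefixes "$PREFIXES" && bash build_cmake.sh`` so that the build starts only when all listed directories are present.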

Running GPMDK
*******************

Go into the ``run`` folder. There is an input file ``input.in`` providing most of the variables that are needed by the code. There
are also several ``pdb`` files with chemical systems that one can use as examples.


10 changes: 10 additions & 0 deletions examples/gpmdk/build_chicoma_hackathon.sh
@@ -0,0 +1,10 @@
WORKDIR=/usr/projects/icapt/mewall/packages/gpmd
rm -rf build_hackathon
mkdir build_hackathon
cd build_hackathon
cmake -DCMAKE_Fortran_COMPILER="ftn" -DPROGRESS_MPI="yes" -DLIB="no" -DGPMDK_NVTX="yes" \
-DEXTRA_FCFLAGS="-g -O2 -Wall -Wunused -fopenmp -ffree-line-length-none -ffpe-trap=invalid,overflow,zero -lnvToolsExt" \
-DCMAKE_PREFIX_PATH="$WORKDIR/qmd-progress/install_hackathon/;$WORKDIR/qmd-progress/bml/install_hackathon;$WORKDIR/metis-5.1.0/" ../src/
#-DEXTRA_FCFLAGS="-g -O2 -DHPCTOOLKIT_PROFILE -Wall -Wunused -fopenmp -ffree-line-length-none -ffpe-trap=invalid,overflow,zero -L$HPCTOOLKIT/lib/hpctoolkit -lhpctoolkit" \
make

9 changes: 9 additions & 0 deletions examples/gpmdk/build_gpmd_nvhpc.sh
@@ -0,0 +1,9 @@
WORKDIR=/usr/projects/icapt/mewall/venado/packages
rm -rf build_nvhpc
mkdir build_nvhpc
cd build_nvhpc
cmake -DCMAKE_Fortran_COMPILER="mpifort" -DPROGRESS_MPI="yes" -DGPMDK_NVTX="yes" \
-DEXTRA_FCFLAGS="-g -mp -O2 -L${NVHPC_ROOT}/cuda/lib64 -lnvToolsExt" \
-DCMAKE_PREFIX_PATH="$WORKDIR/qmd-progress/install_nvhpc/;$WORKDIR/qmd-progress/bml/install_nvhpc;$WORKDIR/metis-5.1.0/install" ../src/
make

10 changes: 10 additions & 0 deletions examples/gpmdk/build_venado_hackathon.sh
@@ -0,0 +1,10 @@
WORKDIR=/usr/projects/icapt/mewall/venado/packages
rm -rf build_hackathon
mkdir build_hackathon
cd build_hackathon
cmake -DCMAKE_Fortran_COMPILER="ftn" -DPROGRESS_MPI="yes" -DLIB="no" -DGPMDK_NVTX="yes" \
-DEXTRA_FCFLAGS="-O3 -Wall -Wunused -fopenmp -ffree-line-length-none -ffpe-trap=invalid,overflow,zero -lnvToolsExt -ftree-vectorizer-verbose=2 -mcpu=neoverse-v2" \
-DCMAKE_PREFIX_PATH="$WORKDIR/qmd-progress/install_hackathon/;$WORKDIR/qmd-progress/bml/install_hackathon;$WORKDIR/metis-5.1.0/install" ../src/
#-DEXTRA_FCFLAGS="-g -O2 -DHPCTOOLKIT_PROFILE -Wall -Wunused -fopenmp -ffree-line-length-none -ffpe-trap=invalid,overflow,zero -L$HPCTOOLKIT/lib/hpctoolkit -lhpctoolkit" \
make -j1

10 changes: 10 additions & 0 deletions examples/gpmdk/build_venado_hackathon_nvtx.sh
@@ -0,0 +1,10 @@
WORKDIR=/usr/projects/icapt/mewall/venado/packages
rm -rf build_hackathon
mkdir build_hackathon
cd build_hackathon
cmake -DCMAKE_Fortran_COMPILER="ftn" -DPROGRESS_MPI="yes" -DLIB="no" -DGPMDK_NVTX="yes" \
-DEXTRA_FCFLAGS="-O3 -Wall -Wunused -fopenmp -ffree-line-length-none -ffpe-trap=invalid,overflow,zero -lnvToolsExt -ftree-vectorizer-verbose=2 -mcpu=neoverse-v2" \
-DCMAKE_PREFIX_PATH="$WORKDIR/qmd-progress/install_hackathon/;$WORKDIR/qmd-progress/bml/install_hackathon;$WORKDIR/metis-5.1.0/install" ../src/
#-DEXTRA_FCFLAGS="-g -O2 -DHPCTOOLKIT_PROFILE -Wall -Wunused -fopenmp -ffree-line-length-none -ffpe-trap=invalid,overflow,zero -L$HPCTOOLKIT/lib/hpctoolkit -lhpctoolkit" \
make -j1

19 changes: 19 additions & 0 deletions examples/gpmdk/docs/Makefile
@@ -0,0 +1,19 @@
# Makefile for the GPMD documentation

DIR=$(shell pwd)

all:
	(make doxy)

clean:
	rm -f *.backup

doxy:
	(doxygen ./source-doxygen/gpmd_doxyfile)
	(cd ./build-doxygen/latex ; make ; cp refman.pdf ../../gpmd-doxygen.pdf)

sphinx:
	(cp ../README.rst ./source-sphinx/README.rst)
	(cd ./source-sphinx/; make latex)
	(cd ./build-sphinx/latex/; make; cp gpmd.pdf ../../gpmd-sphinx.pdf)
