
Using SCOOP for distributed parallel processing #111

Closed
45 commits
3a93b3d
Create generateThermoDataForListOfSpecies() method in CoreEdgeReactio…
rwest Nov 28, 2012
6988ab2
An attempt at using SCOOP to parallelize the thermo data estimation.
rwest Nov 29, 2012
5023117
Pickling and unpickle database for scoop.
rwest Apr 3, 2013
7b71119
Added scoop example input file and LSF submission file.
rwest Mar 27, 2013
7fca647
Added an rmgpy.utilities module with path_checksum() function.
rwest Apr 5, 2013
6800963
Cache the RMG Database in a pickle and a hash, to speed up reloading it.
rwest Apr 5, 2013
6c12a79
Use scoop.shared module to share database cache location and hash.
rwest Apr 5, 2013
211231c
Merged master into scoop, and resolved the conflict.
keceli Aug 14, 2013
507a875
First attempt to add QM capability with scoop.
keceli Aug 28, 2013
997e8d8
Some explanation on how to install and use scoop mainly on Pharos
keceli Aug 29, 2013
56ca670
rmgpy/qm/mopac.py: Wait for 1 sec for the buffer to write to disc. Te…
keceli Oct 4, 2013
5d060eb
gaussian.py: added sleep
keceli Oct 11, 2013
10b8b51
Merge remote-tracking branch 'upstream/master' into scoop2
keceli Oct 11, 2013
7c68b07
gaussian.py: required for gaussian to work on pharos
keceli Oct 11, 2013
14e2e17
Merge remote-tracking branch 'upstream/master'
keceli Oct 11, 2013
384db4e
attempt to run RMG without thermolibrary
keceli Oct 12, 2013
79e6713
Parallelizing thermoEstimator.py.
keceli Oct 22, 2013
13af2d9
PM6 and PM7 options are added for MOPAC. Usage software=MOPACPM6, etc.
keceli Oct 31, 2013
f3b6020
Revert "gaussian.py: required for gaussian to work on pharos"
keceli Nov 1, 2013
2231026
Trying to avoid sleep solution for AttributeError in cclib object
keceli Nov 1, 2013
9566913
resolving conflicts to merge with master
keceli Nov 5, 2013
f3e0bdf
Reverting a mistake I did during the merge with master
keceli Nov 5, 2013
f87faf9
Thermoestimator generates output even if the job does not finish
keceli Nov 8, 2013
458c5e7
Added profiling for thermoEstimator
keceli Nov 8, 2013
6ec8ae7
Added makeProfileGraph as a script to be able to modify profile graph…
keceli Nov 9, 2013
5dd2c17
Moved generateQMData to higher QMMolecule class along with minor modi…
keceli Nov 12, 2013
1901b8a
scoop jobs can now be run through SGE
keceli Nov 13, 2013
d3a2926
Added examples for SCOOP and some minor improvements
keceli Nov 14, 2013
e08ce78
Revert "attempt to run RMG without thermolibrary"
keceli Nov 15, 2013
f716b5b
Fixed a comment and revert an unnecessary change.
keceli Nov 15, 2013
dad02b4
Fixing some problems with makeProfileGraph
keceli Nov 16, 2013
4fadd44
Improved argument parsing and increased chunk size
keceli Nov 22, 2013
fcb5c20
Load only thermo libraries
keceli Nov 24, 2013
f9dea82
Allows RMG to continue when there is a duplicate species
keceli Nov 25, 2013
71d948d
Added memory usage information in thermoEstimator
keceli Nov 25, 2013
8e63e0f
Changed the scope of shared variable qmValue.
keceli Nov 25, 2013
7cc203c
Fixed the problem in naming the Thermo Library
keceli Nov 25, 2013
fa87afe
Improved argument parsing in thermoEstimator
keceli Dec 2, 2013
210ea35
Fixed the problem with linear molecules
keceli Dec 18, 2013
08f210f
Pulled master
keceli Dec 18, 2013
0c31eb0
Limit lone pair drawing to Nitrogen atoms only.
connie Dec 19, 2013
67d1cf6
Draw lone electron pairs only for nitrogen containing species
bbuesser Dec 19, 2013
5048034
Make transport writing compatible with Python 2.6
rwest Dec 20, 2013
b2f809f
Do not load solvation groups by default.
keceli Dec 21, 2013
af7214f
Merge pull request #1 from keceli/scoop
rwest Dec 21, 2013
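
Several commits above describe caching the loaded database in a pickle keyed by a checksum of the database directory. The following is a minimal sketch of that general idea, not the code in this pull request: `tree_checksum` and `load_cached` are hypothetical names, the actual `path_checksum()` in `rmgpy.utilities` may hash differently, and the cache file is assumed to live outside the directory being hashed.

```python
import hashlib
import os
import pickle


def tree_checksum(root):
    """Return a SHA-1 hex digest over the relative paths and contents of all files under root."""
    digest = hashlib.sha1()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes the walk order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode("utf-8"))
            with open(path, "rb") as handle:
                digest.update(handle.read())
    return digest.hexdigest()


def load_cached(root, cache_file, build):
    """Return build(root), reusing the pickle in cache_file while the tree is unchanged."""
    checksum = tree_checksum(root)
    if os.path.exists(cache_file):
        with open(cache_file, "rb") as handle:
            cached_checksum, value = pickle.load(handle)
        if cached_checksum == checksum:
            return value  # cache hit: skip the expensive rebuild
    value = build(root)
    with open(cache_file, "wb") as handle:
        pickle.dump((checksum, value), handle)
    return value
```

Storing the checksum next to the pickled object means a stale cache is detected on load rather than silently reused.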
21 changes: 21 additions & 0 deletions README2SCOOP.rst
@@ -0,0 +1,21 @@
******************************************************
SCOOP enabled RMG-Py
******************************************************

RMG-Py can be run in parallel (only for the thermochemical parameter
estimation part) using the SCOOP module.
More info on SCOOP: http://code.google.com/p/scoop/

Running RMG-Py in parallel:

python -m scoop.__main__ -n 8 $RMGpy/rmg.py input.py > RMG.stdout.log &

``-n 8`` starts 8 workers.
Set it to the number of processors available.
For job submission scripts, see ``examples/rmg/scoop``.

Installing SCOOP:

You need the development version of SCOOP (tagged with 0.7RC2).
Download link: http://scoop.googlecode.com/archive/0.7RC2.zip
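
The README says only the thermochemical estimation step is parallelized. A rough sketch of the map-style pattern this implies is shown here; `estimate_all` and `toy_estimate` are hypothetical names rather than RMG-Py functions, and the built-in `map` is the default so the snippet runs without SCOOP installed.

```python
def estimate_all(species, estimate, map_fn=map):
    """Apply estimate() to every species and return the results in input order."""
    return list(map_fn(estimate, species))


def toy_estimate(smiles):
    # Placeholder for an expensive, independent per-species calculation.
    return len(smiles)


if __name__ == "__main__":
    # Serial run with the built-in map.
    print(estimate_all(["COC=O", "[O][O]", "C"], toy_estimate))
```

When the script is launched with `python -m scoop`, passing `map_fn=futures.map` (after `from scoop import futures`) should distribute the calls across the workers, provided `estimate` and its arguments can be pickled.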

159 changes: 159 additions & 0 deletions examples/rmg/scoop/input.py
@@ -0,0 +1,159 @@
# Data sources
database(
thermoLibraries = ['primaryThermoLibrary','DFT_QCI_thermo','GRI-Mech3.0'],
reactionLibraries = [('Methylformate',False),('Glarborg/highP',False)],
seedMechanisms = ['Glarborg/C2'],
kineticsDepositories = ['training'],
kineticsFamilies = ['!Intra_Disproportionation'],
kineticsEstimator = 'rate rules',
)

# List of species
species(
label='Mfmt',
reactive=True,
structure=SMILES("COC=O"),
)
species(
label='O2',
reactive=True,
structure=SMILES("[O][O]"),
)
species(
label='C2H',
reactive=True,
structure=SMILES("C#[C]"),
)
species(
label='CH',
reactive=True,
structure=adjacencyList(
"""
1 C 3 {2,S}
2 H 0 {1,S}
"""),
)
species(
label='H2O',
reactive=True,
structure=SMILES("O"),
)
species(
label='H2',
reactive=True,
structure=SMILES("[H][H]"),
)
species(
label='CO',
reactive=True,
structure=SMILES("[C]=O"),
)
species(
label='CO2',
reactive=True,
structure=SMILES("C(=O)=O"),
)
species(
label='CH4',
reactive=True,
structure=SMILES("C"),
)
species(
label='CH3',
reactive=True,
structure=SMILES("[CH3]"),
)
species(
label='CH3OH',
reactive=True,
structure=SMILES("CO"),
)
species(
label='C2H4',
reactive=True,
structure=SMILES("C=C"),
)
species(
label='C2H2',
reactive=True,
structure=SMILES("C#C"),
)
species(
label='CH2O',
reactive=True,
structure=SMILES("C=O"),
)
species(
label='CH3CHO',
reactive=True,
structure=SMILES("CC=O"),
)


# Bath gas
species(
label='Ar',
reactive=False,
structure=InChI("InChI=1S/Ar"),
)

# Reaction systems
simpleReactor(
temperature=(650,'K'),
pressure=(1.0,'bar'),
initialMoleFractions={
"Mfmt": 0.01,
"O2": 0.02,
"Ar": 0.97,
},
terminationTime=(0.5,'s'),
)
simpleReactor(
temperature=(1350,'K'),
pressure=(3.0,'bar'),
initialMoleFractions={
"Mfmt": 0.01,
"O2": 0.02,
"Ar": 0.97,
},
terminationTime=(0.5,'s'),
)
simpleReactor(
temperature=(1950,'K'),
pressure=(10.0,'bar'),
initialMoleFractions={
"Mfmt": 0.01,
"O2": 0.02,
"Ar": 0.97,
},
terminationTime=(0.5,'s'),
)

simulator(
atol=1e-22,
rtol=1e-8,
)

model(
toleranceKeepInEdge=0.0,
toleranceMoveToCore=0.0005,
toleranceInterruptSimulation=1.0,
maximumEdgeSpecies=100000
)

pressureDependence(
method='modified strong collision', # 'reservoir state'
maximumGrainSize=(1.0,'kcal/mol'),
minimumNumberOfGrains=200,
temperatures=(290,3500,'K',8),
pressures=(0.02,100,'bar',5),
interpolation=('Chebyshev', 6, 4),
)

options(
units='si',
saveRestartPeriod=None,
drawMolecules=False,
generatePlots=False,
saveConcentrationProfiles=True,
)
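
Each `simpleReactor` block lists initial mole fractions, which are expected to sum to one. The following sketch shows a sanity check one could run on such a dictionary before submitting a long job; `check_mole_fractions` is a hypothetical helper, not part of the RMG-Py input API, and the tolerance is an arbitrary choice.

```python
import math


def check_mole_fractions(fractions, tolerance=1e-6):
    """Return the total of the mole fractions; raise ValueError if it is not 1 within tolerance."""
    total = math.fsum(fractions.values())
    if abs(total - 1.0) > tolerance:
        raise ValueError("mole fractions sum to %g, expected 1" % total)
    return total


if __name__ == "__main__":
    check_mole_fractions({"Mfmt": 0.01, "O2": 0.02, "Ar": 0.97})
```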
59 changes: 59 additions & 0 deletions examples/rmg/scoop/lsf.sh
@@ -0,0 +1,59 @@
#!/bin/sh
#BSUB -o RMG.out
#BSUB -J RMGPyScoop
#BSUB -n 8
#BSUB -e error_log
#BSUB -q medium_priority

# This is a job submission file for a LSF queuing system to run
# the SCOOP-enabled parallel version of RMG-Py across 8 CPUs on
# a number of different compute nodes on a (potentially heterogeneous) cluster.

source ~/.bash_profile

LAMHOST_FILE=hosts

# start a new host file from scratch
rm -f $LAMHOST_FILE
touch $LAMHOST_FILE
# echo "# LAMMPI host file created by LSF on `date`" >> $LAMHOST_FILE
# check if we were able to start writing the conf file
if [ -f $LAMHOST_FILE ]; then
    :
else
    echo "$0: can't create $LAMHOST_FILE"
    exit 1
fi
HOST=""
NUM_PROC=""
FLAG=""
TOTAL_CPUS=0
for TOKEN in $LSB_MCPU_HOSTS
do
    if [ -z "$FLAG" ]; then
        HOST="$TOKEN"
        FLAG="0"
    else
        NUM_PROC="$TOKEN"
        TOTAL_CPUS=`expr $TOTAL_CPUS + $NUM_PROC`
        FLAG="1"
    fi
    if [ "$FLAG" = "1" ]; then
        _x=0
        while [ $_x -lt $NUM_PROC ]
        do
            echo "$HOST" >>$LAMHOST_FILE
            _x=`expr $_x + 1`
        done
        # get ready for the next host
        FLAG=""
        HOST=""
        NUM_PROC=""
    fi
done
# last thing added to LAMHOST_FILE
#echo "# end of LAMHOST file" >> $LAMHOST_FILE
echo "Your lamboot hostfile looks like:"
cat $LAMHOST_FILE

python -m scoop -vv --hostfile $LAMHOST_FILE $RMGpy/rmg.py input.py > RMG.stdout.log
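
The loop in `lsf.sh` reads `LSB_MCPU_HOSTS` as alternating host names and CPU counts, and writes each host name once per CPU into the host file. A compact Python sketch of the same expansion follows; `expand_mcpu_hosts` is a hypothetical helper, shown only to make the token handling explicit.

```python
def expand_mcpu_hosts(value):
    """Turn 'hostA 2 hostB 1' into ['hostA', 'hostA', 'hostB']."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating host names and CPU counts")
    hosts = []
    # Pair each host name (even positions) with its CPU count (odd positions).
    for name, count in zip(tokens[0::2], tokens[1::2]):
        hosts.extend([name] * int(count))
    return hosts


if __name__ == "__main__":
    import os
    # One host name per line, as the shell script writes to its host file.
    print("\n".join(expand_mcpu_hosts(os.environ.get("LSB_MCPU_HOSTS", ""))))
```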
2 changes: 2 additions & 0 deletions examples/rmg/scoop/prolog.sh
@@ -0,0 +1,2 @@
#!/bin/bash
source ~/.bash_profile
25 changes: 25 additions & 0 deletions examples/rmg/scoop/sge.sh
@@ -0,0 +1,25 @@
#!/bin/bash
######################################################################
# This is a job submission file for a SGE queuing system to run
# the SCOOP-enabled parallel version of RMG-Py across 48 CPUs on
# a single node.
#
# Define RMGpy in your ~/.bash_profile as the path to the directory
# containing rmg.py.
# NSLOTS is an SGE env. variable for the total number of CPUs.
# prolog.sh is a script used by SCOOP to pass env. variables.
#
# You can run the jobs on different nodes as well, but it is not
# recommended since you might have problems with SGE job termination.
# Type `qconf -spl` to see available parallel environments and modify
# the last #$ line if you really want to run it on many nodes.
######################################################################
#$ -S /bin/bash
#$ -cwd
#$ -notify
#$ -o job.log -j y
#$ -N RMGscoop
#$ -l normal
#$ -l h_rt=09:05:00
#$ -pe singlenode 48
source ~/.bash_profile
python -m scoop.__main__ --tunnel --prolog $RMGpy/examples/rmg/scoop/prolog.sh -n $NSLOTS $RMGpy/rmg.py input.py > std.out
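
The last line of `sge.sh` assembles the SCOOP launch command from the `NSLOTS` and `RMGpy` environment variables. A small sketch of building the same argument list in Python is shown here; `scoop_command` is a hypothetical helper, and only flags that already appear in the scripts above (`-n`, `--tunnel`, `--prolog`) are used.

```python
import os


def scoop_command(rmg_dir, input_file, workers, prolog=None, tunnel=False):
    """Build the argument list for launching rmg.py under SCOOP."""
    cmd = ["python", "-m", "scoop"]
    if tunnel:
        cmd.append("--tunnel")
    if prolog is not None:
        cmd.extend(["--prolog", prolog])
    cmd.extend(["-n", str(workers)])
    cmd.extend([os.path.join(rmg_dir, "rmg.py"), input_file])
    return cmd


if __name__ == "__main__":
    # Fall back to one worker and the current directory if the variables are unset.
    workers = int(os.environ.get("NSLOTS", "1"))
    print(" ".join(scoop_command(os.environ.get("RMGpy", "."), "input.py", workers)))
```

Returning a list rather than a single string keeps the arguments separate, so the result can be passed to `subprocess` without shell quoting.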