
Commit

first commit 🎉
simonbatzner committed Mar 16, 2021
0 parents commit 6d378ea
Showing 64 changed files with 10,582 additions and 0 deletions.
159 changes: 159 additions & 0 deletions .gitignore
@@ -0,0 +1,159 @@
.idea/
.vscode/
*__pycache__*
*wandb*
*.npz
results/
experiments/
nequip.egg-info/
saved_models/
cov*.xml
.coverage
analysis/
*.icloud
benchmark_data/
log.*
saved_model/
*.ipynb_checkpoints/
tutorial/results/
tutorial/tutorial_data/
.DS_store/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2021 The President and Fellows of Harvard College

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
91 changes: 91 additions & 0 deletions README.md
@@ -0,0 +1,91 @@
# NequIP

NequIP is an open-source deep learning package for learning interatomic potentials using E(3)-equivariant convolutions.

### Requirements

* Python, v3.8
* PyTorch, v1.8
* NumPy, v1.19.5
* SciPy, v1.6.0
* ASE, v3.20.1

In particular, please be sure to install Python 3.8 and PyTorch 1.8.
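
For example, with pip (a sketch; on a GPU machine, choose the PyTorch wheel that matches your CUDA version as described in the official PyTorch install instructions):

```
pip install torch==1.8.0
```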

### Installation

* Install [PyTorch Geometric](https://github.com/rusty1s/pytorch_geometric); make sure to install it with the correct version of CUDA/CPU for your system, for example:
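
One possible set of commands (a sketch, assuming PyTorch 1.8 with CUDA 10.2; substitute the wheel index matching your PyTorch/CUDA versions, as given in the PyTorch Geometric installation instructions):

```
pip install torch-scatter torch-sparse torch-cluster torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.8.0+cu102.html
pip install torch-geometric
```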
* Install [e3nn](https://github.com/e3nn/e3nn); it is important to install the ```main``` branch, not ```master```:

```
pip install git+https://github.com/e3nn/e3nn.git
```

* We use [Weights&Biases](https://wandb.ai) to keep track of experiments. This is not a strict requirement; you can use our software without it, but it may make your life easier. If you want to use it, create an account [here](https://wandb.ai) and install it:

```
pip install wandb
```

* Install NequIP

```
pip install git+https://github.com/mir-group/nequip.git
```

### Installation Issues

We recommend running the tests using ```pytest```:

```
pip install pytest
pytest ./tests
```

On some platforms, the installation may complain about the scikit-learn installation. If that's the case, install the following scikit-learn version specifically:

```
pip install -U scikit-learn==0.23.0
```

That should fix it.

### Tutorial

The best way to learn how to use NequIP is through the tutorial notebook in ```tutorials```.

### Training a network

To train a network, all you need to do is run train.py with a config file that describes your data set and network, for example:

```
python scripts/train.py configs/example.yaml
```

### References

The theory behind NequIP is described in our preprint [1]. NequIP's backend builds on e3nn, a general framework for building E(3)-equivariant neural networks [2].

[1] https://arxiv.org/abs/2101.03164
[2] https://github.com/e3nn/e3nn

### Authors

NequIP is being developed by:

- Simon Batzner
- Anders Johansson
- Albert Musaelian
- Lixin Sun
- Mario Geiger
- Tess Smidt

under the guidance of Boris Kozinsky at Harvard.


### Citing

If you use this repository in your work, please consider citing us with the following pre-print:

[1] https://arxiv.org/abs/2101.03164

124 changes: 124 additions & 0 deletions configs/example.yaml
@@ -0,0 +1,124 @@
# general

# Two folders will be used during the training: 'root'/process and 'root'/'project'
# project contains logfiles and saved models
# process contains processed data sets
# if 'root'/'project' exists, 'root'/'project'_'year'-'month'-'day'-'hour'-'min'-'s' will be used instead.
root: results/aspirin
project: maximal
seed: 0 # random number seed for numpy and torch
restart: false # set True for a restarted run
append: false # set True if a restarted run should append to the previous log file

# network
compile_model: False # whether to compile the constructed model to TorchScript
num_basis: 8 # number of basis functions
r_max: 4.0 # cutoff radius
irreps_edge_sh: 0e + 1o + 2e # irreps of the spherical harmonics used for edges. If a single integer, indicates the full SH up to L_max=that_integer
conv_to_output_hidden_irreps_out: 16x0e # irreps used in hidden layer of output block
feature_irreps_hidden: 16x0o + 16x0e + 16x1o + 16x1e + 16x2o + 16x2e # irreps used for hidden features, here we go up to lmax=2, with even and odd parities
BesselBasis_trainable: true # set true to train the bessel weights
nonlinearity_type: gate # may be 'gate' or 'norm', 'gate' is recommended
num_layers: 6 # number of interaction blocks, we found 5-6 to work best
resnet: false # set True to make interaction block a resnet-style update
PolynomialCutoff_p: 6 # p-value used in polynomial cutoff function
convolution_kwargs:
  invariant_layers: 1 # number of radial layers, we found it important to keep this small, 1 or 2
  invariant_neurons: 8 # number of hidden neurons in radial function, again keep this small for MD applications, 8 - 32, smaller is faster
  avg_num_neighbors: null # number of neighbors to divide by, None => no normalization.
  use_sc: true # use self-connection or not, usually gives big improvement

# data set
# the keys used need to be stated at least once in key_mapping, npz_fixed_field_keys or npz_keys
# key_mapping is used to map the key in the npz file to the NequIP default values (see data/_key.py)
# all arrays are expected to have the shape of (nframe, natom, ?) except the fixed fields
dataset: npz # type of data set, can be npz or ase
dataset_file_name: ./benchmark_data/aspirin_dft.npz # path to data set file
key_mapping:
  z: atomic_numbers # atomic species, integers
  E: total_energy # total potential energies to train to
  F: forces # atomic forces to train to
  R: pos # raw atomic positions
npz_fixed_field_keys: # fields that are repeated across different examples
  - atomic_numbers
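
# For illustration, a data set in this npz layout could be written with numpy roughly as follows
# (a sketch; the array names below are hypothetical):
#   import numpy as np
#   np.savez(
#       "my_data.npz",
#       R=positions,        # (nframe, natom, 3) atomic positions
#       F=forces,           # (nframe, natom, 3) atomic forces
#       E=energies,         # per-frame total potential energies
#       z=atomic_numbers,   # (natom,) fixed field, shared across all frames
#   )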

# As an alternative option to npz, you can also pass data as ASE Atoms objects
# dataset: ase
# dataset_file_name: xxx.xyz # needs to be a format accepted by ase.io.read
# ase_args: # any arguments needed by ase.io.read
# format: extxyz

# logging
wandb: true # we recommend using wandb for logging
verbose: info # the same as python logging, e.g. warning, info, debug, error. case insensitive
log_batch_freq: 1 # batch frequency, how often to print training errors within the same epoch
log_epoch_freq: 1 # epoch frequency, how often to print and save the model

# training
n_train: 975 # number of training data
n_val: 25 # number of validation data
learning_rate: 0.01 # learning rate, we found 0.01 to work best - this is often one of the most important hyperparameters to tune
batch_size: 5 # batch size
max_epochs: 1000000 # stop training after _ number of epochs
train_val_split: random # can be random or sequential; if sequential, the first n_train elements are used for training and the next n_val for validation, otherwise the split is random
shuffle: true # If true, the data loader will shuffle the data
metrics_key: loss # metric used for scheduling and saving the best model. Options: loss, or anything that appears in the
# validation batch step header, such as f_mae, f_rmse, e_mae, e_rmse

# loss function
loss_coeffs: # different weights to use in a weighted loss function
  forces: 1.0 # for MD applications, we recommend a force weight of 1
  total_energy: 0.0 # and an energy weight of 0., this usually gives the best errors in the forces

# default loss function is MSELoss, the name has to be exactly the same as in torch.nn
# the only supported targets are forces and total_energy

# here are some examples of other ways to declare different types of loss functions, depending on your application:
# loss_coeffs:
# total_energy: MSELoss
#
# loss_coeffs:
# total_energy:
# - 3.0
# - MSELoss
#
# loss_coeffs:
# forces:
# - 1.0
# - PerSpeciesL1Loss
#
# loss_coeffs: total_energy
#
# loss_coeffs:
# total_energy:
# - 3.0
# - L1Loss
# forces: 1.0

# optional keys
# if true and weights_forces defined in the dataset, the loss function will be weighted
atomic_weight_on: false

# optimizer, may be any optimizer defined in torch.optim
# the name `optimizer_name` is case sensitive
optimizer_name: Adam # default optimizer is Adam in the amsgrad mode
optimizer_params: # any params taken by the torch.optim.xxx constructor
  amsgrad: true
  betas: !!python/tuple
    - 0.9
    - 0.999
  eps: 1.0e-08
  weight_decay: 0

# lr scheduler, currently only supports the two options listed below; if you need more, please file an issue
# first option: cosine annealing with warm restarts
lr_scheduler_name: CosineAnnealingWarmRestarts
lr_scheduler_params:
  T_0: 10000
  T_mult: 2
  eta_min: 0
  last_epoch: -1

# alternative option, on-plateau
# lr_scheduler_name: ReduceLROnPlateau
# lr_patience: 1000