diff --git a/docs/api/callbacks.rst b/docs/api/callbacks.rst
new file mode 100644
index 00000000..01fb9739
--- /dev/null
+++ b/docs/api/callbacks.rst
@@ -0,0 +1,17 @@
+nequip.train.callbacks
+######################
+
+ .. autoclass:: nequip.train.callbacks.NeMoExponentialMovingAverage
+    :members:
+
+ .. autoclass:: nequip.train.callbacks.SoftAdapt
+    :members:
+
+ .. autoclass:: nequip.train.callbacks.LossCoefficientScheduler
+    :members:
+
+ .. autoclass:: nequip.train.callbacks.LossCoefficientMonitor
+    :members:
+
+ .. autoclass:: nequip.train.callbacks.TestTimeXYZFileWriter
+    :members:
diff --git a/docs/api/data.rst b/docs/api/data.rst
index 2eae9c20..0ce11c53 100644
--- a/docs/api/data.rst
+++ b/docs/api/data.rst
@@ -1,72 +1,12 @@
-nequip.data (Fields, Modifiers, and Statistics)
-===============================================
-
-Data Fields
-###########
-
-The NequIP infrastructure provides some ready-to-use data fields, such as ``total_energy``, ``forces``, ``stress``, etc. These are the names that should be referred to when using methods and classes in the NequIP package, such as the fields given to ``nequip.train.MetricsManager``. The data fields are broadly categorized (in a mutually exclusive manner) as graph (per-frame), node (per-atom), or edge fields (per-"bond").
-
-.. autoclass:: nequip.data._GRAPH_FIELDS
-.. autoclass:: nequip.data._NODE_FIELDS
-.. autoclass:: nequip.data._EDGE_FIELDS
-
-There are additional categories used for the internal data processing in the NequIP infrastructure.
-
-.. autoclass:: nequip.data._LONG_FIELDS
-
-.. autoclass:: nequip.data._CARTESIAN_TENSOR_FIELDS
-
-Custom fields must be registered with the following field registration methods to be compatible with the internal logic of NequIP's data processing infrastructure.
-
-.. autofunction:: nequip.data.register_fields
-
-.. autofunction:: nequip.data.deregister_fields
-
-
-
-Data Modifiers
-##############
-
-One can use modifiers to convert raw quantities from an ``AtomicDataDict`` into a form that is desired in the ``MetricsManager`` and ``DataStatisticsManager``. For example,
-
- - to extract a ``total_energy`` and make it a per-atom ``total_energy`` (``nequip.data.PerAtomModifier``),
- - to extract and convert position and neighborlist information into edge lengths (``nequip.data.EdgeLengths``), or
- - to extract and convert neighborlist information into the number of neighbors around each atom (``nequip.data.NumNeighbors``).
-
-
- .. autoclass:: nequip.data.PerAtomModifier
-    :members:
-
- .. autoclass:: nequip.data.EdgeLengths
-    :members:
-
- .. autoclass:: nequip.data.NumNeighbors
-    :members:
-
-
-Dataset Statistics
-##################
-
- .. autoclass:: nequip.data.DataStatisticsManager
-    :members:
-
- .. autoclass:: nequip.data.Mean
-    :members:
-
- .. autoclass:: nequip.data.MeanAbsolute
-    :members:
-
- .. autoclass:: nequip.data.RootMeanSquare
-    :members:
-
- .. autoclass:: nequip.data.StandardDeviation
-    :members:
-
- .. autoclass:: nequip.data.Min
-    :members:
-
- .. autoclass:: nequip.data.Max
-    :members:
-
- .. autoclass:: nequip.data.Count
-    :members:
\ No newline at end of file
+nequip.data
+===========
+
+ .. toctree::
+    :maxdepth: 1
+
+    data_fields
+    datamodule
+    dataset
+    data_transforms
+    data_modifiers
+    data_stats
diff --git a/docs/api/data_fields.rst b/docs/api/data_fields.rst
new file mode 100644
index 00000000..f420b2c4
--- /dev/null
+++ b/docs/api/data_fields.rst
@@ -0,0 +1,20 @@
+Data Fields
+###########
+
+The NequIP infrastructure provides some ready-to-use data fields, such as ``total_energy``, ``forces``, ``stress``, etc. These are the names that should be referred to when using methods and classes in the NequIP package, such as the fields given to ``nequip.data.DataStatisticsManager`` or ``nequip.train.MetricsManager``. The data fields are broadly categorized (in a mutually exclusive manner) as graph (per-frame), node (per-atom), or edge fields (per-"bond").
+
+.. autodata:: nequip.data._GRAPH_FIELDS
+.. autodata:: nequip.data._NODE_FIELDS
+.. autodata:: nequip.data._EDGE_FIELDS
+
+There are additional categories used for the internal data processing in the NequIP infrastructure.
+
+.. autodata:: nequip.data._LONG_FIELDS
+
+.. autodata:: nequip.data._CARTESIAN_TENSOR_FIELDS
+
+Custom fields must be registered with the following field registration methods to be compatible with the internal logic of NequIP's data processing infrastructure.
+
+.. autofunction:: nequip.data.register_fields
+
+.. autofunction:: nequip.data.deregister_fields
diff --git a/docs/api/data_modifiers.rst b/docs/api/data_modifiers.rst
new file mode 100644
index 00000000..3bcf4b5f
--- /dev/null
+++ b/docs/api/data_modifiers.rst
@@ -0,0 +1,18 @@
+Data Modifiers
+##############
+
+One can use modifiers to convert raw quantities from an ``AtomicDataDict`` into a form that is desired in the ``MetricsManager`` and ``DataStatisticsManager``. For example,
+
+ - to extract a ``total_energy`` and make it a per-atom ``total_energy`` (``nequip.data.PerAtomModifier``),
+ - to extract and convert position and neighborlist information into edge lengths (``nequip.data.EdgeLengths``), or
+ - to extract and convert neighborlist information into the number of neighbors around each atom (``nequip.data.NumNeighbors``).
+
+
+ .. autoclass:: nequip.data.PerAtomModifier
+    :members:
+
+ .. autoclass:: nequip.data.EdgeLengths
+    :members:
+
+ .. autoclass:: nequip.data.NumNeighbors
+    :members:
diff --git a/docs/api/data_stats.rst b/docs/api/data_stats.rst
new file mode 100644
index 00000000..2af528fb
--- /dev/null
+++ b/docs/api/data_stats.rst
@@ -0,0 +1,26 @@
+Dataset Statistics
+##################
+
+ .. autoclass:: nequip.data.DataStatisticsManager
+    :members:
+
+ .. autoclass:: nequip.data.Mean
+    :members:
+
+ .. autoclass:: nequip.data.MeanAbsolute
+    :members:
+
+ .. autoclass:: nequip.data.RootMeanSquare
+    :members:
+
+ .. autoclass:: nequip.data.StandardDeviation
+    :members:
+
+ .. autoclass:: nequip.data.Min
+    :members:
+
+ .. autoclass:: nequip.data.Max
+    :members:
+
+ .. autoclass:: nequip.data.Count
+    :members:
\ No newline at end of file
diff --git a/docs/api/data_transforms.rst b/docs/api/data_transforms.rst
new file mode 100644
index 00000000..150db8b0
--- /dev/null
+++ b/docs/api/data_transforms.rst
@@ -0,0 +1,13 @@
+nequip.data.transforms
+######################
+
+Data transforms convert the raw data from the ``Dataset`` to include information necessary for the model to make predictions and perform training. For example, datasets do not usually come with neighborlists, so the ``NeighborListTransform`` is required to convert raw data that only contains positions and energy (and force) labels to additionally include a neighborlist necessary for the model to make predictions.
+
+ .. autoclass:: nequip.data.transforms.ChemicalSpeciesToAtomTypeMapper
+    :members:
+
+ .. autoclass:: nequip.data.transforms.NeighborListTransform
+    :members:
+
+ .. autoclass:: nequip.data.transforms.VirialToStressTransform
+    :members:
diff --git a/docs/api/datamodule.rst b/docs/api/datamodule.rst
index 6a23ad4f..69422cee 100644
--- a/docs/api/datamodule.rst
+++ b/docs/api/datamodule.rst
@@ -1,8 +1,5 @@
-nequip.data (DataModules, Datasets, and Transforms)
-===================================================
-
-Data Modules
-############
+nequip.data.datamodule
+######################
 
 ``nequip`` provides a general base ``DataModule`` class, ``NequIPDataModule``,
 
@@ -30,39 +27,3 @@ All data modules should (and would) share the following features
 
  .. autoclass:: nequip.data.datamodule.NequIP3BPADataModule
     :members:
-
-
-Datasets
-########
-
- .. autoclass:: nequip.data.dataset.AtomicDataset
-    :members:
-
- .. autoclass:: nequip.data.dataset.ASEDataset
-    :members:
-
- .. autoclass:: nequip.data.dataset.HDF5Dataset
-    :members:
-
- .. autoclass:: nequip.data.dataset.EMTTestDataset
-    :members:
-
- .. autoclass:: nequip.data.dataset.SubsetByRandomSlice
-    :members:
-
- .. autofunction:: nequip.data.dataset.RandomSplitAndIndexDataset
-
-
-Transforms
-##########
-
-Data transforms convert the raw data from the ``Dataset`` to include information necessary for the model to make predictions and perform training. For example, datasets do not usually come with neighborlists, so the ``NeighborListTransform`` is required to convert raw data that only contains positions and energy (and force) labels to additionally include a neighborlist necessary for the model to make predictions.
-
- .. autoclass:: nequip.data.transforms.ChemicalSpeciesToAtomTypeMapper
-    :members:
-
- .. autoclass:: nequip.data.transforms.NeighborListTransform
-    :members:
-
- .. autoclass:: nequip.data.transforms.VirialToStressTransform
-    :members:
diff --git a/docs/api/dataset.rst b/docs/api/dataset.rst
new file mode 100644
index 00000000..fb34a1f0
--- /dev/null
+++ b/docs/api/dataset.rst
@@ -0,0 +1,19 @@
+nequip.data.dataset
+###################
+
+ .. autoclass:: nequip.data.dataset.AtomicDataset
+    :members:
+
+ .. autoclass:: nequip.data.dataset.ASEDataset
+    :members:
+
+ .. autoclass:: nequip.data.dataset.HDF5Dataset
+    :members:
+
+ .. autoclass:: nequip.data.dataset.EMTTestDataset
+    :members:
+
+ .. autoclass:: nequip.data.dataset.SubsetByRandomSlice
+    :members:
+
+ .. autofunction:: nequip.data.dataset.RandomSplitAndIndexDataset
diff --git a/docs/api/lightning_module.rst b/docs/api/lightning_module.rst
new file mode 100644
index 00000000..288a1475
--- /dev/null
+++ b/docs/api/lightning_module.rst
@@ -0,0 +1,5 @@
+NequIPLightningModule
+#####################
+
+ .. autoclass:: nequip.train.NequIPLightningModule
+    :members:
\ No newline at end of file
diff --git a/docs/api/metrics.rst b/docs/api/metrics.rst
new file mode 100644
index 00000000..98a2c399
--- /dev/null
+++ b/docs/api/metrics.rst
@@ -0,0 +1,18 @@
+MetricsManager
+##############
+
+ .. autoclass:: nequip.train.MetricsManager
+    :members:
+
+
+Error Metrics
+#############
+
+ .. autoclass:: nequip.train.MeanSquaredError
+    :members:
+
+ .. autoclass:: nequip.train.RootMeanSquaredError
+    :members:
+
+ .. autoclass:: nequip.train.MeanAbsoluteError
+    :members:
diff --git a/docs/api/nequip.rst b/docs/api/nequip.rst
index 8e5d0739..457cfef6 100644
--- a/docs/api/nequip.rst
+++ b/docs/api/nequip.rst
@@ -4,5 +4,4 @@ Python API
  .. toctree::
 
     data
-    datamodule
     train
diff --git a/docs/api/train.rst b/docs/api/train.rst
index 0156a220..231b9ce6 100644
--- a/docs/api/train.rst
+++ b/docs/api/train.rst
@@ -1,46 +1,9 @@
 nequip.train
 ============
 
-NequIPLightningModule
-#####################
+ .. toctree::
+    :maxdepth: 1
 
- .. autoclass:: nequip.train.NequIPLightningModule
-    :members:
-
-
-MetricsManager
-##############
-
- .. autoclass:: nequip.train.MetricsManager
-    :members:
-
-
-Metrics
-#######
-
- .. autoclass:: nequip.train.MeanSquaredError
-    :members:
-
- .. autoclass:: nequip.train.RootMeanSquaredError
-    :members:
-
- .. autoclass:: nequip.train.MeanAbsoluteError
-    :members:
-
-Callbacks
-#########
-
- .. autoclass:: nequip.train.callbacks.NeMoExponentialMovingAverage
-    :members:
-
- .. autoclass:: nequip.train.callbacks.SoftAdapt
-    :members:
-
- .. autoclass:: nequip.train.callbacks.LossCoefficientScheduler
-    :members:
-
- .. autoclass:: nequip.train.callbacks.LossCoefficientMonitor
-    :members:
-
- .. autoclass:: nequip.train.callbacks.TestTimeXYZFileWriter
-    :members:
+    lightning_module
+    metrics
+    callbacks
diff --git a/docs/conf.py b/docs/conf.py
index e600b2df..3474d053 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -64,5 +64,12 @@
 }
 
 
+def process_docstring(app, what, name, obj, options, lines):
+    """For pretty printing sets and dictionaries of data fields."""
+    if isinstance(obj, set) or isinstance(obj, dict):
+        lines.clear()  # Clear existing lines to prevent repetition
+
+
 def setup(app):
     app.add_css_file("custom.css")
+    app.connect("autodoc-process-docstring", process_docstring)
diff --git a/docs/guide/config.md b/docs/guide/config.md
index 13eb4d19..92419abb 100644
--- a/docs/guide/config.md
+++ b/docs/guide/config.md
@@ -19,18 +19,24 @@ run: [val, test, train, val, test]
 
 ## `data`
 
-TODO: explain `DataModule`s and link to docs
+`data` is the `DataModule` object to be used. Users are directed to the [API page](../api/datamodule.rst) of `nequip.data.datamodule` for the `nequip` supported `DataModule` classes. Custom datamodules that subclass from `nequip.data.datamodule.NequIPDataModule` can also be used.
+
 
 ## `trainer`
 
 The `trainer` is meant to instantiate a `lightning.Trainer` object. To understand how to configure it, users are directed to `lightning.Trainer`'s [page](https://lightning.ai/docs/pytorch/stable/common/trainer.html). The sections on trainer [flags](https://lightning.ai/docs/pytorch/stable/common/trainer.html#trainer-flags) and its [API](https://lightning.ai/docs/pytorch/stable/common/trainer.html#trainer-class-api) are especially important.
 
-> **_NOTE:_** it is in the `lightning.Trainer` that users can specify [`callbacks`](https://lightning.ai/docs/pytorch/stable/api_references.html#callbacks) used to influence the course of training. This includes the very important [`ModelCheckpoint`](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.ModelCheckpoint.html#lightning.pytorch.callbacks.ModelCheckpoint) callback that should be configured to save checkpoint files in the way the user so pleases. `nequip`'s own [callbacks](../api/train.rst) can also be used here.
+> **_NOTE:_** it is in the `lightning.Trainer` that users can specify [`callbacks`](https://lightning.ai/docs/pytorch/stable/api_references.html#callbacks) used to influence the course of training. This includes the very important [`ModelCheckpoint`](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.ModelCheckpoint.html#lightning.pytorch.callbacks.ModelCheckpoint) callback that should be configured to save checkpoint files in the way the user so pleases. `nequip`'s own [callbacks](../api/callbacks.rst) can also be used here.
 
 ## `training_module`
 
-TODO: explain `NequIPLightningModule` and `MetricsManager`
+`training_module` defines the `NequIPLightningModule` (or its subclasses). Users are directed to its [API page](../api/lightning_module.rst) to learn how to configure it. It is here that the following parameters are defined
+ - the `model`
+ - the `loss` and `metrics`
+ - the `optimizer` and `lr_scheduler`
 
 ## `global_options`
 
-TODO
\ No newline at end of file
+For now, `global_options` is used to specify
+ - `seed`, the global seed (in addition to the data seed and model seed)
+ - `allow_tf32`, which controls whether TensorFloat-32 is used
\ No newline at end of file
diff --git a/docs/guide/workflow.md b/docs/guide/workflow.md
index f1a33b79..0779c841 100644
--- a/docs/guide/workflow.md
+++ b/docs/guide/workflow.md
@@ -18,7 +18,7 @@ Note that the flags `-cp` and `-cn` refer to the "config path" and "config name"
 Under the hood, the [Hydra](https://hydra.cc/) config utilities and the [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/) framework are used to facilitate training and testing in the NequIP infrastructure. One can think of the config as consisting of a set of classes to be instantiated with user-given parameters to construct objects required for training and testing to be performed. Hence, the API of these classes form the central source of truth in terms of what configurable parameters there are. These classes could come from
   - `torch` in the case of [optimizers and learning rate scheduler](https://pytorch.org/docs/stable/optim.html), or
   - `Lightning` such as Lightning's [trainer](https://lightning.ai/docs/pytorch/stable/common/trainer.html) or Lightning's native [callbacks](https://lightning.ai/docs/pytorch/stable/api_references.html#callbacks), or
-  - `nequip` itself such as the various [DataModules](../api/datamodule.rst), custom [callbacks](../api/train.rst), etc
+  - `nequip` itself such as the various [DataModules](../api/datamodule.rst), custom [callbacks](../api/callbacks.rst), etc
 
 Users are advised to look at `configs/tutorial.yaml` to understand how the config file is structured, and then to look up what each of the classes do and what parameters they can take (be they on `torch`, `Lightning` or `nequip`'s docs). The documentation for `nequip` native classes can be found under [Python API](../api/nequip.rst).
 
@@ -44,7 +44,7 @@ There are two main ways users can use `test`.
 nequip-train -cp full/path/to/config/directory -cn config_name.yaml ++ckpt_path='path/to/ckpt_file'
 ```
 
-One can use `nequip.train.callbacks.TestTimeXYZFileWriter` ([see API](../api/train.rst)) as a callback to have `.xyz` files written with the predictions of the model on the test dataset(s). (This is the replacement for the role `nequip-evaluate` served before `nequip` version `0.7.0`)
+One can use `nequip.train.callbacks.TestTimeXYZFileWriter` ([see API](../api/callbacks.rst)) as a callback to have `.xyz` files written with the predictions of the model on the test dataset(s). (This is the replacement for the role `nequip-evaluate` served before `nequip` version `0.7.0`)
 
 ## Deploying
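
Note on the `docs/conf.py` hunk in this diff: the `process_docstring` hook can be exercised without running Sphinx, since an `autodoc-process-docstring` handler is a plain function that mutates `lines` in place. The sketch below copies the hook's logic (with the two `isinstance` calls merged into one, which should behave identically) and calls it by hand. The names `fake_graph_fields`, `fake._GRAPH_FIELDS`, and `fake.register_fields`, as well as the sample docstring lines, are hypothetical stand-ins and not taken from `nequip`; the idea that the cleared text is the generic builtin `set`/`dict` docstring is an assumption about why the hook was added.

```python
def process_docstring(app, what, name, obj, options, lines):
    """Same logic as the hook added to docs/conf.py in this diff."""
    # Equivalent to: isinstance(obj, set) or isinstance(obj, dict)
    if isinstance(obj, (set, dict)):
        lines.clear()  # drop the docstring lines autodoc collected


# Hypothetical stand-in for a field registry such as nequip.data._GRAPH_FIELDS.
fake_graph_fields = {"total_energy", "stress"}

# Sample lines standing in for whatever autodoc collected (assumed content).
set_lines = ["set() -> new empty set object", ""]
process_docstring(None, "data", "fake._GRAPH_FIELDS", fake_graph_fields, None, set_lines)

# A non-set, non-dict object (here the builtin len) is left untouched.
func_lines = ["Register custom fields."]
process_docstring(None, "function", "fake.register_fields", len, None, func_lines)

print(set_lines)
print(func_lines)
```

Because the check is on the object's type rather than its name, any `set` or `dict` documented anywhere in the project via autodoc would have its docstring cleared too, which may or may not be intended.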