feat: Add finetune method for MatterSim #68
+371
−21
- Logger Enhancements
- Installation Update: Added `wandb` to the installation requirements.
- Dataset Construction: Conversion of `vasprun.xml` to `.xyz` files.

Enhancements in `potential.py`:

- `src/mattersim/forcefield/potential.py`: Integrated a new logger utility, added early stopping based on the best-metric epoch, and updated the training process to support distributed training.

New Scripts:

- `script/finetune_mattersim.py`: Added a new script for fine-tuning the MatterSim model with support for distributed training and logging.
- `script/vasprun_to_xyz.py`: Added a new script to convert VASP output files to XYZ format, including splitting data into training, validation, and test sets.

I tested the fine-tuning method on my custom H-structure dataset. The process involved generating `.xyz` files from `vasprun.xml` and then fine-tuning the model using the following command:

I trained it on 4× V100 GPUs, and training took around 1 hour. The model met my expectations, and I am happy with the results.
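For reviewers who want a quick picture of the early stopping based on the best-metric epoch, here is a minimal sketch of the idea. It is not the code in `src/mattersim/forcefield/potential.py`: the `EarlyStopper` class, its `patience` and `min_delta` parameters, and the sample validation losses are all hypothetical names and values chosen for illustration.

```python
class EarlyStopper:
    """Hypothetical helper: stop when the metric has not improved for `patience` epochs."""

    def __init__(self, patience: int = 10, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best_metric = float("inf")  # lower is better (e.g. validation loss)
        self.best_epoch = -1

    def step(self, epoch: int, metric: float) -> bool:
        """Record this epoch's metric and return True if training should stop."""
        if metric < self.best_metric - self.min_delta:
            # New best: remember both the value and the epoch it occurred in.
            self.best_metric = metric
            self.best_epoch = epoch
        # Stop once `patience` epochs have passed since the best epoch.
        return epoch - self.best_epoch >= self.patience


# Illustrative validation losses, not measured results.
val_losses = [1.0, 0.8, 0.9, 0.85, 0.95, 0.7]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break
```

Tracking the best epoch explicitly, rather than only a counter of bad epochs, also makes it easy to report or restore the checkpoint from that epoch once training ends.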
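The training/validation/test split mentioned for `script/vasprun_to_xyz.py` can be sketched roughly as follows. This is an assumption about the general approach, not the script's actual code: `split_dataset`, the 80/10/10 default fractions, and the fixed seed are hypothetical, and the list of integers stands in for structures parsed from `vasprun.xml`.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    items: Sequence[T],
    train_frac: float = 0.8,
    val_frac: float = 0.1,
    seed: int = 42,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle `items` reproducibly and split them into train/val/test lists."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, so global state is untouched
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# Placeholder "structures": in practice these would be frames read from vasprun.xml.
frames = list(range(10))
train, val, test = split_dataset(frames)
```

Each of the three lists would then be written to its own `.xyz` file. Fixing the seed keeps the split identical across runs, which matters when comparing fine-tuning results.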