LAMMPS speed in CPU #753
Replies: 6 comments 3 replies
-
Hey,
-
Clearly the vacancy minimisation shown in the graph above went very wrong. You are using "average" E0s, which we don't recommend: it gives rise to unstable potentials. Please calculate the isolated-atom energies with your chosen DFT method (in a big box if you are using a plane-wave code) and add them to the training set. For the timing: are you comparing a code designed for GPU running on CPU with another GPU-oriented code (Allegro) running on GPU, or is Allegro also running on CPU? When comparing times, it is best to compare how long a single iteration takes. But in any case, let's fix the explosion first.
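Following the advice above, one way to provide isolated-atom reference energies is to append one single-atom frame per element to the training extxyz file, tagged `config_type=IsolatedAtom`, which MACE's data loader recognizes. A minimal sketch (the element symbols and energies below are placeholders; substitute values computed with your own DFT code, one atom in a large box):

```python
# Sketch: build isolated-atom reference frames in extended-XYZ format.
# The energies here are PLACEHOLDERS - replace them with isolated-atom
# energies from your own DFT settings (large box, same functional/cutoffs
# as the rest of the training set).
e0s = {"Al": -1.234, "O": -5.678}  # placeholder energies in eV

frames = []
for symbol, energy in e0s.items():
    frames.append(
        "1\n"  # one atom per frame
        'Lattice="20.0 0.0 0.0 0.0 20.0 0.0 0.0 0.0 20.0" '
        "Properties=species:S:1:pos:R:3 config_type=IsolatedAtom "
        f'energy={energy} pbc="T T T"\n'
        f"{symbol} 10.0 10.0 10.0\n"  # atom centred in the box
    )
xyz_block = "".join(frames)

# Append these frames to the existing training file:
with open("train.xyz", "a") as f:
    f.write(xyz_block)
```

With the isolated-atom frames present, training can then use these reference energies instead of the "average" E0s estimate.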
-
Also, it would be helpful if you tried running the foundation model (MACE-MP-0) on your system.
-
Also, your cutoff is very small (just 3 Å). You need a longer cutoff to capture second-neighbour interactions; something like 5 Å is typical.
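For illustration, the cutoff is set through the `r_max` option in the MACE training configuration; a sketch of the relevant fragment (all other options omitted, value per the suggestion above):

```yaml
# MACE training config fragment (illustrative)
r_max: 5.0   # cutoff radius in Angstrom; 3.0 is usually too short
```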
-
Hello, one observation: the current best strategy on CPU is to use a single MPI rank with threading and the no_domain_decomposition option. Note the relevant part at the bottom of the docs.
From your input, I worry you may be using
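A run along those lines might look like the following sketch (file names are placeholders, and the exact pair_style arguments may differ between versions of the LAMMPS–MACE patch):

```
# Launch: one MPI rank, OpenMP threads instead of domain decomposition
#   export OMP_NUM_THREADS=8
#   mpirun -np 1 lmp -in in.mace

# in.mace (fragment)
pair_style  mace no_domain_decomposition
pair_coeff  * * my_model-lammps.pt Al O
```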
-
Hello everyone, I tried with:
Observation 1: log_single_cpu_MACE_alumina_model_stagetwo_compiled.txt
Observation 2: Each epoch takes 1-2 hours on a V100 GPU (32 GB memory) for a dataset of 16,000 frames with 77-80 atoms per frame. I was wondering if this is a normal speed and whether I should increase the batch size to speed up training. Please give your suggestions on this. Thanks.
Model.log file:
MACE training config used:
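On the batch-size question, a quick sanity check is that optimizer steps per epoch equal frames divided by batch size, so a larger batch means fewer steps per epoch; each step also takes longer, however, so the wall-clock gain depends on GPU utilization. A small sketch (the batch sizes below are arbitrary examples, not recommendations):

```python
import math

frames = 16000  # dataset size quoted above
steps_per_epoch = {}
for batch_size in (5, 10, 32):  # example batch sizes
    # ceil because the last, partial batch still costs one step
    steps_per_epoch[batch_size] = math.ceil(frames / batch_size)
    print(f"batch_size={batch_size}: "
          f"{steps_per_epoch[batch_size]} optimizer steps/epoch")
```

Whether the larger batch actually fits depends on model size and the 32 GB of V100 memory, so it is worth increasing it gradually while watching memory use.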
-
Hello MACE developers,
I trained a model for LAMMPS on CPU. However, the LAMMPS run is too slow, so I am wondering if I am missing anything here. Thanks for the help.
Q1. The speed of minimization for a 160-atom simulation seems very slow to me. For the same dataset, the LAMMPS simulation required the following time:
Q2. The MACE-trained potential shows non-physical results for our case. For context, the training dataset includes vacancy information (16k frames in total; 12k of them contain a vacancy), and when running in LAMMPS with a vacancy it never reaches equilibrium. I tested this with other equivariant potentials (e.g. Allegro), and Allegro managed to reach equilibrium within 160 steps, as shown in the figure here.
5 by 5 by 5 cell with vacancy:
1 CPU without vacancy (160 atoms):
8 CPU without vacancy (160 atoms):
MACE training config used:
Converting GPU model (.model file of 6 MB) to CPU model (.model file of 6 MB)
Converting CPU model (.model file of 6 MB) to LAMMPS-compatible file (.pt file of 12 MB)
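For reference, the conversion to a LAMMPS-compatible .pt file is done with the script shipped in the MACE repository; a command sketch (the model file name is a placeholder, and the entry-point name and flags may differ between MACE versions):

```
# Convert a trained MACE model to the LAMMPS-compatible TorchScript form;
# produces my_model.model-lammps.pt next to the input file.
python mace/cli/create_lammps_model.py my_model.model
```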