Finetuna crashes after the first DFT calculation #46
Comments
@bjkreitz I think this issue is related to vasp-interactive. It could be that vasp-interactive isn't compatible with your local VASP build, so the parsing stopped. Another possibility is that it is related to multi-node MPI. In vasp-interactive we're using Could you simply test if able to use Also are the OUTCAR, vasprun.xml and vasp.out files somehow truncated in your setup?
If so, let's raise the issue in https://github.com/ulissigroup/vasp-interactive instead. You're likely to overcome the issue by switching
When I run the relaxation with just vasp-interactive and identical settings, it seems to work. A few optimization steps were performed without a crash.
@bjkreitz Thanks for testing. So it seems your VASP build is compatible, and the issue might be related to MPI pausing on multiple nodes. Could you try a simple test like this on your setup (just atoms + VaspInteractive, no finetuna involved)?
import time
from vasp_interactive import VaspInteractive

# `atoms` and `params` are your existing structure and VASP settings
atoms.calc = VaspInteractive(**params)
atoms.get_potential_energy()
# Potentially not working on multiple nodes?
with atoms.calc.pause():
    time.sleep(5)
# Just simulate a second step
atoms.rattle(0.01)
atoms.get_potential_energy()
atoms.calc.finalize()
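For readers unfamiliar with what "pausing" a running process can mean, here is a minimal, Linux-only sketch that stops and resumes a child process with POSIX signals. It assumes pausing is done by sending stop/continue signals to the launcher process; that is an assumption for illustration, not something confirmed in this thread. The helpers `process_state`, `paused`, `wait_for_state` and `demo` are hypothetical names, and the sleeping Python child merely stands in for a launcher such as mpirun.

```python
import contextlib
import os
import signal
import subprocess
import sys
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        # Format is "pid (comm) state ..."; split on the last ')' because
        # the command name itself may contain spaces or parentheses.
        return f.read().rsplit(")", 1)[1].split()[0]


@contextlib.contextmanager
def paused(pid):
    """Stop the process on entry and let it continue on exit."""
    os.kill(pid, signal.SIGSTOP)
    try:
        yield
    finally:
        os.kill(pid, signal.SIGCONT)


def wait_for_state(pid, wanted, timeout=5.0):
    """Poll until the process state is one of the letters in `wanted`."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if process_state(pid) in wanted:
            return True
        time.sleep(0.05)
    return False


def demo():
    """Pause and resume a stand-in child; return (stopped, resumed)."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        with paused(child.pid):
            stopped = wait_for_state(child.pid, "T")  # T = stopped
        resumed = wait_for_state(child.pid, "RSD")  # running or sleeping
        return stopped, resumed
    finally:
        child.kill()
        child.wait()
```

Calling `demo()` on a single machine shows whether stop/continue signals behave as expected locally. It says nothing about processes running on other nodes, which the signal sent here never reaches.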
Yes, when I try your example it fails after computing the first potential energy. This fails on multiple nodes but also on a single node with 16 CPUs due to an MPI issue. Traceback (most recent call last): During handling of the above exception, another exception occurred: Traceback (most recent call last):
@bjkreitz Thanks for the test! Yes it seems the way Meanwhile if you're ok with testing finetuna, simply switching
Yes switching
Issue
I tried to run FINETUNA with VASP 6.3.0 to relax H*CO on Pt(111) using the provided ASE example template (no. 1). Ten steps are performed with the MLP, and then a DFT calculation is triggered. However, after the DFT calculation converges, the software crashes and reports the following error message:
Trying to close the VASP stream but encountered error:
process PID not found (pid=181196)
Will now force closing the VASP process. The OUTCAR and vasprun.xml outputs may be incomplete
Force below threshold: check with parent
OnlineLearner: Parent calculation required
Traceback (most recent call last):
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 442, in wrapper
ret = self._cache[fun]
AttributeError: _cache
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1642, in wrapper
return fun(self, *args, **kwargs)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 445, in wrapper
return fun(self)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1684, in _parse_stat_file
data = bcat("%s/%s/stat" % (self._procfs_path, self.pid))
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 775, in bcat
return cat(fname, fallback=fallback, _open=open_binary)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 763, in cat
with _open(fname) as f:
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 727, in open_binary
return open(fname, "rb", buffering=FILE_READ_BUFFER_SIZE)
FileNotFoundError: [Errno 2] No such file or directory: '/proc/181196/stat'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/__init__.py", line 361, in _init
self.create_time()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/__init__.py", line 714, in create_time
self._create_time = self._proc.create_time()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1642, in wrapper
return fun(self, *args, **kwargs)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1852, in create_time
ctime = float(self._parse_stat_file()['create_time'])
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1649, in wrapper
raise NoSuchProcess(self.pid, self._name)
psutil.NoSuchProcess: process no longer exists (pid=181196)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/gpfs/data/cfgoldsm/bkreitz1/VASP/methane-oxidation/neb/h--co-diss/IS/finetuna/example.py", line 106, in <module>
relaxer.run(
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/finetuna/atomistic_methods.py", line 198, in run
dyn.run(fmax=self.fmax, steps=self.steps)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/optimize/optimize.py", line 294, in run
return Dynamics.run(self)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/optimize/optimize.py", line 181, in run
for converged in Dynamics.irun(self):
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/optimize/optimize.py", line 168, in irun
self.log()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/optimize/optimize.py", line 308, in log
forces = self.atoms.get_forces()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/atoms.py", line 790, in get_forces
forces = self._calc.get_forces(self)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/calculators/abc.py", line 23, in get_forces
return self.get_property('forces', atoms)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/calculators/calculator.py", line 736, in get_property
self.calculate(atoms, [name], system_changes)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/finetuna/online_learner/online_learner.py", line 189, in calculate
energy, forces, fmax = self.get_energy_and_forces(atoms)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/finetuna/online_learner/online_learner.py", line 259, in get_energy_and_forces
energy, forces, constrained_forces = self.add_data_and_retrain(
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/finetuna/online_learner/online_learner.py", line 491, in add_data_and_retrain
self.parent_calc._pause_calc()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/vasp_interactive/vasp_interactive.py", line 471, in _pause_calc
mpi_process = _find_mpi_process(pid)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/vasp_interactive/vasp_interactive.py", line 65, in _find_mpi_process
process_list = [psutil.Process(pid)]
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/__init__.py", line 332, in __init__
self._init(pid)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/__init__.py", line 373, in _init
raise NoSuchProcess(pid, msg='process PID not found')
psutil.NoSuchProcess: process PID not found (pid=181196)
Trying to close the VASP stream but encountered error:
'psutil'
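The traceback above ends in `psutil.NoSuchProcess`, meaning the PID being looked up (181196) no longer existed by the time `_pause_calc` tried to find it. A small standard-library sketch for checking whether a given PID is still alive on the local machine is shown here. The helper names `pid_is_alive` and `finished_child_pid` are hypothetical and are not part of psutil, vasp-interactive, or finetuna.

```python
import os
import subprocess
import sys


def pid_is_alive(pid):
    """Return True if a process with this PID exists on this host (POSIX)."""
    try:
        # Signal 0 delivers nothing; it only performs the existence
        # and permission checks.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def finished_child_pid():
    """Start a short-lived child, wait for it to exit, and return its PID."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()
    return child.pid
```

Running `pid_is_alive(<pid>)` with the PID from the error message, on the node where the job was launched, is one way to confirm whether the process had already exited or was simply not visible from that host.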
Software
OpenMPI 4.0.5
Intel 2020.2
Python 3.9.0
Executed on 2 nodes, with 2 tasks per node and 8 cpus per task (not sure if that's relevant)
I adjusted the vasp calculator as follows: