dpdata

dpdata is a Python package for manipulating atomistic data produced by computational science software.

Credits

If you use this software, please cite the following paper:

  • Jinzhe Zeng, Xingliang Peng, Yong-Bin Zhuang, Haidi Wang, Fengbo Yuan, Duo Zhang, Renxi Liu, Yingze Wang, Ping Tuo, Yuzhi Zhang, Yixiao Chen, Yifan Li, Cao Thang Nguyen, Jiameng Huang, Anyang Peng, Marián Rynik, Wei-Hong Xu, Zezhong Zhang, Xu-Yuan Zhou, Tao Chen, Jiahao Fan, Wanrun Jiang, Bowen Li, Denan Li, Haoxi Li, Wenshuo Liang, Ruihao Liao, Liping Liu, Chenxing Luo, Logan Ward, Kaiwei Wan, Junjie Wang, Pan Xiang, Chengqian Zhang, Jinchao Zhang, Rui Zhou, Jia-Xin Zhu, Linfeng Zhang, Han Wang, dpdata: A Scalable Python Toolkit for Atomistic Machine Learning Data Sets, J. Chem. Inf. Model., 2025, DOI: 10.1021/acs.jcim.5c01767.

Installation

dpdata requires Python 3.8 or later. Set up a conda or pip environment, then install dpdata with one of the following methods:

  • Install via pip: pip install dpdata
  • Install via conda: conda install -c conda-forge dpdata
  • Install from source code: git clone https://github.com/deepmodeling/dpdata && pip install ./dpdata

To check that the installation succeeded, run:

dpdata --version
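
The same check can be made from inside Python, which helps when the dpdata command is not on your PATH. The sketch below uses only the standard library; the helper name installed_version is an assumption made for illustration and is not part of dpdata.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration; not part of dpdata.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None if dpdata is not installed in the active environment.
    print("dpdata version:", installed_version("dpdata"))
```

A return value of None suggests that dpdata was installed into a different environment from the one running the interpreter.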

Supported packages

dpdata aims to support different kinds of atomistic packages:

  • Atomistic machine learning packages, such as DeePMD-kit;
  • Molecular dynamics packages, such as LAMMPS and GROMACS;
  • Quantum chemistry packages, such as VASP, Gaussian, and ABACUS;
  • Atomistic visualization packages, such as 3Dmol.js;
  • Other atomistic tools, such as ASE;
  • Common formats, such as xyz.

All supported formats are listed here.

Quick start

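The quickest way to convert a simple file from one format to another is the command line. The command below reads a VASP OUTCAR file (-i vasp/outcar) and writes it in the deepmd/npy format (-o deepmd/npy) to the deepmd_data directory (-O deepmd_data):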

dpdata OUTCAR -i vasp/outcar -o deepmd/npy -O deepmd_data

For advanced usage with the Python APIs, read the dpdata documentation.
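
As a starting point, the command-line conversion above can be sketched in Python roughly as follows. The wrapper function convert_outcar and its defaults are assumptions made for illustration. It loads the file as a dpdata.LabeledSystem, which assumes the OUTCAR contains energies and forces.

```python
from pathlib import Path


def convert_outcar(outcar="OUTCAR", out_dir="deepmd_data"):
    """Convert a VASP OUTCAR to deepmd/npy.

    Hypothetical wrapper for illustration. Returns the number of frames
    written, or None if the input file does not exist.
    """
    if not Path(outcar).is_file():
        print(f"{outcar} not found; nothing converted")
        return None

    # Imported here so the missing-file check works without dpdata installed.
    import dpdata

    # LabeledSystem reads energies and forces along with coordinates and cells.
    system = dpdata.LabeledSystem(outcar, fmt="vasp/outcar")
    # Write the frames in the NumPy-based layout used by DeePMD-kit.
    system.to("deepmd/npy", out_dir)
    return system.get_nframes()


if __name__ == "__main__":
    convert_outcar()
```

For files that carry only structures, without energies or forces, dpdata.System takes the place of dpdata.LabeledSystem.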

Plugins

  • cp2kdata adds the latest CP2K support for dpdata.

To learn how to create your own plugin packages, read the dpdata documentation.