Replies: 2 comments
-
Please check out the last paragraph of https://docs.deepmodeling.com/projects/dpgen/en/latest/run/overview-of-the-run-process.html#overview-of-the-run-process
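For readers who want the idea spelled out: that paragraph of the docs describes the record.dpgen file in the working directory, where each line holds an iteration index and a stage index (0-2 for make_train, run_train, post_train; 3-5 for make_model_devi, run_model_devi, post_model_devi; 6-8 for make_fp, run_fp, post_fp). DP-GEN resumes after the last recorded line. Below is a minimal sketch of trimming that file so the run resumes at the model_devi stage of iter.000001. The helper names (trim_record, rewind_record_file) are hypothetical and not part of DP-GEN; the same edit can be done by hand in a text editor. Whether the leftover 01.model_devi folder from the failed attempt also needs to be moved aside is worth checking against the docs for your version.

```python
from pathlib import Path

# Stage indices as listed in the DP-GEN "overview of the run process" page.
STAGES = [
    "make_train", "run_train", "post_train",
    "make_model_devi", "run_model_devi", "post_model_devi",
    "make_fp", "run_fp", "post_fp",
]


def trim_record(lines, iter_idx, last_done_stage):
    """Keep only records up to and including (iter_idx, last_done_stage).

    Hypothetical helper: each record is assumed to be "<iter> <stage>".
    """
    kept = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        it, st = int(parts[0]), int(parts[1])
        if (it, st) <= (iter_idx, last_done_stage):
            kept.append(f"{it} {st}")
    return kept


def rewind_record_file(path, iter_idx, last_done_stage):
    """Back up record.dpgen, then rewrite it with the trimmed records."""
    p = Path(path)
    original = p.read_text().splitlines()
    # Keep a copy so the manual edit can be undone.
    p.with_name(p.name + ".bak").write_text("\n".join(original) + "\n")
    kept = trim_record(original, iter_idx, last_done_stage)
    p.write_text("\n".join(kept) + "\n")
    return kept


# Example (assumption: training of iter.000001 finished, so the last
# record to keep is "1 2", i.e. post_train of iteration 1):
#   rewind_record_file("record.dpgen", iter_idx=1, last_done_stage=2)
```

After fixing the simulation parameters in param.json, running dpgen again with the same param.json and machine.json should then start from make_model_devi of that iteration, so the new parameters are used when the LAMMPS inputs are regenerated.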
-
Thank you very much for your help. I wish you all the best.
-
Summary
How do I restart dpgen_run from the model_devi stage of an iteration?
DP-GEN Version
0.12.1
Platform, Python Version, etc
No response
Details
Hello,
I encountered an issue while using dpgen_run and would greatly appreciate your assistance. Below is my param.json:
{
"type_map": ["O", "Al"],
"mass_map": [16, 27],
"init_data_prefix": "./",
"init_data_sys": [
"alpha-al2o3/POSCAR.01x01x01/02.md/sys-0012-0018/deepmd",
"gamma-al2o3/POSCAR.01x01x01/02.md/sys-0016-0024/deepmd",
"kappa-al2o3/POSCAR.01x01x01/02.md/sys-0016-0024/deepmd",
"theta-al2o3/POSCAR.01x01x01/02.md/sys-0008-0012/deepmd"
],
"sys_configs_prefix": "./",
"sys_configs": [
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00000*/POSCAR"],
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00001*/POSCAR"],
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00002*/POSCAR"],
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00003*/POSCAR"],
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00004*/POSCAR"],
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00005*/POSCAR"],
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00006*/POSCAR"],
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00007*/POSCAR"],
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00008*/POSCAR"],
["alpha-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0012-0018/scale-1.000/00009*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00000*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00001*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00002*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00003*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00004*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00005*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00006*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00007*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00008*/POSCAR"],
["gamma-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00009*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00000*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00001*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00002*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00003*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00004*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00005*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00006*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00007*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00008*/POSCAR"],
["kappa-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0016-0024/scale-1.000/00009*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00000*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00001*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00002*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00003*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00004*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00005*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00006*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00007*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00008*/POSCAR"],
["theta-al2o3/POSCAR.01x01x01/01.scale_pert/sys-0008-0012/scale-1.000/00009*/POSCAR"]
],
"_comment": " that's all ",
"numb_models": 4,
"default_training_param": {
"model": {
"type_map": ["O", "Al"],
"descriptor": {
"type": "se_a",
"sel": [200, 400],
"rcut_smth": 0.5,
"rcut": 6.0,
"neuron": [25, 50, 100],
"resnet_dt": true,
"axis_neuron": 12,
"seed": 1
},
"fitting_net": {
"neuron": [240, 240, 240],
"resnet_dt": false,
"seed": 1
}
},
"learning_rate": {
"type": "exp",
"start_lr": 0.002,
"decay_steps": 5000
},
"loss": {
"start_pref_e": 0.02,
"limit_pref_e": 2,
"start_pref_f": 1000,
"limit_pref_f": 1,
"start_pref_v": 0.0,
"limit_pref_v": 0.0
},
"training": {
"stop_batch": 500000,
"disp_file": "lcurve.out",
"disp_freq": 1000,
"numb_test": 4,
"save_freq": 10000,
"save_ckpt": "model.ckpt",
"disp_training": true,
"time_training": true,
"profiling": false,
"profiling_file": "timeline.json",
"_comment": "that's all"
}
},
"model_devi_dt": 0.002,
"model_devi_skip": 0,
"model_devi_f_trust_lo": 0.20,
"model_devi_f_trust_hi": 0.35,
"model_devi_e_trust_lo": 10000000000.0,
"model_devi_e_trust_hi": 10000000000.0,
"model_devi_clean_traj": true,
"model_devi_jobs": [
{
"sys_idx": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
"temps": [300, 900, 1500, 2100, 2400, 2700, 3000],
"press": [0, 10000, 30000, 50000, 80000, 100000],
"trj_freq": 50,
"nsteps": 5000,
"ensemble": "npt",
"_idx": "alpha"
},
{
"sys_idx": [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
"temps": [300, 900, 1500, 2000, 2400, 3000],
"press": [-50000, 0, 10000, 50000, 100000],
"trj_freq": 50,
"nsteps": 5000,
"ensemble": "npt",
"_idx": "gamma",
"restart_from_iter": 1
},
{
"sys_idx": [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
"temps": [500, 1000, 1400, 1800, 2000, 2400],
"press": [0, 10000, 50000, 100000],
"trj_freq": 50,
"nsteps": 5000,
"ensemble": "npt",
"_idx": "kappa"
},
{
"sys_idx": [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
"temps": [300, 900, 1500, 2000],
"press": [-50000, -10000, 0, 10000],
"trj_freq": 50,
"nsteps": 5000,
"ensemble": "npt",
"_idx": "theta"
}
],
"fp_style": "vasp",
"shuffle_poscar": false,
"fp_task_max": 50,
"fp_task_min": 5,
"fp_pp_path": "./",
"fp_pp_files": ["POTCAR_O", "POTCAR_Al"],
"fp_incar": "./INCAR"
}
As you can see from my settings, a total of 4 iteration folders will be generated (i.e., iter.000000, iter.000001, iter.000002, iter.000003). I have completed the first iteration, but during the second iteration (iter.000001), a LAMMPS task in the model_devi part failed with an error, causing the entire dpgen task to terminate.
I would like to restart the dpgen run. Could you please tell me how to restart this task? Since the training process for the second iteration has already been completed (which is very time-consuming), I hope to restart the task directly from the model_devi part. Additionally, I have identified that the error in LAMMPS was caused by unreasonable simulation parameters. Therefore, apart from modifying the param.json, is there anything else I need to do before restarting the dpgen run?
Thank you very much, and I look forward to your reply.