[Bug] CheckpointHook: running val alone raises an error when save_best is set #1587

@BayMaxBHL


Prerequisite

Environment

All environments

Reproduces the problem - code sample

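# Config fragment: the CheckpointHook entry
checkpoint=dict(
    type="CheckpointHook",
    by_epoch=False,
    interval=2000,
    max_keep_ckpts=1,
    save_best=["DepthMetric/abs_rel", "DepthMetric/rmse"],
    rule=["less", "less"],
),

# Run validation only, without calling runner.train() first
runner.val()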

Reproduces the problem - command or script

When I run validation on its own once (for debugging), it finishes the validation epoch and then fails with a KeyError, because there is no save_best history for the metric.
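The failure mode can be illustrated with a toy model. This is a minimal sketch, not mmengine code: `ToyBestCheckpointTracker` and its methods are hypothetical, and it assumes the per-metric history dict is only filled in when training starts, which is what the KeyError in the traceback suggests but which I have not confirmed in the library source.

```python
class ToyBestCheckpointTracker:
    """Hypothetical stand-in for a hook that tracks best checkpoints.

    Not mmengine code: it only mimics the pattern suggested by the traceback.
    """

    def __init__(self, key_indicators):
        self.key_indicators = list(key_indicators)
        # Starts empty; entries are only created in before_train().
        self.best_ckpt_path_dict = {}

    def before_train(self):
        # Assumption: the per-metric history is set up when training starts.
        for key in self.key_indicators:
            self.best_ckpt_path_dict[key] = None

    def after_val_epoch_strict(self, metrics):
        # Direct indexing raises KeyError if before_train() never ran.
        return {k: self.best_ckpt_path_dict[k] for k in self.key_indicators}

    def after_val_epoch_tolerant(self, metrics):
        # dict.get() treats a missing entry as "no best checkpoint yet".
        return {k: self.best_ckpt_path_dict.get(k) for k in self.key_indicators}


keys = ["DepthMetric/abs_rel", "DepthMetric/rmse"]
metrics = {"DepthMetric/abs_rel": 0.7433, "DepthMetric/rmse": 0.4196}

# Validation only: before_train() is never called.
tracker = ToyBestCheckpointTracker(keys)
try:
    tracker.after_val_epoch_strict(metrics)
    strict_error = None
except KeyError as exc:
    strict_error = exc.args[0]

tolerant_result = tracker.after_val_epoch_tolerant(metrics)
```

In this toy version the strict lookup fails on the first metric key, while the tolerant lookup treats the missing history as "no best checkpoint yet". A similar defensive lookup might be one way to handle validation-only runs.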

Reproduces the problem - error message

10/20 12:19:20 - mmengine - INFO - Epoch(val) [0][124/125] eta: 0:00:00 time: 0.0564 data_time: 0.0008 memory: 482
10/20 12:19:20 - mmengine - INFO - Epoch(val) [0][125/125] eta: 0:00:00 time: 0.0561 data_time: 0.0007 memory: 482
10/20 12:19:20 - mmengine - INFO - Epoch(val) [0][125/125] DepthMetric/abs_rel: 0.7433 DepthMetric/sq_rel: 0.3333 DepthMetric/rmse: 0.4196 DepthMetric/rmse_log: 1.3231 DepthMetric/a1: 0.0834 DepthMetric/a2: 0.1743 DepthMetric/a3: 0.2699 data_time: 0.0170 time: 0.0793
Traceback (most recent call last):
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/runpy.py", line 198, in _run_module_as_main
    return _run_code(code, main_globals, None,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/runpy.py", line 88, in _run_code
    exec(code, run_globals)
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 71, in <module>
    cli.main()
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 501, in main
    run()
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 351, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 310, in run_path
    return _run_module_code(code, init_globals, run_name, pkg_name=pkg_name, script_name=fname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 127, in _run_module_code
    _run_code(code, mod_globals, init_globals, mod_name, mod_spec, pkg_name, script_name)
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 118, in _run_code
    exec(code, run_globals)
  File "/home/baihanlin/Project/UMDE/DreamDE/run_command.py", line 56, in <module>
    demo()
  File "/home/baihanlin/Project/UMDE/DreamDE/run_command.py", line 50, in demo
    coal_dump_mde()
  File "/home/baihanlin/Project/UMDE/DreamDE/run_command.py", line 35, in coal_dump_mde
    run_command_script(
  File "/home/baihanlin/Project/UMDE/DreamDE/tools/run_task.py", line 41, in run_command_script
    runner.val()
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/runner/runner.py", line 1800, in val
    metrics = self.val_loop.run() # type: ignore
              ^^^^^^^^^^^^^^^^^^^
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/runner/loops.py", line 377, in run
    self.runner.call_hook('after_val_epoch', metrics=metrics)
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/runner/runner.py", line 1839, in call_hook
    getattr(hook, fn_name)(self, **kwargs)
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/hooks/checkpoint_hook.py", line 361, in after_val_epoch
    self._save_best_checkpoint(runner, metrics)
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/hooks/checkpoint_hook.py", line 514, in _save_best_checkpoint
    best_ckpt_path = self.best_ckpt_path_dict[key_indicator]
                     ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: 'DepthMetric/abs_rel'
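
A possible workaround for validation-only debug runs is to drop the best-checkpoint options from the hook config before building the runner. This is a sketch under assumptions: `without_save_best` is a hypothetical helper, and it relies on the hook skipping its best-checkpoint logic when `save_best` is not set, which I have not verified against the installed mmengine version.

```python
import copy


def without_save_best(hook_cfg):
    """Return a copy of a checkpoint hook config without best-checkpoint options.

    Hypothetical helper: it only edits a plain dict and does not touch mmengine.
    """
    cfg = copy.deepcopy(hook_cfg)
    # Remove both keys so that no rule is left without a matching metric.
    cfg.pop("save_best", None)
    cfg.pop("rule", None)
    return cfg


checkpoint = dict(
    type="CheckpointHook",
    by_epoch=False,
    interval=2000,
    max_keep_ckpts=1,
    save_best=["DepthMetric/abs_rel", "DepthMetric/rmse"],
    rule=["less", "less"],
)

# Config to use when only runner.val() will be called.
val_only_checkpoint = without_save_best(checkpoint)
```

The original `checkpoint` dict is left unchanged, so the same config can still be used for training runs.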

Additional information

No response
