Implement early stopping / validation patience interval. #375

Lilferrit · 2024-09-06T22:44:05Z

This is another QOL feature I implemented for the sake of my own experiments, but that might be nice to add to the mainline Casanovo release. I added a new config option val_patience_interval that defaults to -1 (to mirror the functionality of max_epochs). If val_patience_interval is set to a positive value, an early stopping callback is added to the model runner using PyTorch Lightning's EarlyStopping callback. This callback monitors valid_CELoss and stops model training if valid_CELoss doesn't improve for the duration of val_patience_interval.
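A minimal sketch of how the option described above might map onto the callback's arguments. The helper name early_stopping_kwargs is hypothetical and not taken from the val-early-stop branch, and treating 0 as "disabled" is an assumption. The keyword names monitor, patience, and mode are the ones accepted by PyTorch Lightning's EarlyStopping.

```python
from typing import Optional


def early_stopping_kwargs(val_patience_interval: int) -> Optional[dict]:
    """Return keyword arguments for an early stopping callback, or None if disabled.

    Hypothetical helper for illustration only.
    """
    # Non-positive values (including the default of -1) disable early stopping.
    # Treating 0 as disabled is an assumption; the issue only mentions -1.
    if val_patience_interval <= 0:
        return None
    return {
        "monitor": "valid_CELoss",  # validation metric named in the issue
        "patience": val_patience_interval,
        "mode": "min",  # a lower loss counts as an improvement
    }


# With Lightning installed, the result could be used roughly like this:
#   from lightning.pytorch.callbacks import EarlyStopping
#   kwargs = early_stopping_kwargs(config_value)
#   if kwargs is not None:
#       callbacks.append(EarlyStopping(**kwargs))
```

Returning None for the disabled case keeps the default behavior unchanged: no callback is added unless the user opts in with a positive value.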

My implementation is on the branch val-early-stop. I also changed the best validation checkpoint filename from <root>.best.ckpt to <root>.<epoch>-<step>.best.ckpt. If we want to add the early stopping feature but don't want to change the best checkpoint filename, I can remove this change before submitting a PR.


bittremieux · 2024-09-08T12:00:14Z

> My implementation is on the branch val-early-stop. I also changed the best validation checkpoint filename from <root>.best.ckpt to <root>.<epoch>-<step>.best.ckpt. If we want to add the early stopping feature, but we don't want to change the best filename, I can remove this before submitting a PR.

I don't think that this is an ideal change. The reasoning behind the best.ckpt file was that its filename would always be the same, so that the user can immediately get it. Adding the epoch number removes this advantage.

While adding the early stopping patience is a small change that can make training a bit more convenient, one thing to make sure of in your implementation is that it is defined in terms of the number of training steps, not epochs. When we're training on the full MassIVE-KB data, the model converges even before a full epoch has been processed. This is also why val_check_interval and some other training options are defined in terms of the number of steps.
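As a rough sketch of the step-based definition: PyTorch Lightning's EarlyStopping counts its patience in validation checks, not epochs, so a patience given in training steps could be converted by dividing by val_check_interval. This assumes both options are expressed in training steps. The helper name patience_in_checks and the choice to round up are assumptions, not something taken from the branch.

```python
import math


def patience_in_checks(val_patience_interval: int, val_check_interval: int) -> int:
    """Convert a patience given in training steps to a number of validation checks.

    Hypothetical helper; rounds up so training never stops before the
    requested number of steps without improvement has elapsed.
    """
    if val_patience_interval <= 0:
        raise ValueError("val_patience_interval must be positive")
    if val_check_interval <= 0:
        raise ValueError("val_check_interval must be positive")
    return math.ceil(val_patience_interval / val_check_interval)


# Example: validating every 10,000 steps with a patience of 25,000 steps
# means waiting for 3 validation checks without improvement.
checks = patience_in_checks(25_000, 10_000)
```

Rounding up errs on the side of training slightly longer when the patience is not a multiple of the validation interval.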

Lilferrit added the enhancement label Sep 6, 2024

bittremieux added this to the Casanovo v5.0.0 milestone Nov 18, 2024
