From 7705a7efec330737b4bdb11a4801541e2e451aaf Mon Sep 17 00:00:00 2001
From: icedoom888
Date: Wed, 18 Dec 2024 19:40:00 +0100
Subject: [PATCH] Added documentation

---
 docs/user-guide/training.rst | 62 ++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/docs/user-guide/training.rst b/docs/user-guide/training.rst
index e90b1583..6eda7f01 100644
--- a/docs/user-guide/training.rst
+++ b/docs/user-guide/training.rst
@@ -280,3 +280,65 @@

finished training. However, if the user wants to restart the model from
a specific point, they can do this by setting
``config.hardware.files.warm_start`` to the checkpoint they want to
restart from.

*******************
 Transfer Learning
*******************

Transfer learning allows the model to reuse knowledge from a previously
trained checkpoint. This is particularly useful when the new task is
related to the old one, enabling faster convergence and often improving
model performance.

To enable transfer learning, set the ``config.training.transfer_learning``
flag to ``True`` in the configuration file:

.. code:: yaml

   training:
      # start the training from a checkpoint of a previous run
      fork_run_id: '51a97d40a49e48d284494a3b5d87ef2b'
      load_weights_only: True
      transfer_learning: True

When this flag is active and a checkpoint path is specified in
``config.hardware.files.warm_start`` or ``self.last_checkpoint``, the
system loads the pre-trained weights using the
``transfer_learning_loading`` function. This approach ensures that only
compatible weights are loaded and that mismatched layers are handled
appropriately.
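Conceptually, this means keeping only the checkpoint parameters whose
names and shapes match the new model, and leaving everything else at its
fresh initialization. The snippet below is a minimal PyTorch sketch of
that idea; ``load_compatible_weights`` is a hypothetical helper written
for illustration, not the actual ``transfer_learning_loading``
implementation:

.. code:: python

   import torch

   def load_compatible_weights(model, checkpoint_path):
       # Illustrative only: the real transfer_learning_loading function
       # may differ. Load the checkpoint on CPU and unwrap its state dict.
       checkpoint = torch.load(checkpoint_path, map_location="cpu")
       ckpt_state = checkpoint.get("state_dict", checkpoint)
       model_state = model.state_dict()

       # Keep only parameters whose name and shape match the new model.
       compatible = {
           name: weight
           for name, weight in ckpt_state.items()
           if name in model_state and model_state[name].shape == weight.shape
       }

       # strict=False leaves mismatched or missing layers at their
       # freshly initialized values.
       model.load_state_dict(compatible, strict=False)
       return model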
For example, transfer learning might be used to adapt a weather
forecasting model trained on one geographic region to another region
with similar characteristics.

****************
 Model Freezing
****************

Model freezing is a technique where specific parts (submodules) of a
model are excluded from training. This is useful when certain parts of
the model have been sufficiently trained or should remain unchanged for
the current task.

To specify which submodules to freeze, list their names under the
``config.training.submodules_to_freeze`` field in the configuration.
During model initialization, the parameters of these submodules are
frozen, ensuring they are not updated during training.

For example, with the following configuration the processor will be
frozen and only the encoder and decoder will be trained:

.. code:: yaml

   training:
      # start the training from a checkpoint of a previous run
      fork_run_id: '51a97d40a49e48d284494a3b5d87ef2b'
      load_weights_only: True

      submodules_to_freeze:
         - processor

Freezing is particularly beneficial in fine-tuning scenarios where only
specific components (e.g. the encoder or the decoder) need to adapt to a
new task while others (e.g. the processor) remain fixed.
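Under the hood, freezing amounts to disabling gradients on the named
submodules so the optimizer leaves their parameters untouched. The
sketch below shows the general PyTorch pattern, using a hypothetical
``TinyModel`` and ``freeze_submodules`` helper that mirror the
configuration above; the actual implementation in the training code may
differ:

.. code:: python

   import torch.nn as nn

   class TinyModel(nn.Module):
       # Stand-in model with the submodule names used in the config above.
       def __init__(self):
           super().__init__()
           self.encoder = nn.Linear(8, 16)
           self.processor = nn.Linear(16, 16)
           self.decoder = nn.Linear(16, 8)

   def freeze_submodules(model, submodule_names):
       # Disable gradient updates for each named submodule; the optimizer
       # will then skip these parameters during training.
       for name in submodule_names:
           getattr(model, name).requires_grad_(False)

   model = TinyModel()
   freeze_submodules(model, ["processor"])
   assert all(not p.requires_grad for p in model.processor.parameters())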