finished training. However, if the user wants to restart the model from a
specific point, they can do this by setting
``config.hardware.files.warm_start`` to the checkpoint they want to
restart from.

*******************
Transfer Learning
*******************

Transfer learning allows the model to reuse knowledge from a previously
trained checkpoint. This is particularly useful when the new task is
related to the old one, enabling faster convergence and often improving
model performance.

To enable transfer learning, set the ``config.training.transfer_learning``
flag to ``True`` in the configuration file:

.. code:: yaml

   training:
      # start the training from a checkpoint of a previous run
      fork_run_id: '51a97d40a49e48d284494a3b5d87ef2b'
      load_weights_only: True
      transfer_learning: True

When this flag is active and a checkpoint path is specified in
``config.hardware.files.warm_start`` or ``self.last_checkpoint``, the
system loads the pre-trained weights using the
``transfer_learning_loading`` function. This approach ensures that only
compatible weights are loaded and that mismatched layers are handled
appropriately.
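
In spirit, this compatible-weight loading works like the sketch below.
The helper is illustrative only and assumes a standard PyTorch
checkpoint layout; it is not the actual ``transfer_learning_loading``
implementation.

.. code:: python

   import torch

   def load_compatible_weights(model, checkpoint_path):
       """Sketch of compatible-weight loading for transfer learning.

       Hypothetical helper: copies only the checkpoint tensors whose
       names and shapes match the target model, leaving mismatched or
       missing layers at their freshly initialised values.
       """
       checkpoint = torch.load(checkpoint_path, map_location="cpu")
       # Lightning-style checkpoints nest weights under "state_dict";
       # fall back to treating the file as a raw state dict.
       ckpt_state = checkpoint.get("state_dict", checkpoint)
       model_state = model.state_dict()

       compatible = {
           name: tensor
           for name, tensor in ckpt_state.items()
           if name in model_state and tensor.shape == model_state[name].shape
       }
       model_state.update(compatible)
       model.load_state_dict(model_state)
       return model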

For example, transfer learning might be used to adapt a weather
forecasting model trained on one geographic region to another region
with similar characteristics.

****************
Model Freezing
****************

Model freezing is a technique where specific parts (submodules) of a
model are excluded from training. This is useful when certain parts of
the model have been sufficiently trained or should remain unchanged for
the current task.

To specify which submodules to freeze, use the
``config.training.submodules_to_freeze`` field in the configuration and
list the names of the submodules to be frozen. During model
initialization, these submodules will have their parameters frozen,
ensuring they are not updated during training.

For example, with the following configuration, the processor will be
frozen and only the encoder and decoder will be trained:

.. code:: yaml

   training:
      # start the training from a checkpoint of a previous run
      fork_run_id: '51a97d40a49e48d284494a3b5d87ef2b'
      load_weights_only: True
      submodules_to_freeze:
         - processor

Freezing can be particularly beneficial in scenarios such as
fine-tuning, when only specific components (e.g., the encoder or the
decoder) need to adapt to a new task while others (e.g., the processor)
are kept fixed.
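
Mechanically, freezing a submodule amounts to disabling gradient
updates for its parameters. The sketch below illustrates this pattern
in PyTorch, assuming the configured names correspond to attributes on
the model; it is not the exact initialization code used by the package.

.. code:: python

   def freeze_submodules(model, submodules_to_freeze):
       """Sketch of submodule freezing.

       Hypothetical helper: disables gradients on the named submodules
       so the optimiser never updates them during training.
       """
       for name in submodules_to_freeze:
           submodule = getattr(model, name)
           for param in submodule.parameters():
               param.requires_grad = False

   # Matching the configuration above:
   # freeze_submodules(model, ["processor"])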
