Train from step #172
-
Hi, I'm finetuning Unbabel's TowerInstruct and I want to know how to train from a step instead of training from scratch again.
-
Not sure what you mean. "Finetuning" is mutually exclusive with "training from scratch". Training from scratch means you initialize the model with random weights and go from there. Maybe your issue is more about learning rate schedulers? If so, give more details on what you want to achieve.
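To make the distinction concrete, here is a minimal sketch of "resume from a step" vs. "train from scratch". The names are illustrative, not eole's actual API; the point is only that a checkpoint must carry both the weights and the step counter, so a resumed run continues counting from where it stopped rather than restarting at zero.

```python
import random

def init_model(seed=0):
    """Train from scratch: random weights, step counter at 0."""
    rng = random.Random(seed)
    return {"weights": [rng.random() for _ in range(4)], "step": 0}

def save_checkpoint(model):
    """A checkpoint captures the weights AND the step counter."""
    return {"weights": list(model["weights"]), "step": model["step"]}

def load_checkpoint(ckpt):
    """Resume: restore weights and keep counting from the saved step."""
    return {"weights": list(ckpt["weights"]), "step": ckpt["step"]}

def train(model, n_steps):
    for _ in range(n_steps):
        model["weights"] = [w * 0.99 for w in model["weights"]]  # fake update
        model["step"] += 1
    return model

model = train(init_model(), 172)
ckpt = save_checkpoint(model)            # saved at step 172
resumed = train(load_checkpoint(ckpt), 10)
print(resumed["step"])                   # 182: training continued, not restarted
```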
-
Yes, I did figure out I need to use
-
Share your logs and config.
-
Hi François, here is my config:
And here's a sample of the training logs:
-
Ok, your issue is related to the LoRA finetuning technique. This technique lets you finetune bigger models with limited VRAM by only training a small set of adapter weights. But it requires some additional steps afterwards. The easiest is probably to merge your finetuned weights with the original model before continuing the training. I don't think we have an easier way right now. (The main idea is that saving the full model at every checkpoint is not really efficient, so we only save the LoRA weights, and the merging happens later at the user's discretion.)
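For context, here is a small NumPy sketch of what "merging LoRA weights into the base model" means numerically (the shapes and scaling follow the standard LoRA formulation, not eole's internal code): the low-rank update `B @ A`, scaled by `alpha / r`, is folded into the frozen base weight, producing a plain full-weight matrix that a later training run can load like any other checkpoint.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 6, 4, 2, 8

W_base = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(r, d_in))            # trained LoRA factor
B = rng.normal(size=(d_out, r))           # trained LoRA factor

# Merge: fold the scaled low-rank update into the base weight.
W_merged = W_base + (alpha / r) * (B @ A)

# After merging, a single matrix reproduces base + adapter exactly.
x = rng.normal(size=(d_in,))
y_adapter = W_base @ x + (alpha / r) * (B @ (A @ x))
print(np.allclose(W_merged @ x, y_adapter))  # True
```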
-
Ah, now I understand. I merged the LoRA weights with the base model for inference but didn't realize I have to do that to resume training as well.
-
Check the `lora_weights` tool -- https://github.com/eole-nlp/eole/blob/main/eole/bin/model/lora_weights.py