
Add a feature to save/load a learned model #23


Open
micedre opened this issue Jan 17, 2025 · 3 comments · May be fixed by #37

Comments

@micedre
Contributor

micedre commented Jan 17, 2025

We need to be able to save a model after training so it can be reused later, perhaps by using https://pytorch.org/tutorials/beginner/saving_loading_models.html
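The tutorial linked above describes PyTorch's standard state_dict workflow. A minimal sketch of that pattern (the nn.Linear model and the file path are illustrative stand-ins, not torchFastText internals):

```python
import os
import tempfile

import torch
import torch.nn as nn

# A stand-in for the trained network; torchFastText wraps its own PyTorch module.
model = nn.Linear(10, 2)

# Save only the learned parameters (the tutorial's recommended approach).
path = os.path.join(tempfile.gettempdir(), "model_weights.pt")
torch.save(model.state_dict(), path)

# Later: rebuild the same architecture, then restore the parameters into it.
restored = nn.Linear(10, 2)
restored.load_state_dict(torch.load(path))
restored.eval()  # switch to inference mode before predicting
```

Saving the state_dict rather than the whole pickled module keeps the file decoupled from the exact class definition on disk.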

@meilame-tayebjee
Contributor

  • Saving is handled by the Lightning trainer.
  • Loading is likewise mostly handled by the Lightning module; the load_from_checkpoint method is implemented in torchFastText to load smoothly from a checkpoint path.

@micedre
Contributor Author

micedre commented Feb 24, 2025

I'm reopening this: I think the load_from_checkpoint method could be simplified.

At the moment, loading a previously trained model requires the training texts (to rebuild the tokenizer in the required state), the exact training configuration (which can be loaded from JSON), the Lightning checkpoint path, and finally setting the trained flag to True:

torchft_model = torchFastText.from_json('torchft_model')
torchft_model.build(np.asarray(X_train), lr=0.1)
torchft_model.load_from_checkpoint('lightning_logs/version_1/checkpoints/epoch=0-step=1705.ckpt')
torchft_model.trained = True

Did I miss something?

Omitting the requirement to have the training texts would make models more portable across environments, I think.
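One way to achieve that portability (a hypothetical sketch, not the torchFastText API: the bundle keys, the toy config, and the toy tokenizer state are all invented for illustration) is to serialize the fitted tokenizer state and the config alongside the weights, so loading needs only the one file and never the original X_train:

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical single-file bundle: everything needed to rebuild the model.
model = nn.Linear(8, 3)
bundle = {
    "config": {"in_features": 8, "out_features": 3},       # stands in for the JSON config
    "tokenizer_state": {"vocab": {"hello": 0, "world": 1}},  # stands in for the fitted tokenizer
    "state_dict": model.state_dict(),
}
path = os.path.join(tempfile.gettempdir(), "torchft_bundle.pt")
torch.save(bundle, path)

# Loading rebuilds the model from the bundle alone -- no training texts needed.
loaded = torch.load(path)
restored = nn.Linear(**loaded["config"])
restored.load_state_dict(loaded["state_dict"])
```

The same idea could live inside load_from_checkpoint, which would collapse the four-step sequence above into a single call.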

@micedre micedre reopened this Feb 24, 2025
@meilame-tayebjee meilame-tayebjee linked a pull request Feb 24, 2025 that will close this issue
@micedre
Contributor Author

micedre commented Feb 25, 2025

Another thing: loading a model trained with CUDA and then calling predict gives the following error:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)

Adding the line self.pytorch_model.to(X.device) in predict() seems to solve this.
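The fix can be illustrated with a minimal stand-in (TinyModel and its layer are invented for the sketch; only the self.pytorch_model.to(X.device) line mirrors the proposed change):

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Illustrative wrapper whose predict() aligns devices before inference."""

    def __init__(self):
        super().__init__()
        self.pytorch_model = nn.Linear(4, 2)

    def predict(self, X: torch.Tensor) -> torch.Tensor:
        # Move the network onto the input's device. Without this, a checkpoint
        # restored on cuda:0 and fed CPU tensors (or vice versa) raises the
        # "Expected all tensors to be on the same device" RuntimeError.
        self.pytorch_model.to(X.device)
        with torch.no_grad():
            return self.pytorch_model(X)

X = torch.randn(5, 4)  # CPU tensor; predict() works wherever the model sat
out = TinyModel().predict(X)
```

An alternative is to leave the model in place and move X instead, but moving the model once is usually cheaper when predict is called repeatedly with same-device inputs.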
