Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

load_pretrained_model dependency issues #522

Closed
IFenton opened this issue Feb 14, 2023 · 6 comments
Closed

load_pretrained_model dependency issues #522

IFenton opened this issue Feb 14, 2023 · 6 comments
Assignees
Labels
bug Something isn't working
Milestone

Comments

@IFenton
Copy link
Contributor

IFenton commented Feb 14, 2023

I've been working on trying to get the Scivision example gallery notebooks running (#83, and I've come across an issue with load_pretrained_model that is causing some problems.

Running load_pretrained_model(<url>, allow_install=True), if the model is not present, currently force installs both the model and its dependencies. As I understand it, this was intended as an intentional feature to make sure that the dependencies were updated to the required versions (#223). However the force reinstalling of dependencies is leading to issues with some of the scivision-examples-gallery notebooks (e.g. scivision-basic-usage-example #5, plant-phenotyping-classification #8). In both these cases, running load_pretrained_model causes the latest version of packages (tensorflow and timm respectively) to be installed. However, these versions are causing the errors in running the notebooks.

There seem to be a couple of possible options to fix this:

  • Install the model separately, whether in the environment.yml or in the terminal before running the notebook
  • Update the model requirements.txt to a pinned version of the package
  • Change load_pretrained_model so it doesn't force reinstall dependencies by default

Does anyone have any preferences on which is the best option?

@IFenton IFenton added the bug Something isn't working label Feb 15, 2023
@edwardchalstrey1
Copy link
Contributor

This is a good question, and a bit of a fundamental one for scivision! My opinion has been that installing models via the scivision package is too inherently problematic for reasons like this and we should no longer support this - see #346 (which proved controversial).

I think the tension here is we could either:

  1. Make the postdocs notebooks just work somehow, e.g. one of @IFenton first two bullet points
  2. Take some time to consider what scivision "best practice" should be for installing models in a Python environment, then conform all the project notebooks to this way of working (which could still be the first option, via enironment.yml)

@edwardchalstrey1
Copy link
Contributor

There could even be a rule like "We only accept models for the scivision catalog that have pinned package versions" or as @ots22 has discussed previously, some kind of installation checks that run in GitHub actions

@ots22
Copy link
Member

ots22 commented Feb 15, 2023

There are a few general issues with model-loading-as-package-installation as noted by Ed. Fixing this bug should at least let us maintain a reproducible environment without load_pretrained_model clobbering it, even without addressing these wider issues, so some comments in that spirit.

It might not be appropriate to completely pin dependencies in a model repository (or use a lockfile from the model if one exists) for the usual reasons - the right place to pin things would seem to be in the environment.yml in the gallery example. A model should ensure that its dependencies are specified correctly.

We should pin the version of the model itself in the gallery example. The options for doing that are:

  1. as part of the load_pretrained_model command (e.g. include the commit hash when loading it)
  2. in environment.yml (in this case, allow_install would not need to be set when loading the model)

We should allow for either possibility if we can: option 2 means someone running the model doesn't have the installation appearing inline, and option 1 if we want to explicitly show off this behaviour. (Some guidance on this could be useful to capture perhaps something like - when working interactively you might want to use allow_install=True but as soon as you start specifying dependencies in a virtual environment of some kind you should include your models there too - I'm not sure if that's right, but probably for a separate discussion.)

It seemed like (from our discussion the other day) the 'force' flag of pip is inherited when installing dependencies and this is the real culprit, since it is then no longer possible to use option 1 above (it might result in an 'upgraded' version of the package for instance, different from the version in environment.yml). Have you tried passing the 'force' flag to pip only when allow_install='force'? I think we had assumed that there was no downside to this flag if the package wasn't installed.

So I'd say it would be nice to end up with:

  • load_pretrained_model(..., allow_install=True) not clobbering versions of other packages in the environment if the environment is consistent with the model dependencies
  • having pinned versions of everything in the gallery examples, including the model - both options above should work - making a decision separately about which is preferable

@acocac acocac added this to the Core features milestone Apr 6, 2023
@ots22
Copy link
Member

ots22 commented Aug 29, 2023

Did #533 fix this?

(@IFenton)

@IFenton
Copy link
Contributor Author

IFenton commented Aug 29, 2023

It fixed the specific problem of force updating to the latest versions, although I think the broader discussion of how to handle the dependencies are still unresolved

@ots22
Copy link
Member

ots22 commented Aug 29, 2023

Okay, will close based on having fixed the force install problem.

Is there a summary of the broader problem somewhere?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Development

No branches or pull requests

4 participants