
Conversation

psychedelicious
Collaborator

Summary

Our model installation process is often a pain point for users. We require models to be identified during installation; if identification fails, we raise an error and do not install the model. There are a few core problems with this approach:

  • Our model identification logic is based on heuristics. We have an old "probe" API and new "classify" API. At a high level, these are very similar - examine the model folder structure and/or weights and try to figure out what kind of model it is. If we cannot positively identify the model, we throw an error.
  • When installation fails, we discard the model files. When users download models via the Invoke model manager, a failure means they must re-download the model to try again.
  • An unidentifiable model is not necessarily a model we cannot run. It's possible that the heuristics fail, but if we just got the dang model into a node, it would work.

This PR loosens the model install process a bit and cleans up some of the related model manager UI:

  • Add Unknown to the model base, type and format enums.
  • Add UnknownModelConfig as a fallback class. When we fail to identify a model, instead of raising an error and deleting the candidate model files, we go ahead and install it, but set its base, type and format to unknown (see the sketch after this list).
    • When this fallback happens, the user will get a toast notification.
  • Add allow_unknown_models setting to feature-flag the new fallback-to-unknown behaviour. It is enabled by default. To opt out of the new behaviour, set it to False.
  • Update the model manager UI to allow users to set the base, type, and format for all models. This allows the user to fix unidentified or misidentified models on their own.
    • Previously, we only let users change base. We hadn't reviewed this part of the model manager in a while, as evidenced by only SD and FLUX bases being options in the list of bases. I recall we made a decision that users should not be allowed to change model type and format, believing it could lead to footguns. For example, a user might inadvertently change the base of a model and get generation errors. That decision was made in simpler times, likely when we only had to worry about SD1.5 and SDXL. Given the variety of models we want to support today, I think it is better to have this graceful fallback for model installation and give users the tools to fix it themselves.
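
A minimal sketch of the new fallback, assuming hypothetical helper and exception names; only `allow_unknown_models`, `UnknownModelConfig`, and the new `Unknown` enum members come from this PR:

```python
try:
    config = identify_model(model_path)  # heuristic probe/classify step
except UnidentifiedModelError:  # hypothetical exception name
    if not app_config.allow_unknown_models:
        raise  # opted out: fail the install as before
    # Fallback: install anyway, marking every identity field as unknown.
    config = UnknownModelConfig(
        base=BaseModelType.Unknown,
        type=ModelType.Unknown,
        format=ModelFormat.Unknown,
    )
install_model(model_path, config)  # user can fix base/type/format in the UI
```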

QA Instructions

Try installing a model that isn't supported by Invoke's model manager. You could use this one: https://huggingface.co/facebook/sam2-hiera-tiny/resolve/main/model.safetensors

It's the "tiny" variant of SAM 2 and weighs ~150MB. You should get the toast and be able to edit the model.

Merge Plan

n/a

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@psychedelicious
Collaborator Author

This PR also does a bit of cleanup in the frontend for models:

  • Centralize data like categories of models to show in the model manager, human-readable names for models, their colors, etc.
  • Ensure we use zod schemas for all model-related data.
  • Add simple type equality tests for zod schemas and the corresponding autogenerated types. This will cause TS errors if more model types/bases are added and the frontend isn't updated.
  • Use the new centralized location's data to dynamically generate lists of models in model manager tab.

psychedelicious added a commit that referenced this pull request Sep 24, 2025
Previously, we had a multi-phase strategy to identify models from their
files on disk:
1. Run each model config class's `matches()` method on the files. It
checks whether the files could possibly be identified as the candidate
model type. This was intended to be a quick check. Break on the first
match.
2. If we have a match, run the config class's `parse()` method. It
derives some additional model config attrs from the model files. This
was intended to encapsulate heavier operations that may require loading
the model into memory.
3. Derive the common model config attrs - name, description, hash,
etc. Some of these are also heavier operations.
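
In rough pseudocode (only `matches()` and `parse()` are real names here;
the orchestration is illustrative):

```python
def identify_model(model_files):
    # Sketch of the old multi-phase flow; names other than matches()
    # and parse() are made up.
    for config_cls in MODEL_CONFIG_CLASSES:        # assumed registry
        if config_cls.matches(model_files):        # phase 1: "quick" check
            attrs = config_cls.parse(model_files)  # phase 2: heavier parse
            attrs |= derive_common_attrs(model_files)  # phase 3: name, hash...
            return config_cls(**attrs)
    raise InvalidModelException("could not identify model")  # files discarded
```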

This strategy has some issues:
- It is not clear how the pieces fit together. There is some
back-and-forth between different methods and the config base class. It
is hard to trace the flow of logic until you fully wrap your head around
the system and therefore difficult to add a model architecture to the
probe.
- The assumption that we could do quick, lightweight checks before
heavier checks is incorrect. We often _must_ load the model state dict
in the `matches()` method. So there is no practical perf benefit to
splitting up the responsibility of `matches()` and `parse()`.
- Sometimes we need to do the same checks in `matches()` and `parse()`.
In these cases, splitting the logic has a negative perf impact because
we are doing the same work twice.
- As we introduce the concept of an "unknown" model config (i.e. a model
that we cannot identify, but still record in the db; see #8582), we will
_always_ run _all_ the checks for every model. Therefore we need not try
to defer heavier checks or resource-intensive ops like hashing. We are
going to do them anyways.
- There are situations where a model may match multiple configs. One
known case is SD pipeline models with merged LoRAs. In the old probe
API, we relied on the implicit order of checks to know that if a model
matched for pipeline _and_ LoRA, we prefer the pipeline match. But, in
the new API, we do not have this implicit ordering of checks. To resolve
this in a resilient way, we need to get all matches up front, then use
tie-breaker logic to figure out which should win (or add "differential
diagnosis" logic to the matchers).
- Field overrides weren't handled well by this strategy. They were only
applied at the very end, if a model matched successfully. This meant we
could not tell the system "Hey, this model is type X with base Y. Trust
me bro." We could not override the match logic. As we move towards
letting users correct mis-identified models (see #8582), this is a
requirement.

We can simplify the process significantly and better support "unknown"
models.

Firstly, model config classes now have a single `from_model_on_disk()`
method that attempts to construct an instance of the class from the
model files. This replaces the `matches()` and `parse()` methods.

If we fail to create the config instance, a special exception is raised
that indicates why we think the files cannot be identified as the given
model config class.
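
A sketch of what a config class might look like under the new API; only
`from_model_on_disk()` is named in this commit, while the exception name
and helpers are assumptions:

```python
class NotAMatchError(Exception):
    """Raised when the files cannot be identified as this config class."""

class ExampleLoRAConfig(ModelConfigBase):
    @classmethod
    def from_model_on_disk(cls, mod, override_fields):
        # Heavy work is fine here - there is no separate "quick check" phase.
        state_dict = mod.load_state_dict()       # hypothetical helper
        if not _has_lora_keys(state_dict):       # hypothetical heuristic
            raise NotAMatchError("state dict has no LoRA-style keys")
        derived = _derive_lora_fields(state_dict)  # hypothetical
        # Overrides win over derived values ("type X with base Y, trust me").
        return cls(**{**derived, **override_fields})
```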

Next, the flow for model identification is a bit simpler:
- Derive all the common fields up-front (name, desc, hash, etc).
- Merge in overrides.
- Call `from_model_on_disk()` for every config class, passing in the
fields. Overrides are handled in this method.
- Record the results for each config class and choose the best one.

The identification logic is a bit more verbose, with the special
exceptions and handling of overrides, but it is very clear what is
happening.
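
Roughly, with hypothetical orchestration names:

```python
fields = derive_common_fields(mod)  # name, description, hash, etc., up front
fields.update(user_overrides)       # merge overrides before matching

results = {}
for config_cls in MODEL_CONFIG_CLASSES:
    try:
        results[config_cls] = config_cls.from_model_on_disk(mod, fields)
    except NotAMatchError as e:
        results[config_cls] = e  # record *why* this class rejected the files

matches = [r for r in results.values() if not isinstance(r, Exception)]
if matches:
    config = pick_best(matches)  # tie-breaker, e.g. pipeline beats merged LoRA
else:
    config = UnknownModelConfig(**fields)  # fallback from #8582
```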

The one downside I can think of for this strategy is we do need to check
every model type, instead of stopping at the first match. It's a bit
less efficient. In practice, however, this isn't a hot code path, and
the improved clarity is worth far more than perf optimizations that the
end user will likely never notice.