
Can't Unload Models (max_loaded_models option isn't working) #62

Closed

Fieth-Ceboq opened this issue Nov 1, 2023 · 2 comments
Labels
skill-issue Unrelated upstream application behavior

Comments

@Fieth-Ceboq

Issue

While switching models, the "VRAM in use" reading keeps increasing until the application crashes with the message "CUDA out of memory".

Potential Solution

The InvokeAI application seems to have a "--max_loaded_models 1" option, but it isn't recognized when launched through nixified-ai:

nix run github:nixified-ai/flake#invokeai-nvidia -- --free_gpu_mem 1 --max_loaded_models 1
Unknown args: ['--max_loaded_models', '1']

Steps to Reproduce

  1. Generate an image; note "VRAM in use: x".
  2. Generate another image with the same model; "VRAM in use" is still x.
  3. Switch to another model; "VRAM in use" increases to y.
  4. Generate another image with the same model; "VRAM in use" is still y.
  5. Switch to another model; "VRAM in use" increases to z.
  6. Switch to another model; the application crashes with "CUDA out of memory".

Clarification/Request

How can I ensure VRAM usage doesn't keep increasing when switching models?

@max-privatevoid
Member

The --max_loaded_models option was deprecated and removed by InvokeAI upstream in version 3. You can use the --ram and --vram flags instead to control how much RAM and VRAM to allocate to the model cache.
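
For example, InvokeAI 3.x takes both cache sizes in gigabytes, so a launch line along these lines should work (the 8 GB / 2 GB values are only illustrative, not recommendations):

nix run github:nixified-ai/flake#invokeai-nvidia -- --ram 8 --vram 2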

@max-privatevoid added the skill-issue (Unrelated upstream application behavior) label Nov 1, 2023
@max-privatevoid closed this as not planned Nov 1, 2023
@Fieth-Ceboq
Author

Thanks for the clarification.

I had allotted ~90% of my RAM and VRAM at install time, assuming those settings reserved memory for image generation.
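
Those allocations can also be adjusted after installation via invokeai.yaml. A minimal sketch of the relevant fragment, assuming the InvokeAI 3.x configuration layout (the section name and the values here are illustrative; verify them against your installed version):

InvokeAI:
  Model Cache:
    ram: 7.5    # GB of system RAM reserved for the model cache
    vram: 0.25  # GB of VRAM kept resident for cached models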
