nixified-ai/flake

Can't Unload Models (max_loaded_models option isn't working)

Fieth-Ceboq opened this issue · 2 comments

Issue

When switching models, the "VRAM in use" value keeps increasing until the application crashes with the message "CUDA out of memory".

Potential Solution

The InvokeAI application seems to have a "--max_loaded_models 1" option, but it isn't recognized when launched through nixified-ai:

nix run github:nixified-ai/flake#invokeai-nvidia -- --free_gpu_mem 1 --max_loaded_models 1
Unknown args: ['--max_loaded_models', '1']

Steps to Reproduce

  1. Generate an image, and note the "VRAM in use: x" value
  2. Generate another image using the same model; "VRAM in use" is still x
  3. Switch to another model; "VRAM in use" increases to y
  4. Generate another image using the same model; "VRAM in use" is still y
  5. Switch to another model; "VRAM in use" increases to z
  6. Switch to another model; the application crashes with "CUDA out of memory"

Clarification/Request

How can I ensure VRAM usage doesn't keep increasing when switching models?

The --max_loaded_models option was deprecated and removed by InvokeAI upstream in version 3. You can use the --ram and --vram flags instead to control how much RAM and VRAM to allocate to the model cache.
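
For example, assuming the flags are passed through to InvokeAI the same way as in the command above, and assuming the values are interpreted as gigabytes (as in the upstream InvokeAI 3 documentation), an invocation along these lines should cap the model cache (the numbers are only illustrative; pick values that fit your hardware):

nix run github:nixified-ai/flake#invokeai-nvidia -- --ram 8 --vram 0.5

A small --vram value should keep most of the cache in system RAM and move only the active model to the GPU, trading some model-switching speed for lower VRAM pressure.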

Thanks for the clarification.

I had allotted ~90% of RAM and VRAM at install time, assuming those settings were for generation purposes.