carefree0910/carefree-creator

What are the minimum RAM requirements?

danielrh opened this issue · 8 comments

I have an 8 GB GPU... I suspect it's not enough, because I run into:

uvicorn apis.interface:app --host 0.0.0.0 --port 8123
ldm_sd_v1.5: 4.27GB [06:19, 11.2MB/s]                                           
ldm_sd_anime_nai: 4.27GB [06:19, 11.3MB/s]                                      
ldm.sd_inpainting: 4.27GB [06:17, 11.3MB/s]                                     
Traceback (most recent call last):
...
  File "/home/danielrh/dev/carefree-creator/cfe9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 662, in _apply
    param_applied = fn(param)
  File "/home/danielrh/dev/carefree-creator/cfe9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 985, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 7.92 GiB total capacity; 7.36 GiB already allocated; 67.56 MiB free; 7.43 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Application startup failed. Exiting.

I was able to run vanilla Stable Diffusion; does this require additional GPU RAM?
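
(For reference, the fragmentation hint at the end of that error message is applied via the PYTORCH_CUDA_ALLOC_CONF environment variable, which must be set before PyTorch initializes CUDA. A minimal sketch, assuming a small Python wrapper around the same uvicorn launch; the 128 MB split size is only an illustrative value:)

import os

# Must be set before torch touches CUDA, hence before torch is imported
# anywhere in this process. 128 is illustrative, not a project recommendation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import uvicorn

# Serves the same app as the CLI command above.
if __name__ == "__main__":
    uvicorn.run("apis.interface:app", host="0.0.0.0", port=8123)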

Ah, my bad, I should put this in the README as well...

I've provided an option here to trade GPU RAM for CPU RAM: uncommenting this line will first load the models into RAM, then move them to GPU RAM only when needed!
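
(What that option amounts to is the common CPU-offload pattern: keep the weights in system RAM and borrow GPU RAM only for the duration of a call. A minimal sketch of the idea, not the project's actual code; build_model is a hypothetical stand-in for whatever constructs a real SD pipeline:)

import torch

def build_model() -> torch.nn.Module:
    # Hypothetical loader; stands in for building a real SD pipeline.
    return torch.nn.Linear(512, 512)

# Load to CPU first so GPU RAM stays free until the model is needed.
model = build_model().to("cpu")

@torch.no_grad()
def run(x: torch.Tensor) -> torch.Tensor:
    model.to("cuda")              # borrow GPU RAM only for this call
    try:
        return model(x.to("cuda")).cpu()
    finally:
        model.to("cpu")           # hand the GPU RAM back afterwards
        torch.cuda.empty_cache()  # release cached blocks to the driver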

The reason this project requires more GPU RAM than vanilla SD is that it actually integrates FOUR different SD versions together, and many other models as well 🤣.

BTW, if you only want the vanilla SD features, you can comment out the following lines, which will also reduce GPU RAM usage!

Wow, so cool! It seems to be loaded now! Thanks for the help! I'm using the OPT because I do want to see the features together, especially all the img2img-related features.

That's great 🥳!

I did not turn the OPT on by default because it eats so much RAM that Google Colab cannot afford it 🤣.

@carefree0910 I have an 8 GB GPU (RTX 2070) & 16 GB RAM. At launch with the '--lazy' argument, I have 12.3 GB RAM available and 7.5 GB GPU RAM. GPU RAM usage increases to around 6500 MB (as reported by NVIDIA Inspector) and I then get:

lib\site-packages\torch\serialization.py", line 1112, in load_tensor
    storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 3276800 bytes.

Application startup failed. Exiting.

There is minimal usage of CPU RAM during this process. Automatic1111 with several extensions runs fine. Any suggestions as to why CPU RAM doesn't seem to be used?

@aleph23 Hi! This project has one major difference from Automatic1111: it launches MANY models at the same time, so it eats up far more resources.

There is a workaround though:

cfcreator serve --limit 1

This means you'll only load 1 model and leave everything else on disk. (In this case, it'll perform more like Automatic1111!)

However, in my personal experience I've found that there are some memory leaks around. I'm currently using gc.collect(), but maybe I left some references to the models somewhere, which stops Python from freeing the memory.
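
(For anyone debugging the same thing: gc.collect() can only reclaim a model once every reference to it is gone, and a single lingering reference in a cache, a closure, or a stored traceback keeps all of its CUDA tensors alive. A minimal sketch of the teardown being described, not the project's actual code:)

import gc

import torch

def unload(model_registry: dict, key: str) -> None:
    # Drop the reference we hold; any other live reference defeats this.
    model_registry.pop(key, None)
    gc.collect()                  # reclaim the Python-side module objects
    torch.cuda.empty_cache()      # then return cached GPU blocks to the driver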

*how to run this bro :)

The Google Colab should be working now!