rinongal/StyleGAN-nada

Small models for 11GB GPUs

Closed this issue · 7 comments

Hi. Thanks for open-sourcing this amazing project. I'm trying to train the network, but I run into OOM errors since I don't have a 16GB GPU. Could you let me know which smaller models I can try on an 11GB GPU? Thanks so much!

Hey!

If you want to decrease memory use, the following are all viable options:

  1. Disable the layer-freezing module by setting `auto_layer_iters` to 0. If you're only making texture-based changes, you probably don't need to freeze layers, and this can save a good chunk of memory.
  2. Use a lower resolution model (FFHQ 256, LSUN Church etc.).
  3. Use only one of the two CLIP models (ViT-B/32 is better for global textures; ViT-B/16 is a bit better for local textures and shapes).
  4. Decrease `n_sample` (the number of output images generated during training).

If you just want to play with the model and don't need large shape changes like dogs to cats, I'd start with options (1) and (4), since they might be enough. We managed to train an FFHQ 1024x1024 model on a 1080 Ti, so 11GB should probably be doable.
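Putting the options above together, a low-memory run might look like the sketch below. The flag names (`--auto_layer_iters`, `--n_sample`, `--clip_models`, etc.) and the checkpoint path are assumptions modeled on the repo's `train.py`, so verify them against `python train.py --help` in your checkout:

```shell
# Hedged sketch of a low-memory training run; flag names are assumed and
# may differ in your version of the repo.
#   --auto_layer_iters 0      -> option (1): disable layer freezing
#   --n_sample 4              -> option (4): fewer preview images
#   --clip_models "ViT-B/32"  -> option (3): keep a single CLIP model
python train.py \
    --frozen_gen_ckpt ../models/stylegan2-ffhq-config-f.pt \
    --size 1024 \
    --source_class "photo" \
    --target_class "sketch" \
    --auto_layer_iters 0 \
    --n_sample 4 \
    --clip_models "ViT-B/32" \
    --clip_model_weights 1.0 \
    --output_dir output
```

If 1024x1024 still doesn't fit, the same command with `--size 256` and a lower-resolution checkpoint corresponds to option (2).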

Hi @rinongal. Thanks for your tips. Indeed, (1) alone reduced memory a lot and let the training fit on a single 11GB GPU. However, the output quality doesn't seem as good as the original configuration I checked via Colab. (3) didn't have much effect as far as I observed, and (4) alone couldn't make the training possible either. I guess (2) would then be the most suitable solution if I want to keep the same translation quality, am I correct?

You could try combining (1) with a lower learning rate and a larger number of iterations. Some previous issues reported better results with reduced learning rates when training with style-image targets; it might help in your case as well.

Other than that, I'm afraid (2) might be your best option for reducing memory requirements.

What options did you run in the Colab, btw? The layer freezing isn't enabled there by default (it's only turned on if you click on improve shape).

Oops, the results I used as reference weren't with improve shape. I thought improve shape would enable mixing noise. So is the config without improve shape in Colab fully equivalent to (1)?

The config without improve shape in Colab is (1) plus only ViT-B/32 (i.e. (3)) and no mixing.
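In CLI terms, the no-improve-shape Colab setup described above would correspond to roughly the following. The flag names (`--auto_layer_iters`, `--clip_models`, `--mixing`) are assumptions based on the repo's `train.py` and may differ in your checkout:

```shell
# Assumed flags mirroring the Colab default without "improve shape":
# layer freezing off (1), only ViT-B/32 (3), and no style mixing.
# Add your usual checkpoint, class, and output flags alongside these.
python train.py \
    --auto_layer_iters 0 \
    --clip_models "ViT-B/32" \
    --clip_model_weights 1.0 \
    --mixing 0.0
```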

Closing due to lack of activity. Feel free to re-open if you need additional help.