orpatashnik/StyleCLIP

Mapper based on Official NVIDIA StyleGAN2 Implementation

piyushnags opened this issue · 1 comments

Hi,

Thank you for sharing your code and the weights for the models. I'm trying to reproduce the results presented in the paper using NVIDIA's official StyleGAN2 model. Your project uses an unofficial implementation of StyleGAN2 (from Rosinality). I noticed that in your recent papers dealing with StyleGAN3, you use the official model provided by NVIDIA and note that the latent space of StyleGAN3 is more entangled than that of StyleGAN2 (I'm hoping this is not also the case with their official release of StyleGAN2).

Have you tried your method (latent mapper) on NVIDIA's StyleGAN2 model and obtained results similar to your paper? If so, could you share the hyperparameters? In case you have not (a more likely scenario), do you have any recommendations on how to conduct a hyperparameter search?

Using the default hyperparams (i.e., LR=0.5, optim=ranger, id_lambda=0.1, l2_lambda=0.8, etc.) I was not able to generate any reasonable edits for the "afro hairstyle" for example (generated images barely had any changes). I tried using id_lambda=0.04, l2_lambda=0.8 and replaced 0.1 (let's call this delta) with 1.5 in the w_hat computation:
w_hat = w + delta * M(w)   (in your code delta is fixed at 0.1; I am treating it as a hyperparameter)
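For anyone following along, here is a minimal PyTorch sketch of that update with delta exposed as an argument. `ToyMapper` and `edit_latent` are hypothetical names for illustration only; the toy network is not the StyleCLIP mapper architecture, and the (batch, 18, 512) latent shape is an assumption.

```python
import torch
from torch import nn


class ToyMapper(nn.Module):
    """Hypothetical stand-in for the latent mapper M (not the StyleCLIP architecture)."""

    def __init__(self, dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim),
            nn.LeakyReLU(0.2),
            nn.Linear(dim, dim),
        )

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        return self.net(w)


def edit_latent(w: torch.Tensor, mapper: nn.Module, delta: float = 0.1) -> torch.Tensor:
    """Compute w_hat = w + delta * M(w), with delta as a tunable step size."""
    return w + delta * mapper(w)


torch.manual_seed(0)
w = torch.randn(2, 18, 512)  # assumed latent shape: (batch, layers, dim)
mapper = ToyMapper()

w_small = edit_latent(w, mapper, delta=0.1)  # the fixed value mentioned above
w_large = edit_latent(w, mapper, delta=1.5)  # the larger value I experimented with
```

Since the offset is linear in delta, raising delta only rescales the same direction M(w); it does not change what the mapper has learned.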

With these settings I observed changes relevant to the prompt (the hairstyle changed somewhat, but the result was nowhere close to an afro). I would appreciate any help or suggestions.

Thanks

I am closing this issue since my bug was unrelated to the ideas or code proposed by the authors. While trying to reduce the memory footprint, I accidentally disabled gradients during the computation of x_hat (there were no errors, just bad updates).
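In case it helps others avoid the same mistake, here is a small sketch of the pitfall and a cheap guard against it. The two `nn.Linear` modules are hypothetical stand-ins for the mapper and a frozen generator, and `assert_trainable` is a helper I made up for this example, not something from the repository.

```python
import torch
from torch import nn

torch.manual_seed(0)
mapper = nn.Linear(8, 8)      # hypothetical stand-in for the mapper M
generator = nn.Linear(8, 16)  # hypothetical stand-in for a frozen generator
for p in generator.parameters():
    p.requires_grad_(False)   # freezing the weights is fine and saves memory

w = torch.randn(4, 8)

# Buggy: no_grad() around the forward pass cuts the graph back to the mapper,
# so any loss computed on x_hat_bad cannot update the mapper.
with torch.no_grad():
    x_hat_bad = generator(w + 0.1 * mapper(w))

# Correct: the generator weights stay frozen, but the graph to the mapper is kept.
x_hat = generator(w + 0.1 * mapper(w))


def assert_trainable(t: torch.Tensor, name: str) -> None:
    """Fail loudly if a tensor that should carry gradients is detached."""
    if not t.requires_grad:
        raise RuntimeError(f"{name} is detached from the autograd graph")


assert_trainable(x_hat, "x_hat")
loss = x_hat.pow(2).mean()    # placeholder loss, only for illustration
loss.backward()               # gradients reach the mapper, not the generator
```

Calling `assert_trainable(x_hat_bad, "x_hat")` would raise immediately. My guess (an assumption, I did not verify it) is that a run can stay silent in this situation when other loss terms still carry gradients, so `backward()` succeeds while the image-based terms contribute nothing.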

In any case, I was able to reproduce results comparable to the paper's with NVIDIA's StyleGAN2 model and the same hyperparameters as in the paper. I also used the latest version of the Ranger optimizer and it worked just fine (just an FYI for those interested).

I did notice that the implementation for training the latent mapper in StyleSpace is incomplete.