II2S_2

Clone the repository https://github.com/rosinality/stylegan2-pytorch
Add the files from this repository to the stylegan2-pytorch directory.

How do random samples map from the Z space to the W and P spaces?
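As a rough illustration of the chain Z -> W -> P -> Pn, here is a minimal NumPy sketch. It assumes the definitions from the II2S paper: P space undoes the final LeakyReLU(0.2) of the mapping network by applying a leaky ReLU with slope 5, and Pn space is the PCA-whitened version of P. The mapping network here is a hypothetical stand-in with random weights (the real one is loaded from a StyleGAN2 checkpoint), and all function names are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 512  # latent size used by StyleGAN2


def leaky_relu(x, slope):
    return np.where(x >= 0, x, slope * x)


# Hypothetical stand-in for the mapping network: 8 random linear layers,
# each followed by LeakyReLU(0.2). Real weights come from a checkpoint.
LAYERS = [rng.normal(0.0, 1.0 / np.sqrt(DIM), (DIM, DIM)) for _ in range(8)]


def z_to_w(z):
    # Normalize z (pixel norm), then run the toy MLP.
    h = z / np.sqrt(np.mean(z ** 2, axis=1, keepdims=True) + 1e-8)
    for weight in LAYERS:
        h = leaky_relu(h @ weight, 0.2)
    return h


def w_to_p(w):
    # Undo the final LeakyReLU(0.2) with a leaky ReLU of slope 1 / 0.2 = 5.
    return leaky_relu(w, 5.0)


def fit_pn(p_samples, n_components):
    # PCA on P-space samples: mean, principal directions, per-direction std.
    mean = p_samples.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(p_samples, rowvar=False))
    top = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, top], np.sqrt(eigvals[top])


def p_to_pn(p, mean, directions, scales):
    # Center, project onto the principal directions, and whiten.
    return (p - mean) @ directions / scales


z = rng.normal(size=(2000, DIM))          # random samples in Z space
w = z_to_w(z)                             # W space
p = w_to_p(w)                             # P space
mean, directions, scales = fit_pn(p, n_components=128)
v = p_to_pn(p, mean, directions, scales)  # Pn space
print(v.shape)
```

After whitening, every retained component has unit variance over the samples used to fit the PCA, which is what makes the squared norm in Pn space a meaningful measure of how far a code is from the bulk of the distribution.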

Inverting a random image into the different spaces through optimization, using the LPIPS loss!

Z Space

W Space

W+ Space

Pn Space

Inverting a random image using the LPIPS loss and the Pn loss!

lambda = 0.01

lambda = 0.001
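To make the role of lambda concrete, here is a small sketch of how the two terms might be combined. It assumes the Pn loss is the squared norm of the latent code expressed in Pn space (after PCA whitening), summed over style codes; the function names and the choice of a sum rather than a mean are assumptions for this example, and the perceptual value is passed in as a plain number rather than computed with LPIPS.

```python
import numpy as np


def pn_regularizer(v_hat):
    # v_hat: whitened latent codes, one row per style code.
    # Assumption: squared L2 norm, summed over all style codes.
    return float(np.sum(v_hat ** 2))


def total_loss(perceptual, v_hat, lam):
    # perceptual: a precomputed perceptual distance (e.g. from LPIPS).
    return perceptual + lam * pn_regularizer(v_hat)


v_hat = np.ones((18, 512))  # toy codes, not real data
for lam in (0.01, 0.001):
    print(lam, total_loss(0.4, v_hat, lam))
```

A larger lambda pulls the code harder toward the center of the Pn distribution, trading reconstruction accuracy for codes that stay in well-behaved regions of the latent space.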

Idea behind e4e

Minimize the variation of the 18 latent codes

Predict a single latent code (and offsets for the other 17 codes)
Make the offsets as small as possible using L2 regularization

Encourage each individual style code to be within the W distribution

A discriminator learns to distinguish between real latent vectors sampled from StyleGAN's mapping network and fake ones produced by the encoder
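The two ideas above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the regularizer is written as a sum of squared offsets (one plausible form of an L2 penalty), the adversarial terms use the standard non-saturating GAN loss on discriminator logits, and every name below is hypothetical rather than taken from the e4e code.

```python
import numpy as np

N_STYLES, DIM = 18, 512


def assemble_w_plus(w_base, offsets):
    # Row 0 is the base code itself; rows 1..17 are base + offset.
    deltas = np.vstack([np.zeros((1, w_base.shape[0])), offsets])
    return w_base[None, :] + deltas


def offset_regularizer(offsets):
    # Assumption: squared L2 penalty keeping all 18 codes close together.
    return float(np.sum(offsets ** 2))


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)


def discriminator_loss(real_logits, fake_logits):
    # Real codes come from the mapping network, fake ones from the encoder.
    return float(np.mean(softplus(-real_logits)) + np.mean(softplus(fake_logits)))


def encoder_adversarial_loss(fake_logits):
    # The encoder is rewarded when its codes are classified as real.
    return float(np.mean(softplus(-fake_logits)))


rng = np.random.default_rng(0)
w_base = rng.normal(size=DIM)
offsets = 0.01 * rng.normal(size=(N_STYLES - 1, DIM))
w_plus = assemble_w_plus(w_base, offsets)
print(w_plus.shape, offset_regularizer(offsets))
```

With all offsets at zero the 18 codes collapse to a single W code, so the regularizer controls how far the encoder is allowed to drift from W toward the full W+ space.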

Idea behind PTI

Use direct optimization to invert the image and obtain the pivotal latent code
Tune the generator to generate the input image given the latent code in the previous step

They also use locality regularization to make the tuning effects localized and keep the StyleGAN latent space semantically editable.
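The two PTI steps can be illustrated with a toy example. The sketch below replaces StyleGAN with a hypothetical linear "generator" and uses a plain squared error instead of a perceptual loss, and it leaves out the locality regularization entirely; it only shows the order of operations (optimize the latent with the generator frozen, then tune the generator with the latent frozen).

```python
import numpy as np

rng = np.random.default_rng(0)
N_PIXELS, DIM = 32, 4

# Toy linear "generator" weights and a toy target "image" (assumptions).
A = rng.normal(size=(N_PIXELS, DIM)) / np.sqrt(N_PIXELS)
target = rng.normal(size=N_PIXELS)


def generate(weights, w):
    return weights @ w


def reconstruction_loss(weights, w):
    return float(np.sum((generate(weights, w) - target) ** 2))


# Step 1: invert the image by optimizing the latent code, generator frozen.
w = np.zeros(DIM)
initial_loss = reconstruction_loss(A, w)
for _ in range(500):
    residual = generate(A, w) - target
    w -= 0.2 * (A.T @ residual)  # gradient step on the latent only
w_pivot = w.copy()
loss_after_inversion = reconstruction_loss(A, w_pivot)

# Step 2: tune the generator so that the pivot reproduces the image.
tuned = A.copy()
step = 0.25 / (w_pivot @ w_pivot)  # step size scaled for stability
for _ in range(100):
    residual = generate(tuned, w_pivot) - target
    tuned -= step * 2 * np.outer(residual, w_pivot)  # gradient step on weights only
loss_after_tuning = reconstruction_loss(tuned, w_pivot)

print(initial_loss, loss_after_inversion, loss_after_tuning)
```

In this toy setting the target lies outside the range of the frozen generator, so step 1 alone cannot reach zero error; step 2 closes the gap by changing the generator itself. That is also why the real method needs some form of regularization: unconstrained tuning could change the generator's behavior for latent codes far from the pivot.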

Editability Evaluation:

In our experiments, we noticed that II2S with the Pn regularization loss struggles to produce editable images when the image resolution is increased to 1024x1024. Besides the obvious contribution of the perceptual loss, we believe that sensitivity to small hyperparameter changes led to inconsistent results. We investigated different initialization techniques and different hyperparameters, and found that changing the number of components in the PCA transformation gave us better control over convergence, which made the results more stable. The results of our experiments are summarized below.

The images below show, in order: Ground Truth, Reconstructed Image, Expression Change, and Age Change. Lambda refers to the Pn loss coefficient.

Images taken from the StyleGAN domain.

Iteration = 1300.
lr = 0.01.
lambda = 0.

Iteration = 2000.
lr = 0.01, multiplied by 0.8 every 200 iterations.
Number of Components = 128 (instead of 512).
lambda = 0.001, multiplied by 1.15 every 100 iterations.
Every other hyperparameter should follow the II2S implementation.
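The step schedules used in these runs can be written out as a small helper. This is a sketch under the assumption that both quantities change in discrete steps (a staircase) and that iterations are counted from 0; the function names are made up for this example.

```python
def step_schedule(initial, factor, every, iteration):
    # Multiply `initial` by `factor` once per completed block of `every` iterations.
    return initial * factor ** (iteration // every)


def learning_rate(iteration, initial=0.01):
    return step_schedule(initial, 0.8, 200, iteration)


def pn_weight(iteration, initial=0.001):
    return step_schedule(initial, 1.15, 100, iteration)


for it in (0, 100, 200, 1000, 1999):
    print(it, learning_rate(it), pn_weight(it))
```

Under this reading, by the last of 2000 iterations the learning rate has dropped to roughly 0.0013, while a lambda that starts at 0.001 has grown to roughly 0.014, so the Pn term becomes progressively more important as the optimization settles.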

Iteration = 2000.
lr = 0.01, multiplied by 0.8 every 200 iterations.
Number of Components = 128 (instead of 512).
lambda = 0.01, multiplied by 1.15 every 100 iterations.
Every other hyperparameter should follow the II2S implementation.

Out of Domain Images

1300 iterations. lr = 0.01. lambda = 0. The images below show, in order: Ground Truth, Reconstructed Image, Expression Change, and Age Change.

Iteration = 1300.
lr = 0.01.
lambda = 0.0.
Every other hyperparameter should follow the II2S implementation.

Iteration = 2000.
lr = 0.01, multiplied by 0.8 every 200 iterations.
Number of Components = 128 (instead of 512).
lambda = 0.001, multiplied by 1.15 every 100 iterations.
Every other hyperparameter should follow the II2S implementation.

Iteration = 2000.
lr = 0.01, multiplied by 0.8 every 200 iterations.
Number of Components = 128 (instead of 512).
lambda = 0.001, multiplied by 1.15 every 100 iterations.
Every other hyperparameter should follow the II2S implementation.

Reconstruction Evaluation:

Case	LPIPS	MSE	MS-SSIM
lambda = 0.0	0.397	0.033	0.441
lambda = 0.001, #Components = 128	0.398	0.034	0.461
lambda = 0.01, #Components = 128	0.401	0.0035	0.466
lambda = 0.01, #Components = 64	0.398	0.033	0.460
lambda = 0.1, #Components = 64	0.399	0.034	0.463