dvschultz/stylegan2-ada-pytorch

export_weights.py works on 1024 and 512 res, but not on 256

mctrinkle opened this issue · 12 comments

I have been able to reproduce the SG2_ADA_PT_to_Rosinality.ipynb Colab test, but it seems to work only with 1024 and 512 resolutions.

It fails for all of the 256-resolution .pkl files found here: transfer-learning-source-nets

This is the error I get for lsundog-res256-paper256-kimg100000-noaug.pkl:

Full Traceback

    Traceback (most recent call last):
      File "generate.py", line 76, in <module>
        g_ema.load_state_dict(checkpoint["g_ema"])
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1224, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for Generator:
        Missing key(s) in state_dict: "convs.12.conv.weight", "convs.12.conv.blur.kernel", "convs.12.conv.modulation.weight", "convs.12.conv.modulation.bias", "convs.12.noise.weight", "convs.12.activate.bias", "convs.13.conv.weight", "convs.13.conv.modulation.weight", "convs.13.conv.modulation.bias", "convs.13.noise.weight", "convs.13.activate.bias", "to_rgbs.6.bias", "to_rgbs.6.upsample.kernel", "to_rgbs.6.conv.weight", "to_rgbs.6.conv.modulation.weight", "to_rgbs.6.conv.modulation.bias", "noises.noise_13", "noises.noise_14"

Any idea how to fix this? I believe the model architecture is different for the smaller models.

On top of that issue, I noticed that lsundog-res256-paper256-kimg100000-noaug.pkl has n_mapping=8 and n_layers=7, while my custom 256 model trained with the original repo has n_mapping=2 and n_layers=7. Mine converts without error in export_weights.py, but then produces the error below (along with the one above) when running python generate.py --size 256 in stylegan2-pytorch:

    Missing key(s) in state_dict: "style.3.weight", "style.3.bias", "style.4.weight", "style.4.bias", "style.5.weight", "style.5.bias", "style.6.weight", "style.6.bias", "style.7.weight", "style.7.bias", "style.8.weight", "style.8.bias".

Why is this different from the sample 256 models?
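For reference, values like these can be read straight from a checkpoint with the ADA repo's own loader. A minimal sketch, assuming it is run from the stylegan2-ada-pytorch root (network.pkl is a placeholder path):

    # Inspect a StyleGAN2-ADA checkpoint before converting it.
    import dnnlib
    import legacy

    with dnnlib.util.open_url("network.pkl") as f:
        G = legacy.load_network_pkl(f)["G_ema"]

    print("resolution:    ", G.img_resolution)      # e.g. 256
    print("mapping layers:", G.mapping.num_layers)  # n_mapping: 8 for paper256, 2 for auto
    print("z/w dims:      ", G.z_dim, G.w_dim)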

I am getting a similar error trying to use this with the new restyle-encoder (yuval-alaluf/restyle-encoder#1), and the same error when trying to train the pSp encoder. My model is 512 resolution, though.

I managed to fix this issue by changing the number of mapping layers in stylegan2-pytorch/generate.py. As you mentioned, the rosinality fork hardcodes the number of mapping layers:

    args.n_mlp = 8

Changing it to your model's n_mapping should fix the problem. Another solution is to add both n_mlp and latent as command-line arguments:

    parser.add_argument(
        "--n_mlp",
        type=int,
        default=2,
        help="number of mapping layers",
    )
    parser.add_argument(
        "--latent",
        type=int,
        default=512,
        help="dimensionality of mapping layer",
    )

You can also just clone my fork of stylegan2-pytorch.
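With those flags in place, the generator in rosinality's generate.py can be built to match the exported checkpoint. A minimal sketch following rosinality's Generator signature (the inline values are examples):

    # Build the generator so its dimensions match the converted checkpoint.
    g_ema = Generator(
        args.size,            # e.g. 256 for a 256x256 model
        args.latent,          # 512 for the standard StyleGAN2 latent
        args.n_mlp,           # must equal the ADA model's n_mapping (2 for the auto config)
        channel_multiplier=args.channel_multiplier,
    ).to(device)
    checkpoint = torch.load(args.ckpt)
    g_ema.load_state_dict(checkpoint["g_ema"])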

This likely happens if you use the auto config in the ADA repo (my suggestion is to never use the auto config, it's... not great). As mentioned by @mycodeiscat, this will create a different count in the mapping layers. I'll try to add some additional arguments to the export script to cover this issue this week.
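For reference, a fixed config can be chosen at training time instead of auto; the --cfg flag in the ADA repo's train.py accepts, among others, paper256, paper512, and paper1024 (paths below are placeholders):

    python train.py --outdir=./results --data=./datasets/mydata.zip --gpus=1 --cfg=paper256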

My export_weights.py run reports an error when dealing with ffhq-res1024-mirror-stylegan2-noaug.pkl:
    Traceback (most recent call last):
      File "export_weights.py", line 82, in <module>
        convert()
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "export_weights.py", line 50, in convert
        G_nvidia = legacy.load_network_pkl(f)["G_ema"]
      File "/content/stylegan2-ada-pytorch/legacy.py", line 23, in load_network_pkl
        data = _LegacyUnpickler(f).load()
    _pickle.UnpicklingError: pickle data was truncated

Hope to receive your kind help. Thanks!
@dvschultz @mctrinkle

@MissDores looks like your file wasn't completely downloaded.

> I managed to fix this issue by changing the number of mapping layers in stylegan2-pytorch/generate.py. As you mentioned, the rosinality fork hardcodes args.n_mlp = 8. Changing it to your model's n_mapping should fix the problem. [...]

I've tried this solution, but it doesn't work in my case. I trained StyleGAN2-ADA at 256x256 in the conditional setting on a custom dataset, and I also used the auto config. Is the problem below caused by the auto config or by the conditional setting? What do you think? @dvschultz @mycodeiscat

    Traceback (most recent call last):
      File "generate.py", line 85, in <module>
        g_ema.load_state_dict(checkpoint["g_ema"])
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for Generator:
        size mismatch for style.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512]).

@hozfidan93 because of the auto config. There's something wrong with it.

Which cfg should I use when training stylegan2-ada on a custom dataset? In general I use auto, but in that case I am stuck because the converted model cannot generate.

์ด๊ฒƒ์€ autoADA ๋ฆฌํฌ์ง€ํ† ๋ฆฌ์—์„œ ๊ตฌ์„ฑ ์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒฝ์šฐ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค (์ œ ์ œ์•ˆ์€ ์ž๋™ ๊ตฌ์„ฑ์„ ์‚ฌ์šฉํ•˜์ง€ ์•Š๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๊ทธ๋‹ค์ง€ ์ข‹์ง€๋Š” ์•Š์Šต๋‹ˆ๋‹ค). @mycodeiscat์—์„œ ์–ธ๊ธ‰ ํ–ˆ๋“ฏ์ด ์ด๊ฒƒ์€ ๋งคํ•‘ ๋ ˆ์ด์–ด์—์„œ ๋‹ค๋ฅธ ๊ฐœ์ˆ˜๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฒˆ ์ฃผ์— ์ด ๋ฌธ์ œ๋ฅผ ๋‹ค๋ฃจ๊ธฐ ์œ„ํ•ด ๋‚ด๋ณด๋‚ด๊ธฐ ์Šคํฌ๋ฆฝํŠธ์— ๋ช‡ ๊ฐ€์ง€ ์ถ”๊ฐ€ ์ธ์ˆ˜๋ฅผ ์ถ”๊ฐ€ํ•˜๋ ค๊ณ  ํ•ฉ๋‹ˆ๋‹ค.

I did what you suggested: I trained on a custom 256x256 dataset with cfg=paper256, but generate.py still does not work properly.

Just wanted to make a note that a similar error can occur if the --size argument is incorrect; in that case the error reports unexpected keys instead of 'Missing key(s)'.
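For example, a converted 256 checkpoint must be run with a matching size (the checkpoint path is a placeholder):

    python generate.py --size 256 --ckpt converted_model.pt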

I had the size mismatch issue and was able to solve it by using --channel_multiplier 1 instead of the default 2.
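For reference, this flag is already defined in rosinality's generate.py; if your copy of the script lacks it, the same argument can be added alongside the others (a sketch mirroring the existing flags):

    parser.add_argument(
        "--channel_multiplier",
        type=int,
        default=2,
        help="channel multiplier of the generator (config-f uses 2, smaller models use 1)",
    )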

Where is --channel_multiplier 1, please?