bryandlee/FreezeG

Training configuration for different models

justinpinkney opened this issue · 3 comments

Hi! Really nice work, the examples are great. I was wondering if you had listed the training configuration somewhere, specifically which resolution layers you froze? I'm also wondering if you did any experiments with freezing different subsets of the modules within each conv?

For context, I worked on the "post training" version of this method, i.e. swapping the trained layers from one model into another (see https://arxiv.org/abs/2010.05334). When I was doing this work, I thought that something like your FreezeG approach might be a more direct way of getting the same effect. I tried a couple of runs myself at the time but could never get good results (the network failed to learn to generate good images). Curious if you have any insight into things you tried that didn't work?

Hi! Thanks for your interest. Unfortunately, I did not keep the full configuration of the experiments, but some important parameters that I used are

  • batch=8, use ADA
  • face2met: finetune_loc=2, iter=1000
  • cat2wild: finetune_loc=3, iter=3000

The numbers are not tuned; I simply chose fewer fine-tuning layers and iterations when the source and target domains are close. Freezing different submodules of the model, rather than the entire styled-conv layer, didn't work well in my case.
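For readers following along, here is a minimal sketch of how I read the freezing scheme, assuming a rosinality-style StyleGAN2 generator with `convs` and `to_rgbs` attributes and assuming `finetune_loc` counts resolution blocks from the output end of the generator; the actual FreezeG code may organize this differently.

```python
import torch

def freeze_generator(generator, finetune_loc=2):
    # Sketch only: freeze everything, then unfreeze the last `finetune_loc`
    # resolution blocks (two styled convs + one to_rgb per resolution in the
    # rosinality layout). Attribute names are assumptions, not FreezeG's code.
    for p in generator.parameters():
        p.requires_grad = False

    for conv in list(generator.convs)[-2 * finetune_loc:]:
        for p in conv.parameters():
            p.requires_grad = True
    for to_rgb in list(generator.to_rgbs)[-finetune_loc:]:
        for p in to_rgb.parameters():
            p.requires_grad = True

    # Only the unfrozen parameters should go to the optimizer.
    return [p for p in generator.parameters() if p.requires_grad]


# Usage sketch:
# g_optim = torch.optim.Adam(freeze_generator(generator, finetune_loc=2), lr=0.002)
```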

I actually tried a layer-swapping method similar to yours as well but failed to make it work nicely. In my experience, the problem was the misalignment of input domains for the layers from different models, and I guess the smooth interpolation technique is crucial. (I really like the Disney-style cartoonification results, btw.)
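For comparison, here is a rough sketch of the post-training layer-swapping/blending idea from the paper above, assuming two generators with matching state_dict keys; the `layer_resolution` helper is hypothetical and would need to map each parameter name to the resolution of its block.

```python
import copy
import math
import torch

def blend_generators(base_g, finetuned_g, swap_res=32, blend_width=1.0):
    # Sketch only: keep low-resolution layers from the base model, take
    # high-resolution layers from the fine-tuned model, and smoothly
    # interpolate weights around `swap_res` instead of hard-swapping.
    blended = copy.deepcopy(base_g)
    base_sd = base_g.state_dict()
    fine_sd = finetuned_g.state_dict()
    out_sd = {}
    for name, w_base in base_sd.items():
        if not torch.is_floating_point(w_base):
            out_sd[name] = w_base  # leave non-float buffers untouched
            continue
        res = layer_resolution(name)  # hypothetical helper: resolution of this layer, or None
        if res is None:
            alpha = 0.0  # e.g. mapping network: keep the base weights
        else:
            # 0 below the swap point, 1 at/above it, linear ramp in between
            alpha = min(max((math.log2(res) - math.log2(swap_res)) / blend_width + 1.0, 0.0), 1.0)
        out_sd[name] = torch.lerp(w_base, fine_sd[name], alpha)
    blended.load_state_dict(out_sd)
    return blended
```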

@bryandlee Could you tell me the configuration for face_to_simpson? And I guess you used the default settings for ADA, right?

I don't have the exact parameters, but it should be something like finetune_loc=3 or 4 and iter > 3000 with the default ADA settings. The dataset can be found here: #2 (comment).