IIGROUP/TediGAN

Compared methods' pretrained models on the Multi-Modal-CelebA-HQ dataset


Hi, you introduce the Multi-Modal-CelebA-HQ dataset
and compare AttnGAN, ControlGAN, DF-GAN, and DM-GAN in Table 1 of your paper.

Can you provide links to these compared methods' pre-trained models on the Multi-Modal-CelebA-HQ dataset? Thanks!

Please find the pretrained models here.

Many thanks! Best wishes.

Hi, I tried to generate images using the provided models. When I test AttnGAN, the dictionary is built as follows:

# From AttnGAN's datasets.py; requires: from collections import defaultdict
def build_dictionary(self, train_captions, test_captions):
        word_counts = defaultdict(float)
        captions = train_captions + test_captions
        for sent in captions:
            for word in sent:
                word_counts[word] += 1

        # The >= 0 threshold keeps every word that appears in the captions.
        vocab = [w for w in word_counts if word_counts[w] >= 0]

        # Index 0 is reserved for the '<end>' token, so the final vocabulary
        # is one entry larger than vocab itself.
        ixtoword = {}
        ixtoword[0] = '<end>'
        wordtoix = {}
        wordtoix['<end>'] = 0
        ix = 1
        for w in vocab:
            wordtoix[w] = ix
            ixtoword[ix] = w
            ix += 1

        # Map each training caption to a list of word indices.
        train_captions_new = []
        for t in train_captions:
            rev = []
            for w in t:
                if w in wordtoix:
                    rev.append(wordtoix[w])
            # rev.append(0)  # do not need '<end>' token
            train_captions_new.append(rev)

        # Same mapping for the test captions.
        test_captions_new = []
        for t in test_captions:
            rev = []
            for w in t:
                if w in wordtoix:
                    rev.append(wordtoix[w])
            # rev.append(0)  # do not need '<end>' token
            test_captions_new.append(rev)

        return [train_captions_new, test_captions_new,
                ixtoword, wordtoix, len(ixtoword)]

The value of len(ixtoword) is 65, not the 64 in the provided pretrained model. This inconsistency leads to a size-mismatch error at line 441, text_encoder.load_state_dict(state_dict).
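
For anyone hitting the same error, you can see the mismatch directly by inspecting the embedding matrix stored in the checkpoint. A minimal sketch, assuming AttnGAN's RNN_ENCODER naming (the word embedding is an nn.Embedding module named encoder) and a placeholder file name text_encoder.pth:

import torch

# Sketch: read the vocabulary size baked into the text-encoder checkpoint.
# 'encoder.weight' follows AttnGAN's RNN_ENCODER naming; 'text_encoder.pth'
# is a placeholder path.
state_dict = torch.load('text_encoder.pth', map_location='cpu')
vocab_size, emb_dim = state_dict['encoder.weight'].shape
print(vocab_size)  # 64 for the provided model; load_state_dict fails when
                   # build_dictionary produces 65 entries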
When I change part of the code inside this function to:

        ixtoword = {}
        # ixtoword[0] = '<end>'
        wordtoix = {}
        # wordtoix['<end>'] = 0
        # ix = 1
        ix = 0
        for w in vocab:
            wordtoix[w] = ix
            ixtoword[ix] = w
            ix += 1

the value of len(ixtoword) is 64, and the program runs. But the results are far from those in TediGAN's paper (Figure 4). For example:

This man has bags under eyes and big nose. He has no beard.
[generated image: 0_s_0_g2]

Any tips?

I'm not sure, but based on this issue, adjusting the batch size in the eval_celeba.yml file to 25 might be helpful.
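
For reference, the change would look roughly like this, assuming eval_celeba.yml follows AttnGAN's standard cfg layout where the batch size lives under TRAIN (adjust to your actual config):

# eval_celeba.yml (sketch)
TRAIN:
  BATCH_SIZE: 25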

If needed, you can find the captions.pickle file here, and please place it in the data_dir.
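
As a quick sanity check, you can verify the vocabulary size directly; this sketch assumes the pickle follows AttnGAN's layout of [train_captions, test_captions, ixtoword, wordtoix] and uses a placeholder path:

import pickle

# Sketch: confirm the downloaded captions.pickle matches the provided
# text encoder. The 4-element layout follows AttnGAN's load_text_data.
with open('captions.pickle', 'rb') as f:
    train_captions, test_captions, ixtoword, wordtoix = pickle.load(f)
print(len(ixtoword))  # should be 64, matching the pretrained text encoder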

It's OK now after downloading the captions.pickle from the link you provided. It's surprising that you remember these details. So kind! Thanks again!
[generated image: 0_s_0_g2]

Hi, sorry to disturb you again.
I have generated images with AttnGAN, ControlGAN, and DM-GAN, but the pre-trained model (netG.pth) you provided for DF-GAN cannot be loaded successfully.

The pre-trained model's state dict contains keys such as block0, block1, ..., block6, while the generator definition in DF-GAN's current code does not match them. Although I tried to modify DF-GAN's code, I cannot reconstruct the version your provided model was trained with.
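
For reference, the mismatch can be confirmed by listing the parameter names stored in the checkpoint and comparing them with the current NetG definition; a minimal sketch, assuming netG.pth holds a plain state dict:

import torch

# Sketch: list the checkpoint's parameter names to compare against the
# current DF-GAN NetG definition ('netG.pth' as named in this thread).
state_dict = torch.load('netG.pth', map_location='cpu')
for name in sorted(state_dict):
    print(name)  # e.g. block0.*, block1.*, ..., block6.* in the provided model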

So I guess you may have changed the code when you trained DF-GAN on Multi-Modal-CelebA-HQ. Could you help me load the pre-trained model you provided?

To successfully load the pretrained model, you may need to use the code version from October 2020 (please refer to DF-GAN's commit history; actually, any version before June 21, 2022, should work).

Considering this, I recommend retraining the models with the latest code to ensure a fair comparison.

I see. Thanks for your response.