lukedeo/keras-acgan

Question: softmax output has 10 units, but the training loop passes a single number?

Closed this issue · 3 comments

rjpg commented

Hello,

I was looking at the code, and congrats! It looks good and it is exactly what I was looking for. There are not many AC-GAN examples around.

There is one part I did not fully understand.

Normally, when doing classification, we have a softmax output with as many units as there are classes (like the discriminator's softmax output layer here).
We then have to one-hot encode the labels to turn them into a probability distribution that is "compatible" with the softmax layer, e.g. class 0 becomes the vector [1,0,0,0,0,0,0,0,0,0], class 1 becomes [0,1,0,0,0,0,0,0,0,0], and so on.

In the training loop, however, you pass the class directly, without one-hot encoding:

 epoch_gen_loss.append(combined.train_on_batch(
     [noise, sampled_labels.reshape((-1, 1))], [trick, sampled_labels]))

The variable sampled_labels holds the class index directly (without one-hot encoding), but the model has 10 units for this output:

aux = Dense(10, activation='softmax', name='auxiliary')(features)
return Model(input=image, output=[fake, aux])

The code works, so what am I missing?

Is one-hot encoding not necessary, or is it there and I just didn't see it?

thanks

If you look here, you can see sparse_categorical_crossentropy is used, which takes the integer class index rather than a one-hot encoding.
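
For illustration, a minimal toy sketch (not from the repo; the label values are made up) of what the two loss variants expect as targets:

    import numpy as np
    from keras.utils.np_utils import to_categorical

    labels = np.array([3, 0, 7])         # integer class indices, shape (3,)
    onehot = to_categorical(labels, 10)  # shape (3, 10), e.g. class 3 -> [0,0,0,1,0,0,0,0,0,0]

    # loss='categorical_crossentropy'        -> train with `onehot` as the target
    # loss='sparse_categorical_crossentropy' -> train with `labels` directly, as in the repo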

rjpg commented

Hello,

Thanks a lot. I did not notice that detail (I have never used the "sparse" version :-) )

Just one other thing to confirm:

When creating the input for the generator, the label is used to "give color to the noise" so it can guide the latent vector toward generating the right class, correct? And this is done by multiplying the random noise with an embedding of the label, right?

With this code:

    # this is the z space commonly referred to in GAN papers
    latent = Input(shape=(latent_size, ))

    # this will be our label
    image_class = Input(shape=(1,), dtype='int32')

    # 10 classes in MNIST
    cls = Flatten()(Embedding(10, latent_size,
                              init='glorot_normal')(image_class))

    # hadamard product between z-space and a class conditional embedding
    h = merge([latent, cls], mode='mul')  ## <-- HERE 

    fake_image = cnn(h)

    return Model(input=[latent, image_class], output=fake_image)

That is correct, yes. This isn't canonical (if I'm not mistaken, the original paper just adds a one-hot vector to the latent space), but I found this to be more intuitive and to work a bit better.
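
For comparison, a hypothetical sketch of the more common concatenation-style conditioning, written in the same Keras 1-style API as the repo (the name label_onehot is made up, and cnn would need a correspondingly larger input size):

    # alternative: concatenate a one-hot label vector to the latent vector
    latent = Input(shape=(latent_size, ))
    label_onehot = Input(shape=(10,))   # one-hot encoded class

    # append the label information instead of multiplying it in
    h = merge([latent, label_onehot], mode='concat', concat_axis=-1)

    fake_image = cnn(h)                 # cnn would need to accept latent_size + 10 inputs

    return Model(input=[latent, label_onehot], output=fake_image)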