ubisoft/ubisoft-laforge-animation-dataset

A question about discriminator in paper

yuyujunjun opened this issue · 1 comments

Hello @XefPatterson:
I am studying your brilliant work. Following your paper, I wrote a PyTorch-based implementation. However, it is hard for me to reproduce your results.
The situation I met
Without the discriminator, training for 75,000 iterations produces some reasonable results. (In my experiments, PyTorch's RNN converges more slowly than TensorFlow's RNN.) But the feet slide slightly on the ground, so I added the discriminator to alleviate this sliding problem. However, my training loss suddenly increased after adding the discriminator (while the discriminator's loss decreased).
In your paper, you scale all of your losses to be approximately equal on the LaFAN1 dataset for an untrained network, but I am not sure how to scale the discriminator loss because it differs from run to run (roughly between 1 and 50).
Discriminator structure
So I suspect that I have implemented the discriminators incorrectly. They are structured as follows:
Conv1d(in_channels=3+(num_joints-1)*6, out_channels=512, kernel_size=10 or 2) -> ReLU() -> Conv1d(512, 256, 1) -> ReLU() -> Conv1d(256, 1, 1),
where 3+(num_joints-1)*6 is the dimension of the vector formed by concatenating the root velocity, all bones' offsets relative to the root bone, and all bones' velocities, as described in your paper. The output layer is a plain feed-forward layer, i.e. it has no activation.
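The structure above can be sketched in PyTorch as follows. This is a minimal sketch of what I described, not code from the paper; the class name, the `num_joints=22` default (the LaFAN1 skeleton size), and the `(batch, channels, frames)` input layout are my assumptions.

```python
import torch
import torch.nn as nn

class MotionDiscriminator(nn.Module):
    """Conv1d discriminator sketch, as described in the issue text.

    Assumptions (not from the paper): the class name, num_joints=22,
    and the input layout (batch, channels, frames).
    """
    def __init__(self, num_joints=22, window=10):
        super().__init__()
        # Root velocity (3) + per-non-root-bone offset and velocity (6 each).
        in_channels = 3 + (num_joints - 1) * 6
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 512, kernel_size=window),
            nn.ReLU(),
            nn.Conv1d(512, 256, kernel_size=1),
            nn.ReLU(),
            nn.Conv1d(256, 1, kernel_size=1),  # linear output, no activation
        )

    def forward(self, x):
        # x: (batch, in_channels, frames) -> scores: (batch, num_windows)
        return self.net(x).squeeze(1)

# Quick shape check: 30 frames with window=10 gives 21 sliding windows.
d = MotionDiscriminator(num_joints=22, window=10)
scores = d(torch.randn(4, 3 + 21 * 6, 30))
print(scores.shape)  # torch.Size([4, 21])
```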
Loss calculation process
The output has shape (batch_size, number of sliding windows), and the loss is calculated as follows:
loss=mean(square(subtract(output,label)))
I first calculate the loss between the generated result and the label 1, and the optimizer steps to update the generator's weights. Then I calculate the discriminator's loss as follows:
loss(generated result,0)+loss(ground truth, 1)
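The two loss terms above amount to a least-squares GAN objective. A minimal sketch, where the function name and the stand-in score tensors are mine for illustration:

```python
import torch

def lsgan_loss(scores, label):
    """Least-squares loss against a scalar label, i.e.
    loss = mean(square(subtract(output, label))) from the text above."""
    return torch.mean((scores - label) ** 2)

# Hypothetical per-window discriminator scores, shape (batch, num_windows).
fake_scores = torch.zeros(4, 21)   # stand-in for D(generated motion)
real_scores = torch.ones(4, 21)    # stand-in for D(ground-truth motion)

# Generator step: push the score on generated motion toward 1.
g_adv_loss = lsgan_loss(fake_scores, 1.0)

# Discriminator step: push generated toward 0 and real toward 1.
d_loss = lsgan_loss(fake_scores, 0.0) + lsgan_loss(real_scores, 1.0)

print(g_adv_loss.item(), d_loss.item())  # 1.0 0.0
```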

Optimizer
I use the same optimizer settings and the same loss form for the discriminator as for the generator.
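For reference, here is how I alternate the two updates. The stand-in linear modules and the Adam choice are illustrative assumptions, not taken from the paper; the point is the update order and detaching the generated data before the discriminator step.

```python
import torch
import torch.nn as nn

# Stand-in networks (illustrative only; my real G is an RNN and my real D
# is the Conv1d stack described above).
G = nn.Linear(16, 8)
D = nn.Linear(8, 1)
g_opt = torch.optim.Adam(G.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(D.parameters(), lr=1e-3)

x = torch.randn(4, 16)     # conditioning input
real = torch.randn(4, 8)   # ground-truth sample

# Generator step: score the generated data against label 1.
fake = G(x)
g_loss = ((D(fake) - 1.0) ** 2).mean()
g_opt.zero_grad()
g_loss.backward()
g_opt.step()

# Discriminator step: detach the generated data so only D is updated.
fake = G(x).detach()
d_loss = (D(fake) ** 2).mean() + ((D(real) - 1.0) ** 2).mean()
d_opt.zero_grad()
d_loss.backward()
d_opt.step()

print(g_loss.item() >= 0.0, d_loss.item() >= 0.0)  # True True
```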

I have been stuck for a few days trying to reproduce results similar to yours. I would greatly appreciate it if you could point out any mistakes I have made or any potential problems. Thank you very much!

Closing this as it is not linked to the dataset itself.
You can write to me detailed questions on the paper at c212.felixh@gmail.com