jxzhanggg/nonparaSeq2seqVC_code

gate outputs?

JRMeyer opened this issue · 5 comments

There are many mentions of "gate" in the code, for example:

gate_outputs: gate outputs from the decoder

What is this gate, and what does it do?

Also, here are gate_targets:

gate_target [batch_size, T]

The gate output of the decoder is a flag indicating whether the decoding process has reached the last step.
If decoding has finished, the gate output should be > 0.5; otherwise, it should be < 0.5.
The second paragraph of Section-III-D of our paper says:

In order to end the acoustic feature sequences generated
by the seq2seq decoder at the conversion stage, the hidden
state of the seq2seq decoder at each frame is projected to
a scalar followed by sigmoid activation to predict whether
current frame is the last frame in an utterance. Accordingly,
a cross entropy loss $L_{ED}$ is defined for this prediction at the
training stage.
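
To make the quoted description concrete, here is a minimal sketch in plain Python of what it describes: a per-frame projection to a scalar, a sigmoid, a cross-entropy loss against 0/1 targets, and the 0.5 threshold used to stop decoding. It is not the repository's code. All function names, the toy weights, and the target layout (1 only on the last frame) are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def stop_probabilities(hidden_states, weights, bias):
    """Project each frame's hidden state to a scalar, then apply a sigmoid.

    hidden_states: list of T frames, each a list of floats (hypothetical layout).
    weights, bias: parameters of the linear projection (hypothetical names).
    """
    return [
        sigmoid(sum(w * h for w, h in zip(weights, frame)) + bias)
        for frame in hidden_states
    ]


def stop_targets(num_frames):
    """Assumed target layout: 0 for every frame except the last, which is 1."""
    return [0.0] * (num_frames - 1) + [1.0]


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)


def first_stop_frame(probs, threshold=0.5):
    """Index of the first frame whose stop probability exceeds the threshold.

    Returns None if no frame crosses the threshold.
    """
    for i, p in enumerate(probs):
        if p > threshold:
            return i
    return None


# Toy example: three frames with two-dimensional hidden states.
hidden = [[-2.0, 0.5], [-1.0, 0.5], [3.0, 0.5]]
probs = stop_probabilities(hidden, weights=[1.0, 0.0], bias=0.0)
loss = binary_cross_entropy(probs, stop_targets(len(hidden)))
print(first_stop_frame(probs), round(loss, 4))
```

At conversion time the decoder would stop at the frame returned by first_stop_frame. During training only the loss is used, so the threshold plays no role there.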

I think that gate should be changed to another word, because gate_output is ambiguous between "the outputs of a gate" and "gate the outputs", and the word gate is overloaded in machine learning.

How about is_final_frame?

Yes, that's a good suggestion, and I think it will be clearer.
By the way, the name "gate_output" is actually inherited from NVIDIA's Tacotron 2 repo.

Because "is_final_frame" is a bit long, I've replaced "gate" with "stop" in commit f8320cd.