Smoothing the activations at the output of the transformer
zaouk opened this issue · 0 comments
zaouk commented
Hey there,
I was wondering if you encountered any issues related to smoothing the speaker activations predicted by the Transformer model. An encoder-only Transformer tends to output speaker activations that are not as smooth over time as the ones produced by recurrent models such as Bi-LSTMs.
Did you resort to any tricks for smoothing the output activations of the Transformer, or was this not an issue at all?
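
To make the question concrete, the kind of post-processing I have in mind is something like a median filter applied to each speaker's frame-level activations before thresholding. The snippet below is only a sketch of that idea, not something taken from this repository: the function name `median_smooth`, the kernel size, the 0.5 threshold and the example values are all assumptions for illustration.

```python
from statistics import median


def median_smooth(activations, kernel_size=11):
    """Median-filter one speaker's per-frame activations.

    Hypothetical helper (not part of this repo). Edge values are
    replicated so the output has the same length as the input.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    if not activations:
        return []
    half = kernel_size // 2
    padded = [activations[0]] * half + list(activations) + [activations[-1]] * half
    # Each output frame is the median of the window centred on it.
    return [median(padded[i:i + kernel_size]) for i in range(len(activations))]


# Made-up activations for one speaker: a one-frame dropout (index 2)
# and a one-frame spike (index 7).
probs = [0.9, 0.8, 0.1, 0.9, 0.9, 0.1, 0.1, 0.8, 0.1, 0.1]
smoothed = median_smooth(probs, kernel_size=3)
decisions = [p > 0.5 for p in smoothed]  # assumed threshold of 0.5
```

With a window of 3 frames, the isolated dropout and the isolated spike are both removed, so the thresholded decisions form a single contiguous speech segment followed by silence. A wider window would also remove longer glitches, at the cost of blurring short turns.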