Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis

Paper

Link: https://arxiv.org/pdf/1910.10288.pdf
Year: 2019

Summary

  • uses simple location-relative attention mechanisms that do away with content-based query/key comparisons, in order to handle out-of-domain text
  • introduces Dynamic Convolution Attention (DCA), a new location-relative attention mechanism in the additive energy-based family; a rough sketch appears below
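A minimal sketch of one DCA step, to make the summary concrete. It is written from the paper's high-level description rather than the authors' code; the layer sizes, kernel sizes, and the uniform `prior_filter` placeholder (the paper derives its prior from a beta-binomial distribution) are assumptions for illustration.

```python
# Hedged sketch of Dynamic Convolution Attention (DCA); shapes and sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicConvolutionAttention(nn.Module):
    def __init__(self, query_dim=512, attn_dim=128,
                 static_channels=8, static_kernel=21,
                 dynamic_channels=8, dynamic_kernel=21,
                 prior_kernel=11):
        super().__init__()
        # Static filters: a fixed bank of 1-D convolutions over the previous
        # alignment, as in location-sensitive attention.
        self.static_conv = nn.Conv1d(1, static_channels, static_kernel,
                                     padding=static_kernel // 2, bias=False)
        # Dynamic filters: predicted from the decoder query at every step,
        # then convolved with the previous alignment.
        self.dynamic_proj = nn.Linear(query_dim, dynamic_channels * dynamic_kernel)
        self.dynamic_channels = dynamic_channels
        self.dynamic_kernel = dynamic_kernel
        # Fixed prior filter nudging the alignment forward; a uniform
        # placeholder here (assumption), beta-binomial in the paper.
        prior = torch.ones(prior_kernel) / prior_kernel
        self.register_buffer("prior_filter", prior.view(1, 1, -1))
        self.prior_kernel = prior_kernel
        # Additive energy terms: note there is no content-based query/key term.
        self.W_static = nn.Linear(static_channels, attn_dim, bias=False)
        self.W_dynamic = nn.Linear(dynamic_channels, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, query, prev_alignment):
        """query: (B, query_dim); prev_alignment: (B, T) over encoder steps."""
        B, T = prev_alignment.shape
        prev = prev_alignment.unsqueeze(1)                        # (B, 1, T)

        # Static location features from the previous alignment.
        f = self.static_conv(prev).transpose(1, 2)                # (B, T, C_s)

        # Dynamic location features: per-example filters computed from the query,
        # applied to the previous alignment via a grouped convolution.
        filters = self.dynamic_proj(query).view(
            B * self.dynamic_channels, 1, self.dynamic_kernel)
        g = F.conv1d(prev.view(1, B, T), filters,
                     padding=self.dynamic_kernel // 2, groups=B)
        g = g.view(B, self.dynamic_channels, T).transpose(1, 2)   # (B, T, C_d)

        # Log-domain prior term encouraging forward movement.
        p = F.conv1d(prev, self.prior_filter, padding=self.prior_kernel // 2)
        p = torch.log(p.squeeze(1).clamp_min(1e-6))               # (B, T)

        # Additive energies built only from location-relative terms.
        energy = self.v(torch.tanh(self.W_static(f) + self.W_dynamic(g))).squeeze(-1) + p
        return F.softmax(energy, dim=-1)                          # new alignment (B, T)
```

At each decoder step the previous alignment and the decoder state produce the next alignment; the encoder outputs are never compared against the query, which is what lets the mechanism generalize to utterance lengths unseen in training.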

Results

  • Dynamic Convolution Attention and V2 GMM attention with initial bias (GMMv2b) are able to generalize to utterances much longer than those seen during training, while preserving naturalness on shorter utterances
  • improved speed and consistency of alignment during training
  • advantages of DCA over GMM attention:
  1. DCA can more easily bound its receptive field, which makes it easier to incorporate hard windowing optimizations in production
  2. its attention weights are normalized, which helps to stabilize the alignment, especially for coarse-grained alignment tasks

Comments