syang1993/gst-tacotron

A TensorFlow implementation of "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"

Python

Issues

training stalls for many seconds while creating a new queue of data
#50 opened 3 years ago by Adibian
0
shape of linear_outputs is not the same as during training
#49 opened 3 years ago by Mihir-Gajera1
0
Add style weights when there is no reference audio
#48 opened 3 years ago by zh1yu4nyu
0
Using pre-trained model of Keithito's tacotron implementation
#47 opened 3 years ago by hariduttt
0
Regarding the trained model
#46 opened 3 years ago by hariduttt
0
eval from checkpoints
#4 opened 7 years ago by marymirzaei
30
Training with custom data
#23 opened 6 years ago by wanshun123
2
Unable to reproduce results
#44 opened 4 years ago by Anchit1999
0
Pretrained Model
#26 opened 6 years ago by harismuneer
3
Mumbling in synthesis
#45 opened 4 years ago by a-froghyar
1
Style Token Layer implementation question
#1 opened 7 years ago by acetylSv
20
Please update link to Blizzard data
#28 opened 6 years ago by simonkingedinburgh
4
How to achieve style embedding with different weights of each token without reference audio?
#29 opened 6 years ago by bitwangyujia
2
Pretrained Weights
#43 opened 5 years ago by ashish-roopan
1
preprocessing the training data
#2 opened 7 years ago by marymirzaei
2
What is in reference audio path?
#42 opened 5 years ago by Thien223
0
can we synthesize speaker-A's tone with speaker-B's prosody?
#41 opened 5 years ago by niu0717
0
Path for Reference Audio
#38 opened 5 years ago by shrinidhin
1
error in eval.py
#39 opened 5 years ago by 1105060120
1
Check failed: dnnReLUCreateBackward_F32
#40 opened 5 years ago by miyoungvkim
1
Error in datafeeder.py
#37 opened 5 years ago by shrinidhin
1
poor alignment when conditioned on reference audios
#20 opened 6 years ago by mohsinjuni
4
Why use 'tf.layers.conv1d' for the query and key transformations instead of a fully connected layer?
#36 opened 5 years ago by LEEYOONHYUNG
0
where do you insert or import the wav files of the model's voice for training?
#35 opened 5 years ago by pnwseeker
0
Reference Encoder Padding
#34 opened 6 years ago by its-sandy
0
Some problems when preprocessing ljspeech dataset
#33 opened 6 years ago by Charlottecuc
1
Training Multi-Speaker Model.
#21 opened 6 years ago by sujithpadar
5
Throws "data must be floating-point" exception after 1k steps
#25 opened 6 years ago by ishandutta2007
12
No clear speech
#32 opened 6 years ago by ErnstTmp
7
GMM Attention
#31 opened 6 years ago by ErnstTmp
5
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [12,2262,80] vs. [12,2000,80]
#30 opened 6 years ago by ErnstTmp
4
how long will I wait for the Blizzard dataset confirmation?
#27 opened 6 years ago by XiaooaiX
5
Why is there some blank audio in the synthesized wav file when we use reference audio generation?
#24 opened 6 years ago by begeekmyfriend
0
core dumped error when preprocessing ljspeech dataset
#22 opened 6 years ago by wanshun123
2
Sample Alignment Graph
#10 opened 6 years ago by fazlekarim
4
Train as a Tacotron1 script problem
#15 opened 6 years ago by dazenhom
4
poor alignment when synthesizing long sentences
#19 opened 6 years ago by moonnee
1
Tone transfer
#13 opened 6 years ago by switchzts
4
the model is hard to converge with LJSpeech
#18 opened 6 years ago by zyj008
1
poor alignment with test out-of-collection data
#16 opened 6 years ago by butterl
3
How to integrate this code with r9y9's wavenet_vocoder?
#14 opened 6 years ago by rishikksh20
13
Eval on soft voices
#17 opened 6 years ago by fazlekarim
0
multi head attention
#12 opened 6 years ago by Young-Sun
2
Preprocessing blizzard 2013 data
#11 opened 6 years ago by jsonko
2
training time
#8 opened 7 years ago by Young-Sun
4
data feeder error
#6 opened 7 years ago by fazlekarim
6
What would happen if we merged datasets?
#5 opened 7 years ago by fazlekarim
3