Pinned Repositories
AlexK-PL.github.io
AlexMIIS.github.io
Blizzard2013_Segmentation
Transcripts and segmentation for the Blizzard 2013 audiobooks, also known as the Lessac or Blizzard 2013 dataset.
GST-Tacotron
A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
GST_Tacotron2
An adaptation of NVIDIA's PyTorch Tacotron2 with unsupervised Global Style Tokens. The model was trained on the English read-speech LJSpeech dataset.
GST_Tacotron2_PitchContourReference
An adaptation of NVIDIA's PyTorch Tacotron2 with unsupervised Global Style Tokens. Instead of using the whole mel-scale spectrogram as the GST input, we extracted and used only the pitch contour in a sparse representation. The model was trained on the English read-speech LJSpeech dataset.
Neural_TTS_Tacotron2_pytorch
A PyTorch implementation of a text-to-speech system based on NVIDIA's Tacotron2 text2mel plus a neural vocoder
tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
Tacotron2-1
A PyTorch implementation of Tacotron2, an end-to-end text-to-speech (TTS) system described in "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions".
Tacotron2_GST_SPM
Our work on learning speaking style in speech synthesis using only the pitch frequency sub-band as a speaker reference. We trained a modified version of NVIDIA's Tacotron2 model that includes Global Style Tokens (GST).
AlexK-PL's Repositories
AlexK-PL/GST_Tacotron2
An adaptation of NVIDIA's PyTorch Tacotron2 with unsupervised Global Style Tokens. The model was trained on the English read-speech LJSpeech dataset.
AlexK-PL/GST_Tacotron2_PitchContourReference
An adaptation of NVIDIA's PyTorch Tacotron2 with unsupervised Global Style Tokens. Instead of using the whole mel-scale spectrogram as the GST input, we extracted and used only the pitch contour in a sparse representation. The model was trained on the English read-speech LJSpeech dataset.
AlexK-PL/Neural_TTS_Tacotron2_pytorch
A PyTorch implementation of a text-to-speech system based on NVIDIA's Tacotron2 text2mel plus a neural vocoder
AlexK-PL/Tacotron2_GST_SPM
Our work on learning speaking style in speech synthesis using only the pitch frequency sub-band as a speaker reference. We trained a modified version of NVIDIA's Tacotron2 model that includes Global Style Tokens (GST).
AlexK-PL/GST-Tacotron
A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
AlexK-PL/tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
AlexK-PL/Tacotron2-1
A PyTorch implementation of Tacotron2, an end-to-end text-to-speech (TTS) system described in "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions".
AlexK-PL/AlexK-PL.github.io
AlexK-PL/AlexMIIS.github.io
AlexK-PL/Blizzard2013_Segmentation
Transcripts and segmentation for the Blizzard 2013 audiobooks, also known as the Lessac or Blizzard 2013 dataset.
AlexK-PL/git_course_python
AlexK-PL/Hierarchical-Neural-Autoencoder-1
AlexK-PL/lafrescat-audio-demo
AlexK-PL/marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
AlexK-PL/MaryTTS-Paragraph_patterns
An extension of the MaryTTS 5.3 SNAPSHOT version. It adds rule-based and internal post-processing implementations to support paragraph feature patterns.
AlexK-PL/ProsodyModifier
A modifier tool for the open-source MaryTTS platform that inserts prosody information taken from either a recording or a statistical model
AlexK-PL/punkProse
Punctuation generation for speech transcripts using lexical and prosodic features
AlexK-PL/Spoon-Knife
This repo is for demonstration purposes only.
AlexK-PL/TTS
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
AlexK-PL/waveglow
A Flow-based Generative Network for Speech Synthesis