GaryGao99's Stars

NVIDIA/mellotron
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
Language: Jupyter Notebook | Stars: 853 | Forks: 184
pshved/timeout
A script to measure and limit CPU time and memory consumption of black-box processes in Linux
Language:Perl48569
NVIDIA/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Language: Python | Stars: 8.3k | Forks: 1.4k
momstouch/tdnn_tensorflow
TDNN (time delay neural network) TensorFlow implementation
Language:Python94
bliunlpr/DeepSpeaker_PyTorch
Language:Python63
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
Language: Jupyter Notebook | Stars: 1.5k | Forks: 339
syang1993/gst-tacotron
A TensorFlow implementation of "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"
Language: Python | Stars: 368 | Forks: 110
xcmyz/FastSpeech
An implementation of FastSpeech based on PyTorch.
Language: Python | Stars: 854 | Forks: 211
upx/upx
UPX - the Ultimate Packer for eXecutables
Language: C++ | Stars: 14.1k | Forks: 1.3k
b04901014/FG-transformer-TTS
Official implementation of the paper "Fine-grained style control in transformer-based text-to-speech synthesis".
Language:Python8611
jjery2243542/adaptive_voice_conversion
Language:Python46989
resemble-ai/Resemblyzer
A python package to analyze and compare voices with deep learning
Language: Python | Stars: 2.7k | Forks: 420
JaidedAI/EasyOCR
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Language: Python | Stars: 23.6k | Forks: 3.1k
wiseman/py-webrtcvad
Python interface to the WebRTC Voice Activity Detector
Language: C | Stars: 2k | Forks: 403
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language: Python | Stars: 1.9k | Forks: 500
bojone/flow
Keras implementation of flow-based models
Language: Python | Stars: 220 | Forks: 43
ShivamRajSharma/Transformer-Architectures-From-Scratch
Implementation of transformer-based architectures in PyTorch.
Language:Python4818
descriptinc/melgan-neurips
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
Language: Python | Stars: 957 | Forks: 215
JeremyCCHsu/Python-Wrapper-for-World-Vocoder
A Python wrapper for the high-quality vocoder "World"
Language: Cython | Stars: 718 | Forks: 120
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Language: Python | Stars: 6.6k | Forks: 1.2k
Kyubyong/g2p
g2p: English Grapheme To Phoneme Conversion
Language: Python | Stars: 788 | Forks: 127
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Language: Python | Stars: 132k | Forks: 26.2k
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning models, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won the NAACL 2022 Best Demo Award.
Language: Python | Stars: 10.8k | Forks: 1.8k
WeidiXie/VGG-Speaker-Recognition
Utterance-level Aggregation For Speaker Recognition In The Wild
Language:Python36298
MontrealCorpusTools/Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
Language: Python | Stars: 1.3k | Forks: 242
Azure-Samples/cognitive-services-speech-sdk
Sample code for the Microsoft Cognitive Services Speech SDK
Language: C# | Stars: 2.8k | Forks: 1.8k
coqui-ai/STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
Language: C++ | Stars: 2.2k | Forks: 267
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language: Python | Stars: 30.1k | Forks: 6.4k
YoavRamon/awesome-kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi (http://kaldi-asr.org/)
53387
weiaicunzai/pytorch-cifar100
Practice on cifar100 (ResNet, DenseNet, VGG, GoogleNet, InceptionV3, InceptionV4, Inception-ResNetv2, Xception, Resnet In Resnet, ResNext, ShuffleNet, ShuffleNetv2, MobileNet, MobileNetv2, SqueezeNet, NasNet, Residual Attention Network, SENet, WideResNet)
Language: Python | Stars: 4.2k | Forks: 1.2k