Pinned Repositories
100DaysOfSystemDesign
Documenting resources and notes for learning system design.
anime_style_transfer_pytorch
Anime style transfer with PyTorch
animegan2-pytorch
PyTorch implementation of AnimeGANv2
ASR-with-DFCNN-and-Transformer
Speech Recognition with DFCNN and Transformer
awesome-computer-vision
A curated list of awesome computer vision resources
awesome-document-understanding
A curated list of resources on Document Understanding (DU)
awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
FinRL
FinRL: Financial Reinforcement Learning Framework. Please star. 🔥
JointCapPunc
Official PyTorch implementation of Vietnamese Capitalization and Punctuation Recovery Models
VB-TTS-datasets
A Vietnamese professional female accent dataset for speech synthesis tasks
ductho9799's Repositories
ductho9799/denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
ductho9799/PyTorch-GAN
PyTorch implementations of Generative Adversarial Networks.
ductho9799/Cross-Lingual-Voice-Cloning
Tacotron 2 - PyTorch implementation with faster-than-realtime inference modified to enable cross lingual voice cloning.
ductho9799/Forward
A library for high performance deep learning inference on NVIDIA GPUs.
ductho9799/vietpunc
Vietnamese Punctuation Prediction using Pretrained Language Models
ductho9799/SpeechTransProgress
Tracking the progress in end-to-end speech translation
ductho9799/WavAugment
A library for speech data augmentation in time-domain
ductho9799/Fre-GAN-pytorch
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
ductho9799/Expressive-FastSpeech2
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
ductho9799/edgedict
Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)
ductho9799/WaveGrad-1
Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
ductho9799/video-streamer
A realtime object detector that streams video and text to a Flask-SocketIO server.
ductho9799/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
ductho9799/wavegrad
A fast, high-quality neural vocoder.
ductho9799/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
ductho9799/speedyspeech
ductho9799/vPhon
A Vietnamese phonetizer
ductho9799/espnet
End-to-End Speech Processing Toolkit
ductho9799/DisVoice
Feature extraction from speech signals
ductho9799/ecg-classification
ductho9799/TTS
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
ductho9799/DeepLearningExamples
Deep Learning Examples
ductho9799/TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, Korean, Chinese, and German, and is easy to adapt to other languages)
ductho9799/BVAE-TTS
Official implementation of BVAE-TTS
ductho9799/OpenSeq2Seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
ductho9799/StreamingTransformer
ductho9799/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
ductho9799/flask-music-streaming
A simple Flask app for streaming music
ductho9799/tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
ductho9799/nlp-in-practice
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.