actuy's Stars
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
jhao104/proxy_pool
Python ProxyPool for web spider
soumith/ganhacks
starter from "How to Train a GAN?" at NIPS2016
facebookresearch/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
nl8590687/ASRT_SpeechRecognition
A Deep-Learning-Based Chinese Speech Recognition System
openai/jukebox
Code for the paper "Jukebox: A Generative Model for Music"
keithito/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
bytedance/GiantMIDI-Piano
bytedance/piano_transcription
lowerquality/gentle
gentle forced aligner
MontrealCorpusTools/Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
YannickJadoul/Parselmouth
Praat in Python, the Pythonic way
guoday/Tencent2020_Rank1st
Code for the 2020 Tencent College Algorithm Contest; the online result ranked 1st.
Tonejs/Midi
Convert MIDI into Tone.js-friendly JSON
yy1lab/Lyrics-Conditioned-Neural-Melody-Generation
santi-pdp/segan_pytorch
Speech Enhancement Generative Adversarial Network in PyTorch
HLTSingapore/Emotional-Speech-Data
This is the GitHub page for publicly available emotional speech data.
music-x-lab/POP909-Dataset
This is the dataset repository for the paper: POP909: A Pop-song Dataset for Music Arrangement Generation
danmic/av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
hsinyuan-huang/FlowQA
Implementation of conversational QA model: FlowQA (with slight improvement)
JusperLee/Looking-to-Listen-at-the-Cocktail-Party
Executable code based on Google's paper
JeremyCCHsu/vqvae-speech
Tensorflow implementation of the speech model described in Neural Discrete Representation Learning (a.k.a. VQ-VAE)
iamyuanchung/speech2vec-pretrained-vectors
Speech2vec pre-trained word vectors
DDMAL/jSymbolic2
2nd Version of jSymbolic
cifkao/ismir2019-music-style-translation
The code for the ISMIR 2019 paper “Supervised symbolic music style translation using synthetic data”.
meelement/noise_adversarial_tacotron
Reproduction of paper: Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization
isl-mt/fluent-fisher
eastonYi/Unsupervised-ASR
Unsupervised ASR (mainly a phone classifier) using EODM and GAN