Pinned Repositories
asteroid
The PyTorch-based audio source separation toolkit for researchers || Pretrained models available
Beamforming-for-speech-enhancement
Simple delay-and-sum, MVDR, and CGMM-MVDR beamforming
bss
Engineering Expo: Sound Source Separation Team
caffe
Caffe: a fast open framework for deep learning.
caffe-tensorflow
Caffe models in TensorFlow
CapsNet-Tensorflow
A TensorFlow implementation of CapsNet (Capsule Network) from Hinton's paper "Dynamic Routing Between Capsules"
capsule-networks
A TensorFlow implementation of Capsule Networks
CGMM-MVDR
Implementation of the CGMM-MVDR beamforming
ChatTTS
ChatTTS is a generative speech model for daily dialogue.
clone-voice
A sound cloning tool with a web interface that uses your voice or any other sound to record audio
wyn314's Repositories
wyn314/asteroid
The PyTorch-based audio source separation toolkit for researchers || Pretrained models available
wyn314/Beamforming-for-speech-enhancement
Simple delay-and-sum, MVDR, and CGMM-MVDR beamforming
wyn314/bss
Engineering Expo: Sound Source Separation Team
wyn314/ChatTTS
ChatTTS is a generative speech model for daily dialogue.
wyn314/clone-voice
A sound cloning tool with a web interface that uses your voice or any other sound to record audio
wyn314/Conv-TasNet
Deep Neural Network for Speaker Separation
wyn314/FloWaveNet
A PyTorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"
wyn314/Forward
A library for high performance deep learning inference on NVIDIA GPUs.
wyn314/jhu-neural-wpe
Neural Dereverberation
wyn314/LPCNet
Efficient neural speech synthesis
wyn314/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
wyn314/MS-SNSD
The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired.
wyn314/onssen
An open-source speech separation and enhancement library
wyn314/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
wyn314/pase
Problem Agnostic Speech Encoder
wyn314/pyroomacoustics
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
wyn314/Python
All Algorithms implemented in Python
wyn314/pytorch-kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
wyn314/resemble-enhance
AI powered speech denoising and enhancement
wyn314/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
wyn314/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector
wyn314/speech-dereverberation
speech-dereverberation-using-GANs
wyn314/Speech-Separation-Paper-Tutorial
Must-read papers on neural-network-based speech separation
wyn314/Tacotron-2
DeepMind's Tacotron-2 TensorFlow implementation
wyn314/tacotron2-1
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
wyn314/TasNet-tensorflow
A TensorFlow implementation of TasNet (ICASSP 2018)
wyn314/uis-rnn
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
wyn314/vall-e
PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
wyn314/VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
wyn314/waveglow
A PyTorch implementation of WaveGlow: A Flow-based Generative Network for Speech Synthesis