Pinned Repositories
accelerate
🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision
ai-deployment
Focused on bringing AI models into production and on model deployment
audio-SNR
Mixing an audio file with a noise file at any Signal-to-Noise Ratio (SNR)
audio_diarization_annotation
Audio Diarization Annotation tool
auorange
Audio LPC (linear predictive coding) using a mel spectrogram, compatible with LPCNet
AutoSpeech
[InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei Zha, Zhangyang Wang
awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
BVAE-TTS
Official implementation of BVAE-TTS
chatbot-list
Industry talks and introductions on the applications, architectures, and algorithms of intelligent customer service and chatbots
espnet_tts_frontend
Text frontend for ESPnet TTS recipes
WanCaiYan's Repositories
WanCaiYan/espnet_tts_frontend
Text frontend for ESPnet TTS recipes
WanCaiYan/ai-deployment
Focused on bringing AI models into production and on model deployment
WanCaiYan/Chinese_Mandarin_TTS_Mayebe_v1
WanCaiYan/Crystal
Crystal: a C++ implementation of a unified framework for a multilingual TTS engine, with the SSML specification as its interface.
WanCaiYan/CycleGAN-VC2
Voice Conversion by CycleGAN (voice cloning / voice conversion)
WanCaiYan/DiDiSpeech
WanCaiYan/diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
WanCaiYan/HiFi-GAN
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
WanCaiYan/icassp2021-emotion-tts
WanCaiYan/label-studio
Label Studio is a multi-type data labeling and annotation tool with a standardized output format
WanCaiYan/langid.py
Stand-alone language identification system
WanCaiYan/LeetCodeAnimation
Animated walkthroughs of the solution approach for every LeetCode problem.
WanCaiYan/line_profiler
(OLD REPO) Line-by-line profiling for Python - Current repo ->
WanCaiYan/MS-Tacotron2
Tacotron2 based multi-speaker text to speech
WanCaiYan/P.808
This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Amazon Mechanical Turk as the crowdsourcing platform. It includes implementations for Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR).
WanCaiYan/papers-with-annotations
Research papers with annotations, illustrations and explanations
WanCaiYan/PPSpeech
PPSpeech: Phrase based Parallel End-to-End TTS System
WanCaiYan/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding
WanCaiYan/Shenlan-ASR-Course
Coursework for the Shenlan Academy (深蓝学院) speech course "Speech Recognition: From Beginner to Expert"
WanCaiYan/speech-synthesis-paper
List of speech synthesis papers.
WanCaiYan/speedyspeech
WanCaiYan/SqueezeFlow
Code Repository for "SqueezeFlow: Adaptive Text-to-Speech in Low Computational Resource Scenarios"
WanCaiYan/tacotron2
Forked from NVIDIA/tacotron2 and merged with Rayhane-mamah/Tacotron-2
WanCaiYan/Tacotron2_batch_inference
PyTorch Tacotron2 implementation that supports batch inference
WanCaiYan/TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, Korean, and Chinese, and is easy to adapt to other languages)
WanCaiYan/Voice-synthesis
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real time. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
WanCaiYan/WavAugment
A library for speech data augmentation in time-domain
WanCaiYan/wavegrad
A fast, high-quality neural vocoder.
WanCaiYan/wavelet_prosody_toolkit
WanCaiYan/zhvoice
Chinese voice corpus with clearer, more natural speech. It combines 8 open-source datasets, covering 3,200 speakers, 900 hours of speech, and 13 million characters.