WanCaiYan

Pinned Repositories

accelerate
🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision
Language: Python
ai-deployment
Focuses on taking AI models to production and on model deployment
Language: Jupyter Notebook
audio-SNR
Mixing an audio file with a noise file at any Signal-to-Noise Ratio (SNR)
Language: Python
audio_diarization_annotation
Audio Diarization Annotation tool
Language: JavaScript
auorange
Audio LPC (linear predictive coding) using mel spectrograms, compatible with LPCNet
Language: Python
AutoSpeech
[InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei Zha, Zhangyang Wang
Language: Python
awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
BVAE-TTS
Official implementation of BVAE-TTS
Language: Python
chatbot-list
Industry talks and introductions covering the applications, architectures, and algorithms of intelligent customer service and chatbots
espnet_tts_frontend
Text frontend for ESPnet tts recipes
Language: Python

WanCaiYan's Repositories

WanCaiYan/espnet_tts_frontend
Text frontend for ESPnet tts recipes
Language: Python
WanCaiYan/ai-deployment
Focuses on taking AI models to production and on model deployment
Language: Jupyter Notebook
WanCaiYan/Chinese_Mandarin_TTS_Mayebe_v1
Language: Python
WanCaiYan/Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
Language: C++
WanCaiYan/CycleGAN-VC2
Voice Conversion by CycleGAN (voice cloning / voice conversion)
Language: Python
WanCaiYan/DiDiSpeech
Language: HTML
WanCaiYan/diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
WanCaiYan/HiFi-GAN
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language: Python
WanCaiYan/icassp2021-emotion-tts
Language: Python
WanCaiYan/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Language: JavaScript
WanCaiYan/langid.py
Stand-alone language identification system
Language: Python
WanCaiYan/LeetCodeAnimation
Demonstrates LeetCode problems, and the reasoning behind their solutions, in the form of animations.
Language: Java
WanCaiYan/line_profiler
(OLD REPO) Line-by-line profiling for Python - Current repo ->
Language: Python
WanCaiYan/MS-Tacotron2
Tacotron2 based multi-speaker text to speech
Language: Jupyter Notebook
WanCaiYan/P.808
This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Amazon Mechanical Turk as the crowdsourcing platform. It includes implementations for Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR).
Language: HTML
WanCaiYan/papers-with-annotations
Research papers with annotations, illustrations and explanations
WanCaiYan/PPSpeech
PPSpeech: Phrase based Parallel End-to-End TTS System
WanCaiYan/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding
WanCaiYan/Shenlan-ASR-Course
Coursework for the Shenlan Academy (深蓝学院) speech course "Speech Recognition: From Beginner to Mastery"
WanCaiYan/speech-synthesis-paper
List of speech synthesis papers.
WanCaiYan/speedyspeech
Language: Python
WanCaiYan/SqueezeFlow
Code Repository for "SqueezeFlow: Adaptive Text-to-Speech in Low Computational Resource Scenarios"
WanCaiYan/tacotron2
Forked from NVIDIA/tacotron2 and merged with Rayhane-mamah/Tacotron-2
Language: Python
WanCaiYan/Tacotron2_batch_inference
PyTorch Tacotron 2 implementation that supports batch inference
Language: Python
WanCaiYan/TensorFlowTTS
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, Korean, and Chinese, and is easy to adapt to other languages)
Language: Python
WanCaiYan/Voice-synthesis
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real time. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
Language: Python
WanCaiYan/WavAugment
A library for speech data augmentation in time-domain
WanCaiYan/wavegrad
A fast, high-quality neural vocoder.
Language: Python
WanCaiYan/wavelet_prosody_toolkit
Language: Python
WanCaiYan/zhvoice
Chinese voice corpus with clearer, more natural speech. It includes 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.