Pinned Repositories
acora
Fast multi-keyword search engine for text strings
add-noise
Add noise at a specified SNR to audio files
addnoise
Script and Ubuntu executable for adding noise to the given WAV files
AEGAN-AD
Official PyTorch implementation of AEGAN-AD
AIPC
aliyun-oss-python-sdk
Aliyun OSS SDK for Python
audio-convolutional-neural-network
Convolutional neural network for audio (audio to image)
Audio2Spectrogram
This tool converts MP3 files to processable WAV files, splits WAV files into chunks, and generates spectrograms.
kws
An End-to-End Architecture for Keyword Spotting and Voice Activity Detection
Speaker-Diarization
Speaker diarization with uis-rnn and speaker embeddings from vgg-speaker-recognition
v-yunbin's Repositories
v-yunbin/AEGAN-AD
Official PyTorch implementation of AEGAN-AD
v-yunbin/AIPC
v-yunbin/bert4torch
PyTorch implementation of transformers, modeled on bert4keras
v-yunbin/Bert-VITS2-Integration-train-txt-infer
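A requirements.txt adapted for Windows, plus an API for segmented inference on long text and for listening to books on a phone. Not my field of expertise; the code is a mess.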
v-yunbin/emotion-finetuning-vits
v-yunbin/emotional-vits
An emotion-controllable speech synthesis model that requires no emotion annotations, based on VITS
v-yunbin/FastASR
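A project that implements ASR inference in C++. It has few dependencies, is simple to install, and runs inference quickly, including smoothly on ARM platforms such as the Raspberry Pi 4B. The inference model is based on the state-of-the-art Conformer model and was trained on the 10,000+ hour WenetSpeech dataset, so recognition quality is also good and comparable to many commercial ASR products.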
v-yunbin/FreeVC
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
v-yunbin/FunASR
A Fundamental End-to-End Speech Recognition Toolkit
v-yunbin/Genshin_Datasets
Genshin Datasets For SVC/SVS/TTS
v-yunbin/GenshinVoice
Voice dataset of Genshin Impact
v-yunbin/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
v-yunbin/icefall
v-yunbin/MoeGoe
Executable file for VITS inference
v-yunbin/PaddleSpeech
Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
v-yunbin/Paddlespeech-Streaming-ASR-GUI
v-yunbin/porcupine
On-device wake word detection powered by deep learning
v-yunbin/pykaldi
A Python wrapper for Kaldi
v-yunbin/realtime-vad-sample
Sample code of real-time voice activity detection using webrtcvad.
v-yunbin/RealtimeSTT
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription. Designed for real-time applications like voice assistants.
v-yunbin/Retrieval-based-Voice-Conversion-WebUI
Voice data <= 10 mins can also be used to train a good VC model!
v-yunbin/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
v-yunbin/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
v-yunbin/travel-chatbot
This project implements a travel chatbot powered by a RAG (Retrieval-Augmented Generation) chain, providing real-time information retrieval using various tools and the ability to fetch weather reports.
v-yunbin/vits
VITS implementation for Japanese, Chinese, Korean, Sanskrit and Thai
v-yunbin/VITS-fast-fine-tuning
This repo is a VITS fine-tuning pipeline for fast speaker-adaptation TTS and many-to-many voice conversion
v-yunbin/vits_chinese
Best-practice TTS based on BERT and VITS, with some features from Microsoft's NaturalSpeech; supports streaming output!
v-yunbin/VOSk-ASR-GUI-Demo-
v-yunbin/Whisper-Finetune
Fine-tune Whisper models and accelerate inference
v-yunbin/whisper-finetuning
[WIP] Scripts for fine-tuning Whisper