cuichenrui2000
Master's student diving into Deep Learning and Speech Processing! AI enthusiast exploring the world of sound. Aspiring speech recognition expert!
Tianjin University
Beijing, China
Pinned Repositories
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Awesome-Speech-Language-Model
Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.
barry_speech_tools
This repository documents Barry's journey in learning deep learning for speech processing. Here, you'll find scripts and code snippets related to environment setup, data preprocessing, speech frontend, speech recognition, voice conversion, speech synthesis, and more. Let's explore the fascinating world of speech processing together!
ChatTTS
A generative speech model for daily dialogue.
ChenruiCui.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
faster-whisper
Faster Whisper transcription with CTranslate2
Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
whisper.cpp
Port of OpenAI's Whisper model in C/C++
whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
cuichenrui2000's Repositories
cuichenrui2000/barry_speech_tools
This repository documents Barry's journey in learning deep learning for speech processing. Here, you'll find scripts and code snippets related to environment setup, data preprocessing, speech frontend, speech recognition, voice conversion, speech synthesis, and more. Let's explore the fascinating world of speech processing together!
cuichenrui2000/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
cuichenrui2000/Awesome-Speech-Language-Model
Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.
cuichenrui2000/ChatTTS
A generative speech model for daily dialogue.
cuichenrui2000/ChenruiCui.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
cuichenrui2000/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
cuichenrui2000/dns_mos_calculate
Code for calculating DNS_MOS.
cuichenrui2000/faster-whisper
Faster Whisper transcription with CTranslate2
cuichenrui2000/GitHub-Chinese-Top-Charts
GitHub Chinese Top Charts: separate "Software | Resources" charts for each language, pinpointing good Chinese-language projects. Take what you need and learn efficiently.
cuichenrui2000/gss
A simple package for Guided source separation (GSS)
cuichenrui2000/ICMC-ASR_Baseline
The baseline system for the ICASSP2024 ICMC-ASR Challenge.
cuichenrui2000/Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
cuichenrui2000/whisper.cpp
Port of OpenAI's Whisper model in C/C++
cuichenrui2000/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
cuichenrui2000/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
cuichenrui2000/MindSpore4Speech
cuichenrui2000/moshi
cuichenrui2000/open-speech-data
A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
cuichenrui2000/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
cuichenrui2000/PM-EVC
This is the official implementation of "A Controllable Emotion Voice Conversion Framework with Pre-trained Speech Representations".
cuichenrui2000/pyroomacoustics
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
cuichenrui2000/pytorch-book
PyTorch tutorials and fun projects including neural talk, neural style, poem writing, anime generation (ใๆทฑๅบฆๅญฆไน ๆกๆถPyTorch๏ผๅ ฅ้จไธๅฎๆใ)
cuichenrui2000/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
cuichenrui2000/Retrieval-based-Voice-Conversion-WebUI
Voice data <= 10 mins can also be used to train a good VC model!
cuichenrui2000/RUI_SE
The official repo of "A Refining Underlying Information Framework for Speech Enhancement"
cuichenrui2000/SCTK
cuichenrui2000/SenseVoice
Multilingual Voice Understanding Model
cuichenrui2000/so-vits-svc
SoftVC VITS Singing Voice Conversion
cuichenrui2000/StreamVC
An unofficial pytorch implementation of "STREAMVC: REAL-TIME LOW-LATENCY VOICE CONVERSION".
cuichenrui2000/UER-py
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo