Pinned Repositories
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
Alice_split_toolset
Split audio using an .srt file, clean up the annotations, then merge and package the result into a reasonably standard format suitable for bert-vits2.
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
audio-preprocess
Preprocess Audio for training
auto-VITS-DataLabeling
Simple data labeling script that uses Alibaba's FunASR to annotate VITS training data.
Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
awesome-resume
Resumes, resume templates, and sample resume phrasing for programmers.
BBDown
Bilibili Downloader. A command-line downloader for Bilibili.
bert
TensorFlow code and pre-trained models for BERT
WeTextProcessing
Text Normalization & Inverse Text Normalization
clumsyroot's Repositories
clumsyroot/audio-preprocess
Preprocess Audio for training
clumsyroot/Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
clumsyroot/BBDown
Bilibili Downloader. A command-line downloader for Bilibili.
clumsyroot/Bert-VITS2
vits2 backbone with multilingual-bert
clumsyroot/chinese-dictionary
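Chinese pinyin dictionary data: a character pinyin dictionary, a word dictionary, an idiom dictionary, and a database of common and polyphonic characters.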
clumsyroot/WeTextProcessing
Text Normalization & Inverse Text Normalization
clumsyroot/ChatLM-mini-Chinese
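A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources all code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with an example of fine-tuning for triple information extraction.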
clumsyroot/ChatTTS
TTS
clumsyroot/CosyVoice
LLM based TTS model, providing inference/training/deployment full-stack ability.
clumsyroot/crepe
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
clumsyroot/CUDA-Learn-Notes
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated occasionally: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
clumsyroot/espnet
End-to-End Speech Processing Toolkit
clumsyroot/fish-diffusion
An easy-to-understand TTS / SVS / SVC framework
clumsyroot/frp
A fast reverse proxy to help you expose a local server behind a NAT or firewall to the internet.
clumsyroot/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications such as text-to-speech synthesis and music generation.
clumsyroot/GPT-SoVITS
Just 1 minute of voice data can be used to train a good TTS model!
clumsyroot/languagecodec
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models
clumsyroot/Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
clumsyroot/megatts2
Unofficial implementation of Megatts2
clumsyroot/melfusion
clumsyroot/MoeVoiceStudio
An audio processing application written in C++
clumsyroot/nendo
The Nendo AI Audio Tool Suite
clumsyroot/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
clumsyroot/PL-BERT
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
clumsyroot/pythonic-project-guidelines
Set of guidelines and structure of a Python project.
clumsyroot/ReFlow-VAE-SVC
clumsyroot/SenseVoice
Multilingual Voice Understanding Model
clumsyroot/tts-frontend-dataset
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
clumsyroot/VITS-Batched-Inference
Efficient batched inference scripts for VITS TTS model, supporting single-process and multiprocessing modes.
clumsyroot/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis