Daisyqk's Stars
Picovoice/speech-to-text-benchmark
speech to text benchmark framework
SpeechColab/Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
wenet-e2e/WenetSpeech
A 10000+ hours dataset for Chinese speech recognition
shibing624/pycorrector
pycorrector is a toolkit for text error correction. 文本纠错,实现了Kenlm,T5,MacBERT,ChatGLM3,LLaMA等模型应用在纠错场景,开箱即用。
JushBJJ/Mr.-Ranedeer-AI-Tutor
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.
mapull/chinese-dictionary
中文汉语拼音辞典,汉字拼音字典,词典,成语词典,常用字、多音字字典数据库
BytedanceSpeech/seed-tts-eval
JusperLee/SPMamba
state-spaces/mamba
Mamba SSM architecture
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
fishaudio/fish-speech
Brand new TTS solution
p0p4k/vits2_pytorch
unofficial vits2-TTS implementation in pytorch
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
yeyupiaoling/Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
Vaibhavs10/fast-whisper-finetuning
fungtion/DANN_py3
python 3 pytorch implementation of DANN
shuaijiang/Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
TencentGameMate/chinese_speech_pretrain
chinese speech pretrained models
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
modelscope/KAN-TTS
KAN-TTS is a speech-synthesis training framework, please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
zycv/awesome-keyword-spotting
This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).
HolgerBovbjerg/data2vec-KWS
This repository contains code for applying Data2Vec to pretrain Keyword Transformer model as described in "Improving Label-Deficient Keyword Spotting Through Self-Supervised Pretraining".
fauxneticien/bnf_cnn_qbe-std
Query by example spoken term detection using bottleneck features and a convolutional neural network