Daisyqk

Daisyqk's Stars

Picovoice/speech-to-text-benchmark
speech to text benchmark framework
Language: Python, 60864
SpeechColab/Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
Language: Python, 42960
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language: Python, 6.2k stars, 659 forks
wenet-e2e/WenetSpeech
A 10000+ hours dataset for Chinese speech recognition
Language: Shell, 49448
shibing624/pycorrector
pycorrector is a toolkit for text error correction. It applies Kenlm, T5, MacBERT, ChatGLM3, LLaMA, and other models to error-correction scenarios and works out of the box.
Language: Python, 5.5k stars, 1.1k forks
JushBJJ/Mr.-Ranedeer-AI-Tutor
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.
28.6k stars, 3.3k forks
mapull/chinese-dictionary
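Chinese pinyin dictionary: a database of character pinyin dictionaries, word dictionaries, idiom dictionaries, and dictionaries of common and polyphonic characters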
457109
BytedanceSpeech/seed-tts-eval
Language: Python, 95797
JusperLee/SPMamba
Language: Python, 11815
state-spaces/mamba
Mamba SSM architecture
Language: Python, 12.8k stars, 1.1k forks
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Language: Shell, 8.7k stars, 546 forks
fishaudio/fish-speech
Brand new TTS solution
Language: Python, 13k stars, 967 forks
p0p4k/vits2_pytorch
unofficial vits2-TTS implementation in pytorch
Language: Python, 47785
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Language: Python, 16k stars, 1.6k forks
yeyupiaoling/Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
Language: C, 836133
Vaibhavs10/fast-whisper-finetuning
Language: Jupyter Notebook, 44037
fungtion/DANN_py3
Python 3 PyTorch implementation of DANN
Language: Python, 50797
shuaijiang/Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
Language: C, 1728
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Language: Python, 69k stars, 8.1k forks
TencentGameMate/chinese_speech_pretrain
Chinese speech pretrained models
Language: Shell, 1k stars, 83 forks
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
Language: C++, 3.3k stars, 380 forks
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
Language: Python, 7.6k stars, 755 forks
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python, 7.3k stars, 625 forks
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
Language: Python, 4.6k stars, 577 forks
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Language: Python, 33.7k stars, 3.9k forks
modelscope/KAN-TTS
KAN-TTS is a speech-synthesis training framework, please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
Language: Python, 48779
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python, 4.5k stars, 387 forks
zycv/awesome-keyword-spotting
This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).
23938
HolgerBovbjerg/data2vec-KWS
This repository contains code for applying Data2Vec to pretrain Keyword Transformer model as described in "Improving Label-Deficient Keyword Spotting Through Self-Supervised Pretraining".
Language: Python, 255
fauxneticien/bnf_cnn_qbe-std
Query by example spoken term detection using bottleneck features and a convolutional neural network
Language: Python, 81