Nicksooooo's Stars
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
microsoft/DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
lochenchou/MOSNet
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
v3ucn/CosyVoice_For_Windows
CosyVoice在Windows环境下使用的版本
THUDM/GLM-4-Voice
GLM-4-Voice | 端到端中英语音对话模型
pengzhendong/streaming-sensevoice
Pseudo Streaming SenseVoice with Hotwords
wdndev/llm_interview_note
主要记录大语言大模型(LLMs) 算法(应用)工程师相关的知识及面试题
shibing624/MedicalGPT
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. 训练医疗大模型,实现了包括增量预训练(PT)、有监督微调(SFT)、RLHF、DPO、ORPO。
sierra-research/tau-bench
Code and Data for Tau-Bench
lifeiteng/OmniSenseVoice
Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯
lovemefan/SenseVoice.cpp
Port of Funasr's Sense-voice model in C/C++
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
kaituoxu/Conv-TasNet
A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
JusperLee/Conv-TasNet
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation Pytorch's Implement
TaoRuijie/ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
noisetorch/NoiseTorch
Real-time microphone noise suppression on Linux.
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
ricky0123/vad
Voice activity detector (VAD) for the browser with a simple API
fixie-ai/ultravox
A fast multimodal LLM for real-time voice
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
2noise/ChatTTS
A generative speech model for daily dialogue.
seanzhang-zhichen/expand-baichuan-tokenizer
扩充百川大模型词表,其他模型也类似
xai-org/grok-1
Grok open release
ddlBoJack/emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
nickchen121/Pre-training-language-model
博客配套视频链接: https://space.bilibili.com/383551518?spm_id_from=333.1007.0.0 b 站直接看 配套 github 链接:https://github.com/nickchen121/Pre-training-language-model 配套博客链接:https://www.cnblogs.com/nickchen121/p/15105048.html