Nicksooooo

Nicksooooo's Stars

SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Language: Python | Stars: 7.4k | Forks: 906
microsoft/DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
Language: Python | Stars: 1.1k | Forks: 412
lochenchou/MOSNet
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
Language:Python34664
v3ucn/CosyVoice_For_Windows
A version of CosyVoice for use on Windows
Language:Python48776
THUDM/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
Language: Python | Stars: 2.3k | Forks: 188
pengzhendong/streaming-sensevoice
Pseudo Streaming SenseVoice with Hotwords
Language:Python8914
wdndev/llm_interview_note
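Notes covering knowledge and interview questions for large language model (LLM) algorithm and application engineers
Language: HTML | Stars: 3.9k | Forks: 438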
shibing624/MedicalGPT
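MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Language: Python | Stars: 3.4k | Forks: 505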
sierra-research/tau-bench
Code and Data for Tau-Bench
Language:Python20525
lifeiteng/OmniSenseVoice
Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯
Language:Python74029
lovemefan/SenseVoice.cpp
Port of Funasr's Sense-voice model in C/C++
Language:C16511
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Language: Python | Stars: 9k | Forks: 1.4k
asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
Language: Python | Stars: 2.3k | Forks: 423
kaituoxu/Conv-TasNet
A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
Language: Python | Stars: 682 | Forks: 153
JusperLee/Conv-TasNet
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation Pytorch's Implement
Language:Python43677
TaoRuijie/ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
Language: Python | Stars: 612 | Forks: 115
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Language: Python | Stars: 4.4k | Forks: 431
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
Language: C++ | Stars: 3.7k | Forks: 427
noisetorch/NoiseTorch
Real-time microphone noise suppression on Linux.
Language: Go | Stars: 9.4k | Forks: 233
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Language: Python | Stars: 135k | Forks: 27.1k
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Language: Python | Stars: 1.2k | Forks: 84
ricky0123/vad
Voice activity detector (VAD) for the browser with a simple API
Language: TypeScript | Stars: 902 | Forks: 143
fixie-ai/ultravox
A fast multimodal LLM for real-time voice
Language: Python | Stars: 1.4k | Forks: 88
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Language: Python | Stars: 6.4k | Forks: 697
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Language: Python | Stars: 3.5k | Forks: 317
2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python | Stars: 32.6k | Forks: 3.5k
seanzhang-zhichen/expand-baichuan-tokenizer
Expands the vocabulary of the Baichuan large language model; the approach is similar for other models
Language:Python6
xai-org/grok-1
Grok open release
Language: Python | Stars: 49.6k | Forks: 8.3k
ddlBoJack/emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Language:Python65349
nickchen121/Pre-training-language-model
Companion videos for the blog (watch directly on Bilibili): https://space.bilibili.com/383551518?spm_id_from=333.1007.0.0 Companion GitHub repository: https://github.com/nickchen121/Pre-training-language-model Companion blog: https://www.cnblogs.com/nickchen121/p/15105048.html
36185