kollobn's Stars

samuela/torch2jax
Run PyTorch in JAX. 🤝
Language:Python2165
nene1212/MaskGCT-Training
Training code for MaskGCT-T2S model.
Language:Python184
DmitryRyumin/ICASSP-2023-24-Papers
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
Language:Python42717
iwangjian/Paper-Reading-ConvAI
📖 Paper reading list in conversational AI (constantly updating 🤗).
993162
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Language: Python · 9.8k stars · 949 forks
SpeechColab/Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
Language:Python46464
modelscope/ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Language: Python · 2.1k stars · 145 forks
fishaudio/fish-speech
SOTA Open Source TTS
Language: Python · 18.5k stars · 1.4k forks
andysingal/llm-course
Language:Jupyter Notebook42745
halsay/ASR-TTS-paper-daily
Updates ASR papers every day
Language:Python1077
Wataru-Nakata/miipher
Unofficial implementation of miipher
Language:Python11516
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek3, ...) and 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...).
Language: Python · 5.1k stars · 444 forks
kaistmm/Audio-Mamba-AuM
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"
Language:Python12013
yeyupiaoling/MASR
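A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with a variety of data augmentation methods.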
Language:Python631110
s-nlp/transformers-course
Materials of transformers lecture course
Language:Jupyter Notebook8614
huggingface/evaluation-guidebook
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
Language:Jupyter Notebook95959
NiuTrans/ABigSurveyOfLLMs
A collection of 150+ surveys on LLMs
23719
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Language: Python · 9.1k stars · 1.2k forks
e2b-dev/awesome-ai-agents
A list of AI autonomous agents
13.3k stars · 995 forks
gpt-omni/mini-omni2
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Language: Python · 1.6k stars · 211 forks
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
6.3k stars · 346 forks
pytorch/torchtitan
A PyTorch native library for large model training
Language: Python · 3.2k stars · 247 forks
kyutai-labs/moshi
Language: Python · 7.2k stars · 563 forks
yangdongchao/RSTnet
Real-time Speech-Text Foundation Model Toolkit (wip)
Language:Python12611
segment-any-text/wtpsplit
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
Language:Python82446
FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System
Language:Python53341
EmulationAI/awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
64237
aiola-lab/whisper-medusa
Whisper with Medusa heads
Language:Python81850
Lordog/dive-into-llms
"Dive into LLMs" (动手学大模型): a series of hands-on programming tutorials
4.2k stars · 367 forks
karpathy/LLM101n
LLM101n: Let's build a Storyteller
31.1k stars · 1.7k forks