imxtx's Stars
mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
2noise/ChatTTS
A generative speech model for daily dialogue.
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
fishaudio/fish-speech
SOTA Open Source TTS
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama for WhatsApp & Messenger.
mamoe/mirai
高效率 QQ 机器人支持库
karpathy/micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
jiaaro/pydub
Manipulate audio with a simple and easy high level interface
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
google/latexify_py
A library to generate LaTeX expression from Python code.
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
huggingface/parler-tts
Inference and training library for high-quality TTS models.
MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
r9y9/wavenet_vocoder
WaveNet vocoder
mlabonne/llm-datasets
High-quality datasets, tools, and concepts for LLM fine-tuning.
lifeiteng/vall-e
PyTorch implementation of VALL-E(Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
microsoft/NeuralSpeech
0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
fighting41love/zhvoice
Chinese voice corpus. 中文语音语料,语音更加清晰自然,包含8个开源数据集,3200个说话人,900小时语音,1300万字。
jianchang512/gptsovits-api
适用于 GPT-SoVITS 的api调用接口
glory20h/VoiceLDM
VoiceLDM: Text-to-Speech with Environmental Context