bfs18's Stars

THUDM/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Language: Python, 9.9k stars, 940 forks
janhq/ichigo
Local realtime voice AI
Language: Python, 2.1k stars, 110 forks
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language: Jupyter Notebook, 36.5k stars, 4.3k forks
LTH14/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
Language: Python, 1.2k stars, 64 forks
pytorch/torchtitan
A native PyTorch Library for large model training
Language: Python, 2.8k stars, 228 forks
AnswerDotAI/gpu.cpp
A lightweight library for portable low-level GPU computation using WebGPU.
Language: C++, 3.8k stars, 176 forks
necludov/wl-mechanics
Language: Jupyter Notebook 311
EurekaLabsAI/micrograd
The Autograd Engine
Language: HTML 54852
EurekaLabsAI/ngram
The n-gram Language Model
Language: C, 1.4k stars, 92 forks
pixeli99/SVD_Xtend
Stable Video Diffusion Training Code and Extensions.
Language: Python 63465
facebookresearch/Qinco
Residual Quantization with Implicit Neural Codebooks
Language: Python 492
UbiquitousLearning/mllm
Fast Multimodal LLM on Mobile Devices
Language: C++ 62269
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python, 32.6k stars, 5k forks
microsoft/T-MAC
Low-bit LLM inference on CPU with lookup table
Language: C++ 63048
lucidrains/voicebox-pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
Language: Python 62353
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python, 8k stars, 601 forks
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
Language: Python, 923 stars, 111 forks
test-time-training/ttt-lm-pytorch
Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Language: Python, 1.1k stars, 64 forks
Delgan/loguru
Python logging made (stupidly) simple
Language: Python, 20.4k stars, 707 forks
DarkAutumn/triforce
A deep learning agent for The Legend of Zelda (nes)
Language: Python 172
winddori2002/DEX-TTS
DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability
Language: Python 958
KwaiVGI/LivePortrait
Bring portraits to life!
Language: Python, 13.4k stars, 1.4k forks
spyoungtech/FreeSimpleGUI
The free-forever GUI library
Language: Python 39848
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Language: Python, 8.8k stars, 848 forks
gcorso/disco-diffdock
Code for the paper DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents, ICML 2024
Language: Python 752
fishaudio/fish-speech
SOTA Open Source TTS
Language: Python, 17.7k stars, 1.3k forks
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
79448
NUS-HPC-AI-Lab/VideoSys
VideoSys: An easy and efficient system for video generation
Language: Python, 1.8k stars, 126 forks
karpathy/LLM101n
LLM101n: Let's build a Storyteller
30.7k stars, 1.7k forks
bojone/FSQ
Keras implementation of Finite Scalar Quantization
Language: Python 675