bfs18

bfs18's Stars

yt-dlp/yt-dlp
A feature-rich command-line audio/video downloader
Language:Python94.3k 533 8.3k7.4k
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language:Jupyter Notebook36.5k 330 4474.3k
shadps4-emu/shadPS4
PS4 emulator for Windows, Linux, and macOS
Language:C++11.6k 134 653780
THUDM/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Language:Python9.9k 127 478940
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Language:Python8.1k 110 158482
kyutai-labs/moshi
Language:Python7k 80 89550
yangjianxin1/Firefly
Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
Language:Python6k 55 281534
facebookresearch/sapiens
High-resolution models for human tasks.
Language:Python4.7k 45 164268
livekit/agents
Build real-time multimodal AI applications 🤖🎙️📹
Language:Python4.4k 52 359503
AnswerDotAI/gpu.cpp
A lightweight library for portable low-level GPU computation using WebGPU.
Language:C++3.8k 47 24176
TEN-framework/TEN-Agent
TEN Agent is a conversational AI powered by TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compatible with popular workflow platforms like Dify and Coze.
Language:Python3.7k 42 168362
huggingface/speech-to-speech
Speech To Speech: an effort toward an open-source, modular GPT-4o
Language:Python3.6k 44 91389
pytorch/torchtitan
A native PyTorch Library for large model training
Language:Python2.8k 44 198229
VadimBoev/FlappyBird
Less than 100 Kilobytes. Works for Android 5.1 and above
Language:C2.2k 11 30139
homebrewltd/ichigo
Local realtime voice AI
Language:Python1.9k 19 6991
EurekaLabsAI/ngram
The n-gram Language Model
Language:C1.4k 51 092
LTH14/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
Language:Python1.2k 18 7364
VITA-MLLM/VITA
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
Language:Python1.1k 40 5764
pixeli99/SVD_Xtend
Stable Video Diffusion Training Code and Extensions.
Language:Python634 13 6265
EurekaLabsAI/micrograd
The Autograd Engine
Language:HTML545 14 151
hubertsiuzdak/snac
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
Language:Python460 7 2426
lucidrains/e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch
Language:Python394 26 2336
THUDM/CogView3
Text-to-image generation: CogView3-Plus and CogView3 (ECCV 2024)
Language:Python271 14 1218
EmilianPostolache/stable-audio-controlnet
Fine-tune Stable Audio Open with DiT ControlNet.
Language:Python187 4 55
bfs18/rfwave
Language:Python111 4 58
yangdongchao/Open-Training-Moshi
A reproduction of the training process for Moshi
Language:Python74 7 05
bfs18/e2_tts
Language:Python62 6 177
shiml20/FlowTurbo
Official PyTorch Implementation of "FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner"
Language:Jupyter Notebook603
facebookresearch/Qinco
Residual Quantization with Implicit Neural Codebooks
Language:Python49 1 22
necludov/wl-mechanics
Language:Jupyter Notebook31 1 01