bfs18's Stars
yt-dlp/yt-dlp
A feature-rich command-line audio/video downloader
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
shadps4-emu/shadPS4
PS4 emulator for Windows, Linux, and macOS
THUDM/CogVideo
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
kyutai-labs/moshi
yangjianxin1/Firefly
Firefly: a training tool for large language models, supporting Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
facebookresearch/sapiens
High-resolution models for human tasks.
livekit/agents
Build real-time multimodal AI applications 🤖🎙️📹
AnswerDotAI/gpu.cpp
A lightweight library for portable low-level GPU computation using WebGPU.
TEN-framework/TEN-Agent
TEN Agent is a conversational AI powered by the TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compatible with popular workflow platforms like Dify and Coze.
huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT-4o
pytorch/torchtitan
A native PyTorch Library for large model training
VadimBoev/FlappyBird
Less than 100 Kilobytes. Works for Android 5.1 and above
homebrewltd/ichigo
Local realtime voice AI
EurekaLabsAI/ngram
The n-gram Language Model
LTH14/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
VITA-MLLM/VITA
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
pixeli99/SVD_Xtend
Stable Video Diffusion Training Code and Extensions.
EurekaLabsAI/micrograd
The Autograd Engine
hubertsiuzdak/snac
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
lucidrains/e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch
THUDM/CogView3
Text-to-image generation: CogView3-Plus and CogView3 (ECCV 2024)
EmilianPostolache/stable-audio-controlnet
Fine-tune Stable Audio Open with DiT ControlNet.
bfs18/rfwave
yangdongchao/Open-Training-Moshi
A reproduction of the training process for Moshi
bfs18/e2_tts
shiml20/FlowTurbo
Official PyTorch Implementation of "FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner"
facebookresearch/Qinco
Residual Quantization with Implicit Neural Codebooks
necludov/wl-mechanics