Totooroo's Stars
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
MooreThreads/Moore-AnimateAnyone
Character Animation (AnimateAnyone, Face Reenactment)
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
3DTopia/ThemeStation
[SIGGRAPH 2024] ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars
ex3ndr/supervoice-voicebox
VoiceBox neural network implementation
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
jiaweizzhao/GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
lukasHoel/text2room
Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV 2023).
willisma/SiT
Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
thu-ml/CRM
[ECCV 2024] Single Image to 3D Textured Mesh in 10 seconds with Convolutional Reconstruction Model.
huggingface/text-embeddings-inference
A blazing-fast inference solution for text embedding models
felipemaiapolo/tinyBenchmarks
Evaluating LLMs with fewer examples
Alpha-VLLM/LLaMA2-Accessory
An Open-source Toolkit for LLM Development
instantX-research/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
SillyTavern/SillyTavern
LLM Frontend for Power Users.
erew123/alltalk_tts
AllTalk is based on the Coqui TTS engine and is similar to the Coqui_tts extension for Text generation webUI, but it supports a variety of advanced features, such as a settings page, low-VRAM support, DeepSpeed, a narrator, model finetuning, custom models, and WAV file maintenance. It can also be used with third-party software via JSON calls.
openai/openai-cookbook
Examples and guides for using the OpenAI API
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
ollama/ollama
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
DefTruth/Awesome-LLM-Inference
📖 A curated list of Awesome LLM/VLM Inference papers with code, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
LianjiaTech/BELLE
BELLE: Be Everyone's Large Language model Engine (open-source Chinese dialogue large language model)
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
brycedrennan/imaginAIry
Pythonic AI generation of images and videos
AIGC-Audio/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head