jgrayla's Stars
wenqsun/DimensionX
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
xdit-project/mochi-xdit
Faster parallel inference of the mochi-1 video generation model
fishaudio/fish-speech
Brand new TTS solution
instantX-research/Regional-Prompting-FLUX
Training-free Regional Prompting for Diffusion Transformers 🔥
Jonseed/ComfyUI-Detail-Daemon
A port of muerrilla's sd-webui-Detail-Daemon as a node for ComfyUI, to adjust sigmas that control detail.
Tencent/Hunyuan3D-1
bytedance/1d-tokenizer
This repo contains the code for 1D tokenizer and generator
zzyunzhi/scene-language
The Scene Language: Representing Scenes with Programs, Words, and Embeddings (arXiv preprint)
aim-uofa/Framer
Official PyTorch implementation of "Framer: Interactive Frame Interpolation".
Hanbo-Cheng/DAWN-pytorch
Official implementation of Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for talking head Video Generation
gpt-omni/mini-omni2
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
genmoai/models
The best OSS video generation models
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
whlzy/FiT
[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model
Correr-Zhou/MagicTailor
Official implementation of the paper "MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models".
VectorSpaceLab/Video-XL
🔥🔥 First-ever hour-scale video understanding models
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
mit-han-lab/hart
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
viiika/Meissonic
I'm back! Implementations of Meissonic developed by the community. If you find it helpful, please consider giving a star ❤️
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
showlab/EvolveDirector
[NeurIPS 2024] EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.
westlake-baichuan-mllm/bc-omni
Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊
ironjr/StreamMultiDiffusion
Official code for the paper "StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control."
numz/Comfyui-FlowChain
Convert your workflows into nodes and chain them together
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
hustvl/ControlAR
Official code for "ControlAR: Controllable Image Generation with Autoregressive Models"
XmYx/deforum-comfy-nodes
Deforum ComfyUI Nodes - ai animation node package
IDGallagher/ComfyUI-IG-Motion-I2V
ComfyUI implementation of Motion-I2V
Suzie1/ComfyUI_Guide_To_Making_Custom_Nodes
A guide to making custom nodes in ComfyUI
apple/ml-depth-pro
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.