JierunChen's Stars
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
Kwai-Kolors/Kolors
Kolors Team
Tencent/HunyuanDiT
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
openai/improved-diffusion
Release for Improved Denoising Diffusion Probabilistic Models
jingyi0000/VLM_survey
Collection of AWESOME vision-language models for vision tasks
huggingface/swift-coreml-diffusers
Swift app demonstrating Core ML Stable Diffusion
crowsonkb/k-diffusion
Karras et al. (2022) diffusion models for PyTorch
ChenyangSi/FreeU
FreeU: Free Lunch in Diffusion U-Net (CVPR2024 Oral)
PixArt-alpha/PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
gnobitab/InstaFlow
:zap: InstaFlow! One-Step Stable Diffusion with Rectified Flow (ICLR 2024)
mini-sora/minisora
MiniSora: A community that aims to explore the implementation path and future development direction of Sora.
baofff/U-ViT
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
chuanyangjin/fast-DiT
Fast Diffusion Models with Transformers
willisma/SiT
Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
LambdaLabsML/lambda-diffusers
sail-sg/MDT
Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
whlzy/FiT
[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model
naver-ai/rope-vit
[ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"
stas00/ipyexperiments
Automatic GPU+CPU memory profiling, re-use and memory leaks detection using jupyter/ipython experiment containers
djghosh13/geneval
GenEval: An object-focused framework for evaluating text-to-image alignment
UCSC-VLAA/Recap-DataComp-1B
This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?"
yangcaoai/3DGS-DET
Official codes for paper: 3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection
eclipse-t2i/eclipse-inference
[CVPR 2024] Official PyTorch implementation of "ECLIPSE: Revisiting the Text-to-Image Prior for Efficient Image Generation"
EnVision-Research/LucidFusion
Official implementation of "LucidFusion: Generating 3D Gaussians with Arbitrary Unposed Images"
wusize/F-LMM
Code Release of F-LMM: Grounding Frozen Large Multimodal Models
Anthrapper/On-Device-Stable-Diffusion
On Device Stable Diffusion In Mobile Devices
eclipse-t2i/lambda-eclipse-inference
[TMLR] Official PyTorch implementation of "λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space"
JierunChen/Ref-L4
Evaluation code for Ref-L4, a new REC benchmark in the LMM era
czkoko/SD-CoreML-Generator
Stable Diffusion CoreML Model Multi-resolution Generation Tool. No conversion, No more disk space.