chenshuo20's Stars
HengyiWang/spann3r
3D Reconstruction with Spatial Memory
cocktailpeanut/fluxgym
Dead simple FLUX LoRA training UI with LOW VRAM support
Drexubery/ViewCrafter
Official implementation of "ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis"
Picsart-AI-Research/StreamingT2V
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
liuff19/ReconX
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model
btsmart/splatt3r
Official repository for Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs
Vchitect/Latte
Latte: Latent Diffusion Transformer for Video Generation.
showlab/Show-o
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
NUS-HPC-AI-Lab/VideoSys
VideoSys: An easy and efficient system for video generation
Bujiazi/MotionClone
Official implementation of MotionClone: Training-Free Motion Cloning for Controllable Video Generation
Florian-Barthel/splatviz
Fully Python-based interactive 3D Gaussian Splatting viewer for real-time editing and analysis.
XLabs-AI/x-flux
PixArt-alpha/PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
Vchitect/VEnhancer
Official code for VEnhancer: Generative Space-Time Enhancement for Video Generation
DL3DV-10K/Dataset
News: the 10k dataset is ready for download.
apple/ml-mdm
Train high-quality text-to-image diffusion models in a data & compute efficient manner
SAIS-FUXI/VidGen
THUDM/CogVideo
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
black-forest-labs/flux
Official inference repo for FLUX.1 models
IDEA-Research/TAPTR
[ECCV 2024] Official implementation of the paper "TAPTR: Tracking Any Point with Transformers as Detection"
NVlabs/InstantSplat
InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds
colmap/glomap
GLOMAP - Global Structure-from-Motion Revisited
nianticlabs/acezero
[ECCV 2024 - Oral] ACE0 is a learning-based structure-from-motion approach that estimates camera parameters of sets of images by learning a multi-view consistent, implicit scene representation.
lilygoli/SpotLessSplats
Code for SpotLessSplats: Ignoring Distractors in 3D Gaussian Splatting, built on the gsplat codebase.
louaaron/Score-Entropy-Discrete-Diffusion
[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
mihirp1998/VADER
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models, such as VideoCrafter, OpenSora, ModelScope, and StableVideoDiffusion, by finetuning them with various reward models, such as HPS, PickScore, VideoMAE, VJEPA, YOLO, and Aesthetics.
huggingface/trl
Train transformer language models with reinforcement learning.
Doubiiu/DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.