songweige's Stars
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
davidmcall/SDS-Bridge
Official Implementation of Rethinking Score Distillation as a Bridge Between Image Distributions
linzhiqiu/t2v_metrics
Evaluating text-to-image/video/3D models with VQAScore
FoundationVision/OmniTokenizer
OmniTokenizer: one model and one weight for image-video joint tokenization.
NVlabs/CMD
Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition (ICLR 2024)
lab4d-org/lab4d
A framework for 4D reconstruction from monocular videos.
web-arena-x/webarena
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
GaParmar/img2img-turbo
One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more
mit-han-lab/distrifuser
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
ExponentialML/Text-To-Video-Finetuning
Finetune ModelScope's Text To Video model using Diffusers 🧨
stas00/ml-engineering
Machine Learning Engineering Open Book
nerfstudio-project/viser
Web-based 3D visualization + Python
openai/consistencydecoder
Consistency Distilled Diff VAE
facebookresearch/TimeSformer
The official PyTorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
sihyun-yu/PVDM
Official PyTorch implementation of Video Probabilistic Diffusion Models in Projected Latent Space (CVPR 2023).
magic-wormhole/magic-wormhole
get things from one computer to another, safely
facebookresearch/SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
text2cinemagraph/text2cinemagraph
Text2Cinemagraph: Text-Guided Synthesis of Eulerian Cinemagraphs [SIGGRAPH ASIA 2023]
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
vye16/slahmr
Yushi-Hu/tifa
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
cmu-ci-lab/writing
Writing suggestions and resources for CIRL
wkentaro/gdown
Google Drive Public File Downloader when Curl/Wget Fails
OpenGVLab/VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
OpenGVLab/unmasked_teacher
[ICCV 2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
songweige/rich-text-to-image
Rich-Text-to-Image Generation
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
bfshi/TOAST
Official code for "TOAST: Transfer Learning via Attention Steering"
ssundaram21/dreamsim
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data (NeurIPS 2023 Spotlight)