songweige

University of Maryland, College Park

songweige's Stars

NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
Language: Python · 9.5k · 2.1k
davidmcall/SDS-Bridge
Official Implementation of Rethinking Score Distillation as a Bridge Between Image Distributions
Language: Python · 152
linzhiqiu/t2v_metrics
Evaluating text-to-image/video/3D models with VQAScore
Language: Python · 13715
FoundationVision/OmniTokenizer
OmniTokenizer: one model and one weight for image-video joint tokenization.
Language: Python · 2014
NVlabs/CMD
Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition (ICLR 2024)
Language: Python · 21
lab4d-org/lab4d
A framework for 4D reconstruction from monocular videos.
Language: Python · 24215
web-arena-x/webarena
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
Language: Python · 65194
GaParmar/img2img-turbo
One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more
Language: Python · 1.4k · 145
mit-han-lab/distrifuser
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Language: Python · 51316
ExponentialML/Text-To-Video-Finetuning
Finetune ModelScope's Text To Video model using Diffusers 🧨
Language: Python · 647106
stas00/ml-engineering
Machine Learning Engineering Open Book
Language: Python · 10.3k · 618
nerfstudio-project/viser
Web-based 3D visualization + Python
Language: Python · 61634
openai/consistencydecoder
Consistency Distilled Diff VAE
Language: Python · 2.1k · 74
facebookresearch/TimeSformer
The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
Language: Python · 1.5k · 206
sihyun-yu/PVDM
Official PyTorch implementation of Video Probabilistic Diffusion Models in Projected Latent Space (CVPR 2023).
Language: Python · 29015
magic-wormhole/magic-wormhole
get things from one computer to another, safely
Language: Python · 18.5k · 606
facebookresearch/SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Language: Python · 6.4k · 1.2k
text2cinemagraph/text2cinemagraph
Text2Cinemagraph: Text-Guided Synthesis of Eulerian Cinemagraphs [SIGGRAPH ASIA 2023]
Language: Python · 35843
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Language: Python · 1.3k · 128
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
Language: Python · 137k · 26k
vye16/slahmr
Language: Python · 44450
Yushi-Hu/tifa
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
Language: Python · 1237
cmu-ci-lab/writing
Writing suggestions and resources for CIRL
Language: TeX · 26
wkentaro/gdown
Google Drive Public File Downloader when Curl/Wget Fails
Language: Python · 4.1k · 341
OpenGVLab/VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Language: Python · 46047
OpenGVLab/unmasked_teacher
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Language: Python · 27112
songweige/rich-text-to-image
Rich-Text-to-Image Generation
Language: Python · 74961
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Language: Python · 5.8k · 613
bfshi/TOAST
Official code for "TOAST: Transfer Learning via Attention Steering"
Language: Python · 18510
ssundaram21/dreamsim
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data (NeurIPS 2023 Spotlight)
Language: Python · 33316
