Pinned Repositories
DyDiT
The official implementation of "Dynamic Diffusion Transformer" (ICLR 2025) and "DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation" (arXiv 2025).
Dynamic-Diffusion-Transformer
Dynamic-Tuning
The official implementation of "Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation" (NeurIPS 2024).
SGL
2021CVPR-WSVSOD
The code for CVPR2021 Weakly Supervised Video Salient Object Detection
2021ICCV-DLGLRG
The code for ICCV2021 Light Field Saliency Detection with Dual Local Graph Learning and Reciprocative Guidance
2021TIP-SCG
The code for SCG: Saliency and Contour Guided Salient Instance Segmentation
2022CVPR-MMMMTBVS
The code for the CVPR 2022 paper "Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation"
COSNet
See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks (CVPR19)
OpenMMLab-BoxInst
The code for the OpenMMLab challenge.
wangbo-zhao's Repositories
wangbo-zhao/2022CVPR-MMMMTBVS
The code for the CVPR 2022 paper "Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation"
wangbo-zhao/2021TIP-SCG
The code for SCG: Saliency and Contour Guided Salient Instance Segmentation
wangbo-zhao/Latte
The official implementation of Latte: Latent Diffusion Transformer for Video Generation.
wangbo-zhao/aot-benchmark
An efficient modular implementation of Associating Objects with Transformers for Video Object Segmentation in PyTorch
wangbo-zhao/binsformer
Implementation of Binsformer code
wangbo-zhao/ColossalAI
Making big AI models cheaper, easier, and scalable
wangbo-zhao/DAPT
[CVPR 2024] Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis
wangbo-zhao/DiffRate
[ICCV 23] An approach to improve the efficiency of Vision Transformer (ViT) by applying token pruning and token merging together, with a differentiable compression rate.
wangbo-zhao/DiST
ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
wangbo-zhao/EfficientDM
[ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models"
wangbo-zhao/Image-Generation-CoT
Investigating CoT Reasoning in Autoregressive Image Generation
wangbo-zhao/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
wangbo-zhao/langchain
⚡ Building applications with LLMs through composability ⚡
wangbo-zhao/mdy_triton
wangbo-zhao/MiniGPT-4
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
wangbo-zhao/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
wangbo-zhao/mmselfsup
OpenMMLab Self-Supervised Learning Toolbox and Benchmark
wangbo-zhao/opencompass
OpenCompass is an LLM evaluation platform, supporting evaluation of 20+ HuggingFace & API models (LLaMA, ChatGPT, Claude, etc) over 50+ datasets. It enables fast, comprehensive benchmarking of large models using efficient distributed evaluation techniques.
wangbo-zhao/Papers-Literature-ML-DL-RL-AI
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
wangbo-zhao/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
wangbo-zhao/PixArt-alpha
Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
wangbo-zhao/segment-anything
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
wangbo-zhao/T-Stitch
Official PyTorch implementation of the paper "T-Stitch: Accelerating Sampling in Pre-trained Diffusion Models with Trajectory Stitching"
wangbo-zhao/TPDM
Implementation of "Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation" [CVPR 2025]
wangbo-zhao/U-ViT
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
wangbo-zhao/VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
wangbo-zhao/VITA
VITA: Video Instance Segmentation via Object Token Association (NeurIPS 2022)
wangbo-zhao/VLM-R1
Solve Visual Understanding with Reinforced VLMs
wangbo-zhao/wangbo-zhao.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
wangbo-zhao/X-Decoder
[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language