Pinned Repositories
CAT-Seg
Official Implementation of "CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation"
CLIM
[AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation
CLIP
CLIPSelf
[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
colorization
This is the code of the colorization project of the National Innovation Program.
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
F-LMM
Code Release of F-LMM: Grounding Frozen Large Multimodal Models
multiview_pose
[ICCV2021] Code Release of Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
ovdet
[CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection
wusize.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
wusize's Repositories
wusize/ovdet
[CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection
wusize/CLIPSelf
[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
wusize/multiview_pose
[ICCV2021] Code Release of Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
wusize/CLIM
[AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation
wusize/F-LMM
Code Release of F-LMM: Grounding Frozen Large Multimodal Models
wusize/wusize.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
wusize/colorization
This is the code of the colorization project of the National Innovation Program.
wusize/CAT-Seg
Official Implementation of "CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation"
wusize/CLIP
wusize/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
wusize/LLaVA-Grounding
wusize/lmms-eval
Accelerating the development of large multimodal models (LMMs) with lmms-eval
wusize/open_clip-1
An open source implementation of CLIP.
wusize/OVD_Contest
wusize/RegionCLIP
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
wusize/SAN
Open-vocabulary Semantic Segmentation
wusize/UNINEXT
[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval
wusize/Visual-CoT
Visual CoT: Unleashing Chain-of-Thought Reasoning in the Multi-Modal Language Model