mingfei-gao's Stars
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
jacobgil/pytorch-grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
salesforce/ALBEF
Code for ALBEF: a new vision-language pre-training method
yuewang-cuhk/awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
hila-chefer/Transformer-MM-Explainability
[ICCV 2021 - Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
ChenRocks/UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
jokieleung/awesome-visual-question-answering
A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
robustness-gym/robustness-gym
Robustness Gym is an evaluation toolkit for machine learning.
xinke-wang/Awesome-Text-VQA
xumingze0308/TRN.pytorch
[ICCV 2019] Official implementation of Temporal Recurrent Networks for Online Action Detection
salesforce/PB-OVD
A PyTorch implementation of Open Vocabulary Object Detection with Pseudo Bounding-Box Labels
LoyoYang/DeCoTa
ICCV 2021: Deep Co-Training with Task Decomposition for Semi-supervised Domain Adaptation
salesforce/QVR-SimpleDLM
PyTorch implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.
salesforce/burn-after-reading
salesforce/woad-pytorch
PyTorch implementation of WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos (CVPR 2021).
salesforce/inv-cdip
INV-CDIP Dataset
salesforce/fieldExtractor