Pinned Repositories
CRN_tvqa
grid-feats-vqa
Grid features pre-training code for visual question answering
image-captioning-DLCT
Official PyTorch implementation of paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
PositionalMCAN
MCAN+PA
Scan2Cap
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
three.js
JavaScript 3D library.
VL-BERT
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
waizei's Repositories
waizei/PositionalMCAN
MCAN+PA
waizei/CRN_tvqa
waizei/grid-feats-vqa
Grid features pre-training code for visual question answering
waizei/image-captioning-DLCT
Official PyTorch implementation of paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
waizei/Scan2Cap
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
waizei/three.js
JavaScript 3D library.
waizei/VL-BERT
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".