Pinned Repositories
bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Oscar
Oscar and VinVL
EILEV
EILEV: Efficient In-Context Learning in Vision-Language Models for Egocentric Videos
pleasurepants's Repositories
pleasurepants/bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
pleasurepants/Oscar
Oscar and VinVL