
Date Speaker Title
7.12 Zongxin End-to-End Object Detection with Transformers
7.12 Yu In Defense of Grid Features for Visual Question Answering
7.19 Xiaohan UNITER: UNiversal Image-TExt Representation Learning
    End-to-End Learning of Visual Representations from Uncurated Instructional Videos
    Self-Supervised MultiModal Versatile Networks
7.26 Ruijie Long-term Human Motion Prediction with Scene Context
7.26 Youjiang Discovering Human Interactions with Novel Objects via Zero-Shot Learning
    Learning Human-Object Interaction Detection using Interaction Points
8.02 Guang Exploring Self-attention for Image Recognition
8.02 Yuanzhi Visual Commonsense R-CNN
8.09 Pingbo Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One
8.09 Chao Corner Proposal Network for Anchor-free, Two-stage Object Detection
8.16 Yunqiu Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation
8.16 Chen Self-training with Noisy Student improves ImageNet classification
8.23 Xuanmeng
8.23 Yutian
8.30 Yuhang
8.30 Jiaxu