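A collection of CVPR 2021 papers and open-source projects (papers with code)!
CVPR 2021 accepted paper list: http://cvpr2021.thecvf.com/sites/default/files/2021-03/accepted_paper_ids.txt
Note 1: Everyone is welcome to submit an issue to share CVPR 2021 papers and open-source projects!
Note 2: For papers from previous top CV conferences, other high-quality CV papers, and roundups, see: https://github.com/amusi/daily-paper-computer-vision
A group for CVPR 2021 accepted authors has been set up! If your paper has been accepted, you can add WeChat: CVer9999 with the note: CVPR2021 accepted + name + school/company name. Please follow this format when applying; you will then be added to the group to discuss attending the conference and related matters.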
- Backbone
- NAS
- GAN
- Visual Transformer
- Unsupervised/Self-Supervised Learning
- Semi-Supervised Learning
- Object Detection
- Instance Segmentation
- Panoptic Segmentation
- Medical Image Segmentation
- Video Understanding/Action Recognition
- Face Recognition
- Face Detection
- Face Anti-Spoofing
- Deepfake Detection
- Age Estimation
- Human Parsing
- 2D/3D Human Pose Estimation
- Scene Text Recognition
- Super-Resolution
- Image Restoration
- 3D Object Detection
- 3D Semantic Segmentation
- 3D Object Tracking
- 3D Point Cloud Registration
- 3D Point Cloud Completion
- 6D Pose Estimation
- Depth Estimation
- Adversarial Examples
- Image Retrieval
- Zero-Shot Learning
- Federated Learning
- Video Frame Interpolation
- Visual Reasoning
- Human-Object Interaction (HOI) Detection
- Shadow Removal
- Virtual Try-On
- Datasets
- Others
- To Be Added (TODO)
- Not Sure Whether Accepted (Not Sure)
Involution: Inverting the Inherence of Convolution for Visual Recognition
Coordinate Attention for Efficient Mobile Network Design
Inception Convolution with Efficient Dilation Search
- Paper: https://arxiv.org/abs/2012.13587
- Code: None
RepVGG: Making VGG-style ConvNets Great Again
OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
Inception Convolution with Efficient Dilation Search
- Paper: https://arxiv.org/abs/2012.13587
- Code: None
ID-Unet: Iterative Soft and Hard Deformation for View Synthesis
CoMoGAN: continuous model-guided image-to-image translation
Training Generative Adversarial Networks in One Stage
- Paper: https://arxiv.org/abs/2103.00430
- Code: None
Closed-Form Factorization of Latent Semantics in GANs
- Homepage: https://genforce.github.io/sefa/
- Paper: https://arxiv.org/abs/2007.06600
- Code: https://github.com/genforce/sefa
Anycost GANs for Interactive Image Synthesis and Editing
Image-to-image Translation via Hierarchical Style Disentanglement
End-to-End Video Instance Segmentation with Transformers
- Paper(Oral): https://arxiv.org/abs/2011.14503
- Code: https://github.com/Epiphqny/VisTR
UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
- Paper(Oral): https://arxiv.org/abs/2011.09094
- Code: https://github.com/dddzg/up-detr
End-to-End Human Object Interaction Detection with HOI Transformer
Transformer Interpretability Beyond Attention Visualization
- Paper: https://arxiv.org/abs/2012.09838
- Code: https://github.com/hila-chefer/Transformer-Explainability
Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning
- Homepage: https://fingerrec.github.io/index_files/jinpeng/papers/CVPR2021/project_website.html
- Paper: https://arxiv.org/abs/2009.05769
- Code: https://github.com/FingerRec/BE
Spatially Consistent Representation Learning
- Paper: https://arxiv.org/abs/2103.06122
- Code: None
VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples
Exploring Simple Siamese Representation Learning
- Paper(Oral): https://arxiv.org/abs/2011.10566
- Code: None
Dense Contrastive Learning for Self-Supervised Visual Pre-Training
- Paper(Oral): https://arxiv.org/abs/2011.09157
- Code: https://github.com/WXinlong/DenseCL
Adaptive Consistency Regularization for Semi-Supervised Transfer Learning
- Paper: https://arxiv.org/abs/2103.02193
- Code: https://github.com/SHI-Labs/Semi-Supervised-Transfer-Learning
OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
YOLOF: You Only Look One-level Feature
- Paper(Oral): None
- Code: https://github.com/megvii-model/YOLOF
UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
- Paper(Oral): https://arxiv.org/abs/2011.09094
- Code: https://github.com/dddzg/up-detr
General Instance Distillation for Object Detection
- Paper: https://arxiv.org/abs/2103.02340
- Code: None
Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
- Paper: https://arxiv.org/abs/2103.01903
- Code: None
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
- Homepage: http://rl.uni-freiburg.de/research/multimodal-distill
- Paper: https://arxiv.org/abs/2103.01353
- Code: http://rl.uni-freiburg.de/research/multimodal-distill
Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
Multiple Instance Active Learning for Object Detection
Towards Open World Object Detection
End-to-End Video Instance Segmentation with Transformers
- Paper(Oral): https://arxiv.org/abs/2011.14503
- Code: https://github.com/Epiphqny/VisTR
Zero-shot instance segmentation (Not Sure)
- Paper: None
- Code: https://github.com/CVPR2021-pape-id-1395/CVPR2021-paper-id-1395
Cross-View Regularization for Domain Adaptive Panoptic Segmentation
- Paper: https://arxiv.org/abs/2103.02584
- Code: None
FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space
ACTION-Net: Multipath Excitation for Action Recognition
Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning
- Homepage: https://fingerrec.github.io/index_files/jinpeng/papers/CVPR2021/project_website.html
- Paper: https://arxiv.org/abs/2009.05769
- Code: https://github.com/FingerRec/BE
TDN: Temporal Difference Networks for Efficient Action Recognition
MagFace: A Universal Representation for Face Recognition and Quality Assessment
- Paper(Oral): https://arxiv.org/abs/2103.06627
- Code: https://github.com/IrvingMeng/MagFace
WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
- Homepage: https://www.face-benchmark.org/
- Paper: https://arxiv.org/abs/2103.04098
- Dataset: https://www.face-benchmark.org/
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
- Paper(Oral): https://arxiv.org/abs/2103.01520
- Code: https://github.com/Hzzone/MTLFace
- Dataset: https://github.com/Hzzone/MTLFace
CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement
- Paper: https://arxiv.org/abs/2103.07017
- Code: None
Cross Modal Focal Loss for RGBD Face Anti-Spoofing
- Paper: https://arxiv.org/abs/2103.00948
- Code: None
Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain
- Paper: https://arxiv.org/abs/2103.01856
- Code: None
Multi-attentional Deepfake Detection
- Paper: https://arxiv.org/abs/2103.02406
- Code: None
PML: Progressive Margin Loss for Long-tailed Age Classification
- Paper: https://arxiv.org/abs/2103.02140
- Code: None
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation
- Homepage: https://jeffli.site/HybrIK/
- Paper: https://arxiv.org/abs/2011.14672
- Code: https://github.com/Jeff-sjtu/HybrIK
Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic
AdderSR: Towards Energy Efficient Image Super-Resolution
- Paper: https://arxiv.org/abs/2009.08891
- Code: None
Multi-Stage Progressive Image Restoration
SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud
- Paper: None
- Code: https://github.com/Vegeta2020/SE-SSD
Center-based 3D Object Detection and Tracking
Categorical Depth Distribution Network for Monocular 3D Object Detection
- Paper: https://arxiv.org/abs/2103.01100
- Code: None
Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
- Homepage: https://github.com/QingyongHu/SensatUrban
- Paper: http://arxiv.org/abs/2009.03137
- Code: https://github.com/QingyongHu/SensatUrban
- Dataset: https://github.com/QingyongHu/SensatUrban
Center-based 3D Object Detection and Tracking
PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency
PREDATOR: Registration of 3D Point Clouds with Low Overlap
Style-based Point Generator with Adversarial Rendering for Point Cloud Completion
- Paper: https://arxiv.org/abs/2103.02535
- Code: None
GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
- Paper: http://arxiv.org/abs/2102.12145
- Code: https://git.io/GDR-Net
FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation
- Paper: https://arxiv.org/abs/2103.02396
- Code: None
Depth from Camera Motion and Object Detection
- Paper: https://arxiv.org/abs/2103.01468
- Code: https://github.com/griffbr/ODMD
- Dataset: https://github.com/griffbr/ODMD
Natural Adversarial Examples
QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval
- Paper: https://arxiv.org/abs/2103.02927
- Code: None
Counterfactual Zero-Shot and Open-Set Visual Recognition
FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space
FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation
- Homepage: https://tarun005.github.io/FLAVR/
Transformation Driven Visual Reasoning
- Homepage: https://hongxin2019.github.io/TVR/
End-to-End Human Object Interaction Detection with HOI Transformer
Auto-Exposure Fusion for Single-Image Shadow Removal
- Paper: https://arxiv.org/abs/2103.01255
- Code: https://github.com/tsingqguo/exposure-fusion-shadow-removal
Parser-Free Virtual Try-on via Distilling Appearance Flows
(Virtual try-on without human parsing, based on appearance flow distillation)
Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food
- Paper: https://arxiv.org/abs/2103.03375
- Dataset: None
Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
- Homepage: https://github.com/QingyongHu/SensatUrban
- Paper: http://arxiv.org/abs/2009.03137
- Code: https://github.com/QingyongHu/SensatUrban
- Dataset: https://github.com/QingyongHu/SensatUrban
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
- Paper(Oral): https://arxiv.org/abs/2103.01520
- Code: https://github.com/Hzzone/MTLFace
- Dataset: https://github.com/Hzzone/MTLFace
Depth from Camera Motion and Object Detection
- Paper: https://arxiv.org/abs/2103.01468
- Code: https://github.com/griffbr/ODMD
- Dataset: https://github.com/griffbr/ODMD
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
- Homepage: http://rl.uni-freiburg.de/research/multimodal-distill
- Paper: https://arxiv.org/abs/2103.01353
- Code: http://rl.uni-freiburg.de/research/multimodal-distill
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
- Paper: https://arxiv.org/abs/2103.01353
- Code: http://rl.uni-freiburg.de/research/multimodal-distill
- Dataset: http://rl.uni-freiburg.de/research/multimodal-distill
Knowledge Evolution in Neural Networks
- Paper(Oral): https://arxiv.org/abs/2103.05152
- Code: https://github.com/ahmdtaha/knowledge_evolution
Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning
SGP: Self-supervised Geometric Perception
- Oral
Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning
Diffusion Probabilistic Models for 3D Point Cloud Generation
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
CT Film Recovery via Disentangling Geometric Deformation and Photometric Degradation: Simulated Datasets and Deep Models
- Paper: None
- Code: https://github.com/transcendentsky/Film-Recovery
Toward Explainable Reflection Removal with Distilling and Model Uncertainty
- Paper: None
- Code: https://github.com/ytpeng-aimlab/CVPR-2021-Toward-Explainable-Reflection-Removal-with-Distilling-and-Model-Uncertainty
DeepOIS: Gyroscope-Guided Deep Optical Image Stabilizer Compensation
- Paper: None
- Code: https://github.com/lhaippp/DeepOIS
Exploring Adversarial Fake Images on Face Manifold
- Paper: None
- Code: https://github.com/ldz666666/Style-atk
Uncertainty-Aware Semi-Supervised Crowd Counting via Consistency-Regularized Surrogate Task
- Paper: None
- Code: https://github.com/yandamengdanai/Uncertainty-Aware-Semi-Supervised-Crowd-Counting-via-Consistency-Regularized-Surrogate-Task
Temporal Contrastive Graph for Self-supervised Video Representation Learning
- Paper: None
- Code: https://github.com/YangLiu9208/TCG
Boosting Monocular Depth Estimation Models to High-Resolution via Context-Aware Patching
- Paper: None
- Code: https://github.com/ouranonymouscvpr/cvpr2021_ouranonymouscvpr
Fast and Memory-Efficient Compact Bilinear Pooling
- Paper: None
- Code: https://github.com/cvpr2021kp2/cvpr2021kp2
Identification of Empty Shelves in Supermarkets using Domain-inspired Features with Structural Support Vector Machine
- Paper: None
- Code: https://github.com/gapDetection/cvpr2021
Estimating A Child's Growth Potential From Cephalometric X-Ray Image via Morphology-Aware Interactive Keypoint Estimation
- Paper: None
- Code: https://github.com/interactivekeypoint2020/Morph
https://github.com/ShaoQiangShen/CVPR2021
https://github.com/gillesflash/CVPR2021
https://github.com/anonymous-submission1991/BaLeNAS
https://github.com/cvpr2021dcb/cvpr2021dcb
https://github.com/anonymousauthorCV/CVPR2021_PaperID_8578
https://github.com/AldrichZeng/FreqPrune