CVPR 2021 论文和开源项目合集(Papers with Code)

CVPR 2021 论文和开源项目合集(papers with code)!

CVPR 2021 收录列表:http://cvpr2021.thecvf.com/sites/default/files/2021-03/accepted_paper_ids.txt

注1:欢迎各位大佬提交issue,分享CVPR 2021论文和开源项目!

注2:关于往年CV顶会论文以及其他优质CV论文和大盘点,详见: https://github.com/amusi/daily-paper-computer-vision

CVPR 2021 中奖群已成立!已经收录的同学,可以添加微信:CVer9999,请备注:CVPR2021已收录+姓名+学校/公司名称!一定要根据格式申请,可以拉你进群沟通开会等事宜。

【CVPR 2021 论文开源目录】

Backbone

Involution: Inverting the Inherence of Convolution for Visual Recognition

Coordinate Attention for Efficient Mobile Network Design

Inception Convolution with Efficient Dilation Search

RepVGG: Making VGG-style ConvNets Great Again

NAS

OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

Inception Convolution with Efficient Dilation Search

GAN

ID-Unet: Iterative Soft and Hard Deformation for View Synthesis

CoMoGAN: continuous model-guided image-to-image translation

Training Generative Adversarial Networks in One Stage

Closed-Form Factorization of Latent Semantics in GANs

Anycost GANs for Interactive Image Synthesis and Editing

Image-to-image Translation via Hierarchical Style Disentanglement

Visual Transformer

End-to-End Video Instance Segmentation with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

End-to-End Human Object Interaction Detection with HOI Transformer

Transformer Interpretability Beyond Attention Visualization

无监督/自监督(Un/Self-Supervised)

Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

Spatially Consistent Representation Learning

VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples

Exploring Simple Siamese Representation Learning

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

半监督学习(Semi-Supervised )

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

目标检测(Object Detection)

OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

YOLOF:You Only Look One-level Feature

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

General Instance Distillation for Object Detection

Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

Multiple Instance Active Learning for Object Detection

Towards Open World Object Detection

实例分割(Instance Segmentation)

End-to-End Video Instance Segmentation with Transformers

Zero-shot instance segmentation(Not Sure)

全景分割(Panoptic Segmentation)

Cross-View Regularization for Domain Adaptive Panoptic Segmentation

医学图像分割

FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

视频理解/行为识别(Video Understanding)

ACTION-Net: Multipath Excitation for Action Recognition

Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

TDN: Temporal Difference Networks for Efficient Action Recognition

人脸识别(Face Recognition)

MagFace: A Universal Representation for Face Recognition and Quality Assessment

WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

人脸检测(Face Detection)

CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement

人脸活体检测(Face Anti-Spoofing)

Cross Modal Focal Loss for RGBD Face Anti-Spoofing

Deepfake检测(Deepfake Detection)

Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain

Multi-attentional Deepfake Detection

人脸年龄估计(Age Estimation)

PML: Progressive Margin Loss for Long-tailed Age Classification

人体解析(Human Parsing)

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

2D/3D人体姿态估计(2D/3D Human Pose Estimation)

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

场景文本识别(Scene Text Recognition)

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

超分辨率(Super-Resolution)

ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

AdderSR: Towards Energy Efficient Image Super-Resolution

图像恢复(Image Restoration)

Multi-Stage Progressive Image Restoration

3D目标检测(3D Object Detection)

SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud

Center-based 3D Object Detection and Tracking

Categorical Depth Distribution Network for Monocular 3D Object Detection

3D语义分割(3D Semantic Segmentation)

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

3D目标跟踪(3D Object Trancking)

Center-based 3D Object Detection and Tracking

3D点云配准(3D Point Cloud Registration)

PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency

PREDATOR: Registration of 3D Point Clouds with Low Overlap

3D点云补全(3D Point Cloud Completion)

Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

6D位姿估计(6D Pose Estimation)

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation

深度估计

S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation

Depth from Camera Motion and Object Detection

对抗样本

Natural Adversarial Examples

图像检索(Image Retrieval)

QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

Zero-Shot Learning

Counterfactual Zero-Shot and Open-Set Visual Recognition

联邦学习(Federated Learning)

FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

视频插帧(Video Frame Interpolation)

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation

视觉推理(Visual Reasoning)

Transformation Driven Visual Reasoning

"人-物"交互(HOI)检测

End-to-End Human Object Interaction Detection with HOI Transformer

阴影去除(Shadow Removal)

Auto-Exposure Fusion for Single-Image Shadow Removal

虚拟换衣(Virtual Try-On)

Parser-Free Virtual Try-on via Distilling Appearance Flows

基于外观流蒸馏的无需人体解析的虚拟换装

数据集(Datasets)

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

Depth from Camera Motion and Object Detection

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

其他(Others)

Knowledge Evolution in Neural Networks

Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

SGP: Self-supervised Geometric Perception

Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

Diffusion Probabilistic Models for 3D Point Cloud Generation

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

待添加(TODO)

不确定中没中(Not Sure)

CT Film Recovery via Disentangling Geometric Deformation and Photometric Degradation: Simulated Datasets and Deep Models

Toward Explainable Reflection Removal with Distilling and Model Uncertainty

DeepOIS: Gyroscope-Guided Deep Optical Image Stabilizer Compensation

Exploring Adversarial Fake Images on Face Manifold

Uncertainty-Aware Semi-Supervised Crowd Counting via Consistency-Regularized Surrogate Task

Temporal Contrastive Graph for Self-supervised Video Representation Learning

Boosting Monocular Depth Estimation Models to High-Resolution via Context-Aware Patching

Fast and Memory-Efficient Compact Bilinear Pooling

Identification of Empty Shelves in Supermarkets using Domain-inspired Features with Structural Support Vector Machine

Estimating A Child's Growth Potential From Cephalometric X-Ray Image via Morphology-Aware Interactive Keypoint Estimation

https://github.com/ShaoQiangShen/CVPR2021

https://github.com/gillesflash/CVPR2021

https://github.com/anonymous-submission1991/BaLeNAS

https://github.com/cvpr2021dcb/cvpr2021dcb

https://github.com/anonymousauthorCV/CVPR2021_PaperID_8578

https://github.com/AldrichZeng/FreqPrune

https://github.com/Anonymous-AdvCAM/Anonymous-AdvCAM

https://github.com/ddfss/datadrive-fss