CVPR 2021 Papers and Open-Source Projects (Papers with Code)

A collection of CVPR 2021 papers and their open-source projects (papers with code)!

CVPR 2021 accepted paper list: http://cvpr2021.thecvf.com/sites/default/files/2021-03/accepted_paper_ids.txt

Note 1: Everyone is welcome to open an issue to share CVPR 2021 papers and open-source projects!

Note 2: For papers from top CV conferences in previous years, other high-quality CV papers, and roundups, see: https://github.com/amusi/daily-paper-computer-vision

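A group chat for authors of accepted CVPR 2021 papers has been set up! If your paper was accepted, you can add the WeChat ID CVer9999 with the note: CVPR2021 accepted + name + school/company name. Please follow this format exactly; you will then be added to the group to discuss conference attendance and related matters.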

[CVPR 2021 Open-Source Paper Index]

Backbone

ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network

Involution: Inverting the Inherence of Convolution for Visual Recognition

Coordinate Attention for Efficient Mobile Network Design

Inception Convolution with Efficient Dilation Search

RepVGG: Making VGG-style ConvNets Great Again

NAS

Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator

OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

Inception Convolution with Efficient Dilation Search

GAN

HumanGAN: A Generative Model of Humans Images

ID-Unet: Iterative Soft and Hard Deformation for View Synthesis

CoMoGAN: continuous model-guided image-to-image translation

Training Generative Adversarial Networks in One Stage

Closed-Form Factorization of Latent Semantics in GANs

Anycost GANs for Interactive Image Synthesis and Editing

Image-to-image Translation via Hierarchical Style Disentanglement

Visual Transformer

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Pre-Trained Image Processing Transformer

End-to-End Video Instance Segmentation with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

End-to-End Human Object Interaction Detection with HOI Transformer

Transformer Interpretability Beyond Attention Visualization

Regularization

Regularizing Neural Networks via Adversarial Model Perturbation

Unsupervised/Self-Supervised Learning

Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

Spatially Consistent Representation Learning

VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples

Exploring Simple Siamese Representation Learning

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Semi-Supervised Learning

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

2D/Remote Sensing Object Detection

2D Object Detection

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

YOLOF: You Only Look One-level Feature

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

General Instance Distillation for Object Detection

Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

Multiple Instance Active Learning for Object Detection

Towards Open World Object Detection

Few-Shot Object Detection

Few-Shot Object Detection via Contrastive Proposal Encoding

Rotated Object Detection

ReDet: A Rotation-equivariant Detector for Aerial Object Detection

Single/Multi-Object Tracking

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Track to Detect and Segment: An Online Multi-Object Tracker

Instance Segmentation

End-to-End Video Instance Segmentation with Transformers

Zero-Shot Instance Segmentation (Not Sure)

Panoptic Segmentation

Fully Convolutional Networks for Panoptic Segmentation

Cross-View Regularization for Domain Adaptive Panoptic Segmentation

Medical Image Segmentation

FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

Interactive Video Object Segmentation

Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

Video Understanding/Action Recognition

ACTION-Net: Multipath Excitation for Action Recognition

Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

TDN: Temporal Difference Networks for Efficient Action Recognition

Face Recognition

MagFace: A Universal Representation for Face Recognition and Quality Assessment

WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

Face Detection

CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement

Face Anti-Spoofing

Cross Modal Focal Loss for RGBD Face Anti-Spoofing

Deepfake Detection

Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain

Multi-attentional Deepfake Detection

Facial Age Estimation

PML: Progressive Margin Loss for Long-tailed Age Classification

Human Parsing

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

2D/3D Human Pose Estimation

2D Human Pose Estimation

DCPose: Deep Dual Consecutive Network for Human Pose Estimation

3D Human Pose Estimation

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

Scene Text Recognition

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

Model Compression/Pruning/Quantization

Model Quantization

Learnable Companding Quantization for Accurate Low-bit Neural Networks

Super-Resolution

ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

AdderSR: Towards Energy Efficient Image Super-Resolution

Image Restoration

Multi-Stage Progressive Image Restoration

Reflection Removal

Robust Reflection Removal with Reflection-free Flash-only Cues

3D Object Detection

SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud

Center-based 3D Object Detection and Tracking

Categorical Depth Distribution Network for Monocular 3D Object Detection

3D Semantic Segmentation

Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

3D Object Tracking

Center-based 3D Object Detection and Tracking

3D Point Cloud Registration

PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency

PREDATOR: Registration of 3D Point Clouds with Low Overlap

3D Point Cloud Completion

Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

6D Pose Estimation

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation

Camera Pose Estimation

Back to the Feature: Learning Robust Camera Localization from Pixels to Pose

Depth Estimation

Beyond Image to Depth: Improving Depth Prediction using Echoes

S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation

Depth from Camera Motion and Object Detection

Adversarial Examples

Natural Adversarial Examples

Image Retrieval

QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

Video Retrieval

On Semantic Similarity in Video Retrieval

Zero-Shot Learning

Counterfactual Zero-Shot and Open-Set Visual Recognition

Federated Learning

FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

Video Frame Interpolation

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation

Visual Reasoning

Transformation Driven Visual Reasoning

View Synthesis

NeX: Real-time View Synthesis with Neural Basis Expansion

Domain Generalization

FSDR: Frequency Space Domain Randomization for Domain Generalization

Human-Object Interaction (HOI) Detection

Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information

Reformulating HOI Detection as Adaptive Set Prediction

Detecting Human-Object Interaction via Fabricated Compositional Learning

End-to-End Human Object Interaction Detection with HOI Transformer

Shadow Removal

Auto-Exposure Fusion for Single-Image Shadow Removal

Virtual Try-On

Parser-Free Virtual Try-on via Distilling Appearance Flows

Datasets

Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

Depth from Camera Motion and Object Detection

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

Others

Knowledge Evolution in Neural Networks

Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

SGP: Self-supervised Geometric Perception

Diffusion Probabilistic Models for 3D Point Cloud Generation

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

To Be Added (TODO)

Acceptance Unconfirmed (Not Sure)

CT Film Recovery via Disentangling Geometric Deformation and Photometric Degradation: Simulated Datasets and Deep Models

Toward Explainable Reflection Removal with Distilling and Model Uncertainty

DeepOIS: Gyroscope-Guided Deep Optical Image Stabilizer Compensation

Exploring Adversarial Fake Images on Face Manifold

Uncertainty-Aware Semi-Supervised Crowd Counting via Consistency-Regularized Surrogate Task

Temporal Contrastive Graph for Self-supervised Video Representation Learning

Boosting Monocular Depth Estimation Models to High-Resolution via Context-Aware Patching

Fast and Memory-Efficient Compact Bilinear Pooling

Identification of Empty Shelves in Supermarkets using Domain-inspired Features with Structural Support Vector Machine

Estimating A Child's Growth Potential From Cephalometric X-Ray Image via Morphology-Aware Interactive Keypoint Estimation

https://github.com/ShaoQiangShen/CVPR2021

https://github.com/gillesflash/CVPR2021

https://github.com/anonymous-submission1991/BaLeNAS

https://github.com/cvpr2021dcb/cvpr2021dcb

https://github.com/anonymousauthorCV/CVPR2021_PaperID_8578

https://github.com/AldrichZeng/FreqPrune

https://github.com/Anonymous-AdvCAM/Anonymous-AdvCAM

https://github.com/ddfss/datadrive-fss