2022/8/4: 11 papers added

2022/7/29: 54 papers added
2022/7/20: 54 papers added


[4] Multimodal Object Detection via Probabilistic Ensembling (基于概率集成的多模态目标检测) (Oral)

paper | code

[3] Point-to-Box Network for Accurate Object Detection via Single Point Supervision (通过单点监督实现精确目标检测的点对盒网络)
paper | code

[2] You Should Look at All Objects (您应该查看所有物体)
paper | code

[1] Adversarially-Aware Robust Object Detector (对抗性感知鲁棒目标检测器) (Oral)
paper | code

[2] Densely Constrained Depth Estimator for Monocular 3D Object Detection (用于单目 3D 目标检测的密集约束深度估计器)
paper | code

[1] Rethinking IoU-based Optimization for Single-stage 3D Object Detection (重新思考基于 IoU 的单阶段 3D 对象检测优化)

[2] Discovering Human-Object Interaction Concepts via Self-Compositional Learning (通过自组合学习发现人-物交互概念)

paper | code

[1] Towards Hard-Positive Query Mining for DETR-based Human-Object Interaction Detection (面向基于 DETR 的人机交互检测的硬性查询挖掘)
paper | code

[1] KD-SCFNet: Towards More Accurate and Efficient Salient Object Detection via Knowledge Distillation (KD-SCFNet:通过知识蒸馏实现更准确、更高效的显着目标检测)

paper | code

[2] DSR -- A dual subspace re-projection network for surface anomaly detection (DSR——用于表面异常检测的双子空间重投影网络)

paper | code

[1] DICE: Leveraging Sparsification for Out-of-Distribution Detection (DICE:利用稀疏化进行分布外检测)
paper | code

[3] In Defense of Online Models for Video Instance Segmentation (为视频实例分割的在线模型辩护) (Oral)

[2] Box-supervised Instance Segmentation with Level Set Evolution (具有水平集进化的框监督实例分割)

[1] OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers (OSFormer:使用 Transformers 进行单阶段伪装实例分割)
paper | code

[1] 2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds (2DPASS:激光雷达点云上的二维先验辅助语义分割)
paper | code

[1] Learning Quality-aware Dynamic Memory for Video Object Segmentation (视频对象分割的学习质量感知动态内存)
paper | code

[3] Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution (学习高效图像超分辨率的串并行查找表)

paper | code

[2] Efficient Meta-Tuning for Content-aware Neural Video Delivery (内容感知神经视频交付的高效元调整)
paper | code

[1] Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks (超低精度超分辨率网络的动态双可训练边界)
paper | code

[9] Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression (无监督夜间图像增强:当层分解遇到光效抑制时)

paper | code

[8] Bringing Rolling Shutter Images Alive with Dual Reversed Distortion (通过双重反转失真使滚动快门图像重现) (Oral)
paper | code

[6] Semantic-Sparse Colorization Network for Deep Exemplar-based Colorization (用于基于深度示例的着色的语义稀疏着色网络)

[5] Geometry-aware Single-image Full-body Human Relighting (几何感知单图像全身人体重新照明)

[4] Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion (单目全景深度补全的多模态蒙面预训练)

[3] PanoFormer: Panorama Transformer for Indoor 360 Depth Estimation (PanoFormer:用于室内 360 深度估计的全景变压器)

[2] SESS: Saliency Enhancing with Scaling and Sliding (SESS:通过缩放和滑动增强显着性)

[1] RigNet: Repetitive Image Guided Network for Depth Completion (RigNet:用于深度补全的重复图像引导网络)

[1] Deep Portrait Delighting (深度人像去光)


[3] Perceiving and Modeling Density is All You Need for Image Dehazing (感知和建模密度是图像去雾所需的全部) (Oral)
paper | code

[2] Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance (来自模糊的动画:具有运动引导的多模态模糊分解)
paper | code

[1] Deep Semantic Statistics Matching (D2SM) Denoising Network (深度语义统计匹配(D2SM)去噪网络)

[1] Outpainting by Queries (通过查询进行外推)
paper | code

[1] CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer (CCPL:通用风格迁移的对比相干性保留损失) (Oral)
paper | code

[3] AlphaVC: High-Performance and Efficient Learned Video Compression (AlphaVC:高性能和高效的学习视频压缩)


[2] Improving the Perceptual Quality of 2D Animation Interpolation (提高二维动画插值的感知质量)
paper | code

[1] Real-Time Intermediate Flow Estimation for Video Frame Interpolation (视频帧插值的实时中间流估计)
paper | code

[1] Error Compensation Framework for Flow-Guided Video Inpainting (流引导视频修复的误差补偿框架)

[2] Event-guided Deblurring of Unknown Exposure Time Videos (未知曝光时间视频的事件引导去模糊) (Oral)


[1] Efficient Video Deblurring Guided by Motion Magnitude (由运动幅度引导的高效视频去模糊)

paper | code

[4] GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality (GaitEdge:超越普通的端到端步态识别,提高实用性)
paper | code

[3] Collaborating Domain-shared and Target-specific Feature Clustering for Cross-domain 3D Action Recognition (用于跨域 3D 动作识别的协作域共享和特定于目标的特征聚类)
paper | code

[2] ReAct: Temporal Action Detection with Relational Queries (ReAct:使用关系查询的时间动作检测)
paper | code

[1] Hunting Group Clues with Transformers for Social Group Activity Recognition (用Transformers寻找群体线索用于社会群体活动识别)

[1] PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification (PASS:用于人员重新识别的部分感知自我监督预训练)
paper | code

[1] GraphVid: It Only Takes a Few Nodes to Understand a Video (GraphVid:只需几个节点即可理解视频) (Oral)

[6] Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding (打乱的视频是否有益于时间偏差问题:一种新的时间接地训练框架)

paper | code

[5] Feature Representation Learning for Unsupervised Cross-domain Image Retrieval (无监督跨域图像检索的特征表示学习)
paper | code

[4] LocVTP: Video-Text Pre-training for Temporal Localization (LocVTP:时间定位的视频文本预训练)
paper | code

[3] Deep Hash Distillation for Image Retrieval (用于图像检索的深度哈希蒸馏)
paper | code

[2] TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval (TS2-Net:用于文本视频检索的令牌移位和选择转换器)
paper | code

[1] Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval (轻量级注意力特征融合:文本到视频检索的新基线)

[1] Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion (基于多投影融合的深度 360° 光流估计)


[4] Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction (被忽视的姿势实际上是有意义的:为人体运动预测提炼特权知识)


[3] 3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal (通过手部去遮挡和移除的 3D 交互手部姿势估计)

paper | code

[2] Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration (基于隐式空间校准的 Transformer 的弱监督目标定位)
paper | code

[1] Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks (使用自监督深度先验变形网络的类别级 6D 对象姿势和大小估计)
paper | code

[1] Physical Attack on Monocular Depth Estimation with Optimal Adversarial Patches (使用最优对抗补丁对单目深度估计进行物理攻击)

[1] Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation (通过场景消歧实现种族无偏肤色估计)

paper | code

[1] MoFaNeRF: Morphable Facial Neural Radiance Field (MoFaNeRF:可变形面部神经辐射场)

paper | code

[1] DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras (DiffuStereo:使用稀疏相机通过基于扩散的立体进行高质量人体重建)

[1] Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields (Sem2NeRF:将单视图语义掩码转换为神经辐射场)
paper | code

[2] Tracking Every Thing in the Wild (追踪野外的每一件事)


[1] Towards Grand Unification of Object Tracking (迈向目标跟踪的大统一) (Oral)
paper | code

[5] Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition (了解艺术字:用于场景文本识别的角引导转换器) (Oral)

paper | code

[4] Contextual Text Block Detection towards Scene Text Understanding (面向场景文本理解的上下文文本块检测)


[3] PromptDet: Towards Open-vocabulary Detection using Uncurated Images (PromptDet:使用未经处理的图像进行开放词汇检测)
paper | code

[2] End-to-End Video Text Spotting with Transformer (使用 Transformer 的端到端视频文本定位) (Oral)
paper | code

[1] Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting (用于经济高效的端到端文本定位的动态低分辨率蒸馏)
paper | code

[7] Learning Energy-Based Models With Adversarial Training (通过对抗训练学习基于能量的模型)

paper | code

[6] Adaptive Image Transformations for Transfer-based Adversarial Attack (基于传输的对抗性攻击的自适应图像转换)

[5] Generative Multiplane Images: Making a 2D GAN 3D-Aware (生成多平面图像:让一个2D GAN变得3D感知)
paper | code

[4] Eliminating Gradient Conflict in Reference-based Line-Art Colorization (消除基于参考的艺术线条着色中的梯度冲突)
paper | code

[3] WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation (WaveGAN:用于高保真少镜头图像生成的频率感知 GAN)
paper | code

[2] FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs (FakeCLR:探索对比学习以解决数据高效 GAN 中的潜在不连续性)
paper | code

[1] UniCR: Universally Approximated Certified Robustness via Randomized Smoothing (UniCR:通过随机平滑获得普遍近似的认证鲁棒性)

[1] PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation (PixelFolder:用于图像生成的高效渐进式像素合成网络)

paper | code

[1] D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights (D2-TPred:交通灯下轨迹预测的不连续依赖)
paper | code

[1] Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips (使用 Bit Flips 对神经网络进行难以察觉的特洛伊木马攻击)


[1] PalQuant: Accelerating High-precision Networks on Low-precision Accelerators (PalQuant:在低精度加速器上加速高精度网络)

paper | code

[5] Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding (用于长期 4D 点云视频理解的 Point Primitive Transformer)


[4] Improving Vision Transformers by Revisiting High-frequency Components (通过重新审视高频组件来改进视觉变压器)

paper | code

[3] Transformer with Implicit Edges for Particle-based Physics Simulation (用于基于粒子的物理模拟的隐式边缘变压器)

paper | code

[2] ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer (ScalableViT:重新思考 Vision Transformer 面向上下文的泛化)
paper | code

[1] Visual Prompt Tuning (视觉提示调整)
paper | code

[3] ScaleNet: Searching for the Model to Scale (ScaleNet:搜索要扩展的模型)
paper | code

[2] Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning (集成知识引导的子网络搜索和过滤器修剪微调)
paper | code

[1] EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs (EAGAN:GAN 的高效两阶段进化架构搜索)
paper | code

[1] Fine-grained Data Distribution Alignment for Post-Training Quantization (训练后量化的细粒度数据分布对齐) (Oral)
paper | code

[1] Content-Oriented Learned Image Compression (面向内容的学习图像压缩)


[1] Unsupervised Deep Multi-Shape Matching (无监督深度多形状匹配)

[1] Object-Compositional Neural Implicit Surfaces (对象组合神经隐式曲面)
paper | code

[1] Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection (通过有效的转移矩阵估计学习噪声标签以对抗标签错误校正)

[2] Long-tailed Instance Segmentation using Gumbel Optimized Loss (使用 Gumbel 优化损失的长尾实例分割)

paper | code

[1] Identifying Hard Noise in Long-Tailed Sample Distribution (识别长尾样本分布中的硬噪声) (Oral)


[3] Prune Your Model Before Distill It (在蒸馏之前修剪你的模型)


[2] Efficient One Pass Self-distillation with Zipf's Label Smoothing (使用 Zipf 的标签平滑实现高效的单程自蒸馏)

paper | code

[1] Knowledge Condensation Distillation (知识浓缩蒸馏)
paper | code

[1] Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting (多模式车辆轨迹预测的分层潜在结构)
paper | code

[1] Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels (中心性和一致性:使用实例相关的噪声标签进行学习的两阶段清洁样本识别)

paper | code

[8] Acknowledging the Unknown for Multi-label Learning with Single Positive Labels (用单个正标签承认未知的多标签学习)

paper | code

[7] W2N: Switching From Weak Supervision to Noisy Supervision for Object Detection (W2N:目标检测从弱监督切换到嘈杂监督)

paper | code

[6] CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation (CA-SSL:用于检测和分割的与类别无关的半监督学习)
paper | code

[5] FedX: Unsupervised Federated Learning with Cross Knowledge Distillation (FedX:具有交叉知识蒸馏的无监督联合学习)

[4] Synergistic Self-supervised and Quantization Learning (协同自监督和量化学习)
paper | code

[3] Contrastive Deep Supervision (对比深度监督)
paper | code

[2] Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection (稠密教师:用于半监督目标检测的稠密伪标签)

[1] Image Coding for Machines with Omnipotent Feature Learning (具有全能特征学习的机器的图像编码)

[2] Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting (语言问题:用于场景文本检测和识别的弱监督视觉语言预训练方法) (Oral)


[1] Contrastive Vision-Language Pre-training with Limited Resources (资源有限的对比视觉语言预训练)
paper | code

[1] Cross-modal Prototype Driven Network for Radiology Report Generation (用于放射学报告生成的跨模式原型驱动网络)
paper | code

[2] Worst Case Matters for Few-Shot Recognition (最坏情况对少数镜头识别很重要)

paper | code

[1] Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning (用于少数镜头学习的学习实例和任务感知动态内核)

[2] Balancing Stability and Plasticity through Advanced Null Space in Continual Learning (通过持续学习中的高级零空间平衡稳定性和可塑性) (Oral)


[1] Online Continual Learning with Contrastive Vision Transformer (使用对比视觉转换器进行在线持续学习)


[2] Factorizing Knowledge in Neural Networks (在神经网络中分解知识)
paper | code

[1] CycDA: Unsupervised Cycle Domain Adaptation from Image to Video (CycDA:从图像到视频的无监督循环域自适应)

[1] Target-absent Human Attention (目标缺失——人类注意力缺失)
paper | code

[1] Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction (通过残差动作预测解决视觉模仿学习中的模仿问题)

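【1】Written walkthrough: ECCV 2022 Oral | Unicorn: Towards Grand Unification of Object Tracking
Live-stream walkthrough: 极市 Live | Bin Yan, Unicorn: Towards Grand Unification of Object Tracking (ECCV 2022 Oral)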

【2】ECCV 2022 Oral | Generalizes without fine-tuning! RegAD: a new framework for few-shot anomaly detection

【3】ECCV 2022 | Poseur: you thought it was pose estimation, but it is actually object detection

【4】ECCV 2022 | Tsinghua & Tencent AI Lab propose REALY: rethinking how 3D face reconstruction is evaluated

【5】ECCV 2022 | AirDet: a few-shot object detection method that needs no fine-tuning

【6】ECCV 2022 | Rethinking IoU optimization in single-stage 3D object detection

【7】ECCV 2022 | Towards data-efficient Transformer object detectors

【8】ECCV 2022 | Misaligned FPN alignment enables efficient semi-supervised object detection (PseCo)

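【9】ECCV 2022 | SmoothNet: replacing smoothing filters with a neural network; only a method that needs no retraining deserves to be called "plug-and-play"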

【10】ECCV 2022 Oral | An automatic colorization algorithm that needs no preconditions


【12】ECCV 2022 | Masked Generative Distillation: generative knowledge distillation for classification, detection, and segmentation

【13】ECCV 2022 | Multi-domain long-tailed learning: a study of the imbalanced domain generalization problem

【14】ECCV 2022 | DisCo: improving lightweight models in self-supervised learning

【15】ECCV 2022 latest survey | Small object detection for large-scale scenes: survey and benchmark

【16】ECCV 2022 | JD, Beihang University & Meituan propose a new temporal action detection framework with SOTA performance!

【17】ECCV 2022 | "Friends" shots you have never seen, filled in by AI

【20】ECCV 2022 | Eigendecomposition, a core CV operation, for batched matrices | Batch-efficient eigendecomposition for small and medium-sized matrices

【21】ECCV 2022 | CMU proposes the first fast knowledge distillation framework for vision: 80.1% accuracy, 30% faster training

【22】ECCV 2022 | PanoFormer: the first monocular depth estimation Transformer tailored to 360° panoramas



【25】ECCV 2022 Oral | CCPL: a general coherence-preserving loss for versatile style transfer

【26】ECCV 2022 | Processing video with fully connected layers only: Meitu & NUS achieve efficient spatio-temporal video modeling

【27】ECCV 2022 | Is the long-tailed distribution problem in computer vision still worth working on?

【28】ECCV 2022 | Cashing in on big data! Microsoft open-sources TinyViT, bringing pre-training capability to small models

【29】ECCV 2022 Oral | A full-score paper! New SOTA for video instance segmentation: SeqFormer & IDOL

【30】ECCV 2022 Oral | AutoMix: a mixup training framework based on self-feedback learning