For a categorized collection of survey papers from past years, see here ↘️ CV-Surveys (under construction)
- Multi-view Tracking Using Weakly Supervised Human Motion Prediction
⭐code
- Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation
⭐code
- GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction
- Audio-Visual Event Localization
- Audio denoising
- Audio-visual segmentation
- Line Search-Based Feature Transformation for Fast, Stable, and Tunable Content-Style Control in Photorealistic Style Transfer
⭐code
- Learning Across Domains and Devices: Style-Driven Source-Free Domain Adaptation in Clustered Federated Learning
⭐code
- VL-Taboo: An Analysis of Attribute-based Zero-shot Capabilities of Vision-Language Models
⭐code
- Learning by Hallucinating: Vision-Language Pre-training with Weak Supervision
- Iris localization
- Rethinking Rotation in Self-Supervised Contrastive Learning: Adaptive Positive or Negative Data Augmentation
⭐code
- DRAMA: Joint Risk Localization and Captioning in Driving
- VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge
⭐code
- VideoQA
- Continual Learning with Dependency Preserving Hypernetworks
- Do Pre-trained Models Benefit Equally in Continual Learning
⭐code
- Long-tailed recognition
- Open-Set Classification
- Boosting vision transformers for image retrieval
⭐code
- Image-sentence retrieval
- Image-text retrieval
- Action recognition
- EmbryosFormer: Deformable Transformer and Collaborative Encoding-Decoding for Embryos Stage Development Classification
⭐code
- Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
⭐code
- Accumulated Trivial Attention Matters in Vision Transformers on Small Datasets
⭐code
- Pruning
- Knowledge distillation
- OCR-VQGAN: Taming Text-within-Image Generation
⭐code
- Efficient few-shot learning for pixel-precise handwritten document layout analysis
- Text recognition
- One-Shot Synthesis of Images and Segmentation Masks
⭐code
- Style-Guided Inference of Transformer for High-resolution Image Synthesis
- Image generation
- Self-supervised learning
- Semi-supervised learning
- VOS
- VSS
- Semantic segmentation
- BEV segmentation
- Panoptic segmentation
- Few-shot segmentation
- Domain adaptation
- Domain generalization
- My Face My Choice: Privacy Enhancing Deepfakes for Social Media Anonymization
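- Face recognition
- Face swapping
- Lip reading
- Face restoration
- Facial expression recognition
- Face reenactment
- Expression-based facial wrinkle synthesis
- Face naming
- Image restoration
- Image enhancement
- Image colorization
- HDR reconstruction
- Multi-person pose estimation
- 3D human body
- Hand reconstruction
- Video understanding
- Multi-person detection
- Scene recognition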
- Video Grounding
- Video anomaly detection (VAD)
- Image and video coding
- ConfMix: Unsupervised Domain Adaptation for Object Detection via Confidence-based Mixing
⭐code
- Domain Adaptive Object Detection for Autonomous Driving under Foggy Weather
⭐code
- ROMA: Run-Time Object Detection To Maximize Real-Time Accuracy
- VOD
- OOD
- Camouflaged object detection
- Object discovery
- 3D object detection
- HoechstGAN: Virtual Lymphocyte Staining Using Generative Adversarial Networks
- Fashion attribute editing
- Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data
⭐code
- Seg&Struct: The Interplay Between Part Segmentation and Structure Inference for 3D Shape Parsing
- Depth estimation
- MVS
- RGB-D reconstruction
- Stereo Matching
- Chest X-ray classification
- CT image fusion
- Instance-Dependent Noisy Label Learning via Graphical Modelling
- Color Recommendation for Vector Graphic Documents based on Multi-Palette Representation
- TeST: Test-time Self-Training under Distribution Shift
- Simultaneous Acquisition of High Quality RGB Image and Polarization Information using a Sparse Polarization Sensor
⭐code
- Enabling ISP-less Low-Power Computer Vision
- AdaNorm: Adaptive Gradient Norm Correction based Optimizer for CNNs
⭐code
- Composite Learning for Robust and Effective Dense Predictions
- SAILOR: Scaling Anchors via Insights into Latent Object
⭐code
- Modeling the Lighting in Scenes as Style for Auto White-Balance Correction
⭐code
- DE-CROP: Data-efficient Certified Robustness for Pretrained Classifiers
🏠project
- Anisotropic Multi-Scale Graph Convolutional Network for Dense Shape Correspondence
- ATCON: Attention Consistency for Vision Models
⭐code
- LAVA: Label-efficient Visual Learning and Adaptation
- Interpolated SelectionConv for Spherical Images and Surfaces
- Augmentation by Counterfactual Explanation -- Fixing an Overconfident Classifier
- BNN