A Collection of Papers and Code from CVPR2023 Related to Low-Level Vision
[Under Construction] If you find missing papers or typos, feel free to open an issue or a pull request.
- Awesome-ICCV2023/2021-Low-Level-Vision
- Awesome-CVPR2022-Low-Level-Vision
- Awesome-NeurIPS2022/2021-Low-Level-Vision
- Awesome-ECCV2022-Low-Level-Vision
- Awesome-AAAI2022-Low-Level-Vision
- Awesome-CVPR2021/2020-Low-Level-Vision
- Awesome-ECCV2020-Low-Level-Vision
Efficient and Explicit Modelling of Image Hierarchies for Image Restoration
- Paper: https://arxiv.org/abs/2303.00748
- Code: https://github.com/ofsoundof/GRL-Image-Restoration
- Tags: Transformer
Comprehensive and Delicate: An Efficient Transformer for Image Restoration
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Zhao_Comprehensive_and_Delicate_An_Efficient_Transformer_for_Image_Restoration_CVPR_2023_paper.html
- Tags: Transformer
Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective
Generative Diffusion Prior for Unified Image Restoration and Enhancement
DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration
- Paper: https://arxiv.org/abs/2303.06885
- Tags: Diffusion, Blind Face
Bitstream-Corrupted JPEG Images are Restorable: Two-stage Compensation and Alignment Framework for Image Restoration
All-in-One Image Restoration for Unknown Degradations Using Adaptive Discriminative Filters for Specific Degradations
Learning Weather-General and Weather-Specific Features for Image Restoration Under Multiple Adverse Weather Conditions
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Zhu_Learning_Weather-General_and_Weather-Specific_Features_for_Image_Restoration_Under_Multiple_CVPR_2023_paper.html
- Code: https://github.com/zhuyr97/WGWS-Net
- Tags: Multiple Adverse Weather
AccelIR: Task-Aware Image Compression for Accelerating Neural Restoration
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Ye_AccelIR_Task-Aware_Image_Compression_for_Accelerating_Neural_Restoration_CVPR_2023_paper.html
- Tags: Image Compression for Accelerating
Robust Unsupervised StyleGAN Image Restoration
- Paper: https://arxiv.org/abs/2302.06733
- Tags: StyleGAN
Ingredient-Oriented Multi-Degradation Learning for Image Restoration
Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank
- Paper: https://arxiv.org/abs/2303.09101
- Code: https://github.com/Huang-ShiRui/Semi-UIR
- Tags: Underwater Image Restoration
Nighttime Smartphone Reflective Flare Removal Using Optical Center Symmetry Prior
- Paper: https://arxiv.org/abs/2303.15046
- Code: https://github.com/ykdai/BracketFlare
- Tags: Reflective Flare Removal
Robust Single Image Reflection Removal Against Adversarial Attacks
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Song_Robust_Single_Image_Reflection_Removal_Against_Adversarial_Attacks_CVPR_2023_paper.html
- Tags: Reflection Removal
ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal
- Paper: https://arxiv.org/abs/2212.04711
- Code: https://github.com/GuoLanqing/ShadowDiffusion
- Tags: Diffusion, Shadow Removal
Document Image Shadow Removal Guided by Color-Aware Background
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Zhang_Document_Image_Shadow_Removal_Guided_by_Color-Aware_Background_CVPR_2023_paper.html
- Code: https://github.com/hyyh1314/BGShadowNet
- Tags: Shadow Removal
Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera
GamutMLP: A Lightweight MLP for Color Loss Recovery
- Paper: https://arxiv.org/abs/2304.11743
- Code: https://github.com/hminle/gamut-mlp
- Tags: restore wide-gamut color values
ABCD: Arbitrary Bitwise Coefficient for De-Quantization
- Paper: https://openaccess.thecvf.com/content/CVPR2023/papers/Han_ABCD_Arbitrary_Bitwise_Coefficient_for_De-Quantization_CVPR_2023_paper.pdf
- Code: https://github.com/WooKyoungHan/ABCD
- Tags: De-quantization/Bit depth expansion
Visual Recognition-Driven Image Restoration for Multiple Degradation With Intrinsic Semantics Recovery
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Yang_Visual_Recognition-Driven_Image_Restoration_for_Multiple_Degradation_With_Intrinsic_Semantics_CVPR_2023_paper.html
- Tags: Restoration for High-Level Tasks
Parallel Diffusion Models of Operator and Image for Blind Inverse Problems
- Paper: https://arxiv.org/abs/2211.10656
- Code: https://github.com/BlindDPS/blind-dps
- Tags: blind deblurring and imaging through turbulence
Raw Image Reconstruction with Learned Compact Metadata
High-resolution image reconstruction with latent diffusion models from human brain activity
- Paper: https://www.biorxiv.org/content/10.1101/2022.11.18.517004v2
- Code: https://github.com/yu-takagi/StableDiffusionReconstruction
Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder
Optimization-Inspired Cross-Attention Transformer for Compressive Sensing
- Paper: https://arxiv.org/abs/2304.13986
- Code: https://github.com/songjiechong/OCTUF
- Tags: Compressive Sensing
Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding
Burstormer: Burst Image Restoration and Enhancement Transformer
Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement
A Simple Baseline for Video Restoration with Grouped Spatial-temporal Shift
Blind Video Deflickering by Neural Filtering with a Flawed Atlas
- Paper: https://arxiv.org/abs/2303.08120
- Code: https://github.com/ChenyangLEI/All-In-One-Deflicker
- Tags: Deflickering
Activating More Pixels in Image Super-Resolution Transformer
- Paper: https://arxiv.org/abs/2205.04437
- Code: https://github.com/XPixelGroup/HAT
- Tags: Transformer
N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution
Omni Aggregation Networks for Lightweight Image Super-Resolution
OPE-SR: Orthogonal Position Encoding for Designing a Parameter-free Upsampling Module in Arbitrary-scale Image Super-Resolution
- Paper: https://arxiv.org/abs/2303.01091
- Tags: Arbitrary-Scale SR
Local Implicit Normalizing Flow for Arbitrary-Scale Image Super-Resolution
- Paper: https://arxiv.org/abs/2303.05156
- Tags: Normalizing Flow, Arbitrary-Scale SR
Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution
- Paper: https://arxiv.org/abs/2303.16513
- Code: https://github.com/jaroslaw1007/CLIT
- Tags: Arbitrary-Scale SR, Transformer
Deep Arbitrary-Scale Image Super-Resolution via Scale-Equivariance Pursuit
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Wang_Deep_Arbitrary-Scale_Image_Super-Resolution_via_Scale-Equivariance_Pursuit_CVPR_2023_paper.html
- Code: https://github.com/neuralchen/EQSR
- Tags: Arbitrary-Scale SR
CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution
- Paper: https://arxiv.org/abs/2212.04362
- Tags: Arbitrary-Scale SR
Super-Resolution Neural Operator
- Paper: https://arxiv.org/abs/2303.02584
- Code: https://github.com/2y7c3/Super-Resolution-Neural-Operator
Human Guided Ground-truth Generation for Realistic Image Super-resolution
Better "CMOS" Produces Clearer Images: Learning Space-Variant Blur Estimation for Blind Image Super-Resolution
- Paper: https://arxiv.org/abs/2304.03542
- Tags: Blind
Implicit Diffusion Models for Continuous Super-Resolution
- Paper: https://arxiv.org/abs/2303.16491
- Code: https://github.com/Ree1s/IDM
- Tags: Diffusion
CABM: Content-Aware Bit Mapping for Single Image Super-Resolution Network with Large Input
Spectral Bayesian Uncertainty for Image Super-Resolution
Cross-Guided Optimization of Radiance Fields With Multi-View Image Super-Resolution for High-Resolution Novel View Synthesis
Image Super-Resolution Using T-Tetromino Pixels
Memory-Friendly Scalable Super-Resolution via Rewinding Lottery Ticket Hypothesis
Equivalent Transformation and Dual Stream Network Construction for Mobile Image Super-Resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Chao_Equivalent_Transformation_and_Dual_Stream_Network_Construction_for_Mobile_Image_CVPR_2023_paper.html
- Code: https://github.com/ECNUSR/ETDS
Perception-Oriented Single Image Super-Resolution using Optimal Objective Estimation
OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer
- Paper: https://arxiv.org/abs/2302.03453
- Code: https://github.com/Fanghua-Yu/OSRT
- Tags: Transformer, Omnidirectional SR
B-Spline Texture Coefficients Estimator for Screen Content Image Super-Resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Pak_B-Spline_Texture_Coefficients_Estimator_for_Screen_Content_Image_Super-Resolution_CVPR_2023_paper.html
- Code: https://github.com/ByeongHyunPak/btc
Spatial-Frequency Mutual Learning for Face Super-Resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Wang_Spatial-Frequency_Mutual_Learning_for_Face_Super-Resolution_CVPR_2023_paper.html
- Code: https://github.com/wcy-cs/SFMNet
- Tags: Face SR
Learning Generative Structure Prior for Blind Text Image Super-resolution
- Paper: https://arxiv.org/abs/2303.14726
- Code: https://github.com/csxmli2016/MARCONet
- Tags: Text SR
Guided Depth Super-Resolution by Deep Anisotropic Diffusion
- Paper: https://arxiv.org/abs/2211.11592
- Code: https://github.com/prs-eth/Diffusion-Super-Resolution
- Tags: Guided Depth SR
Toward Stable, Interpretable, and Lightweight Hyperspectral Super-Resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Xie_Toward_Stable_Interpretable_and_Lightweight_Hyperspectral_Super-Resolution_CVPR_2023_paper.html
- Code: https://github.com/WenjinGuo/DAEM
- Tags: Hyperspectral SR
Zero-Shot Dual-Lens Super-Resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Xu_Zero-Shot_Dual-Lens_Super-Resolution_CVPR_2023_paper.html
- Code: https://github.com/XrKang/ZeDuSR
Probability-based Global Cross-modal Upsampling for Pansharpening
- Paper: https://arxiv.org/abs/2303.13659
- Code: https://github.com/Zeyu-Zhu/PGCU
- Tags: Pansharpening (for remote sensing images)
CutMIB: Boosting Light Field Super-Resolution via Multi-View Image Blending
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Xiao_CutMIB_Boosting_Light_Field_Super-Resolution_via_Multi-View_Image_Blending_CVPR_2023_paper.html
- Code: https://github.com/zeyuxiao1997/CutMIB
- Tags: Light Field SR
Quantum Annealing for Single Image Super-Resolution
- Paper: https://arxiv.org/abs/2304.08924
- Tags: [Workshop]
Bicubic++: Slim, Slimmer, Slimmest -- Designing an Industry-Grade Super-Resolution Network
- Paper: https://arxiv.org/abs/2305.02126
- Code: https://github.com/aselsan-research-imaging-team/bicubic-plusplus
- Tags: [Workshop]
Hybrid Transformer and CNN Attention Network for Stereo Image Super-resolution
- Paper: https://arxiv.org/abs/2305.05177
- Tags: [Workshop]
Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting
Structured Sparsity Learning for Efficient Video Super-Resolution
Compression-Aware Video Super-Resolution
Learning Spatial-Temporal Implicit Neural Representations for Event-Guided Video Super-Resolution
- Paper: https://arxiv.org/abs/2303.13767
- Project: https://vlis2022.github.io/cvpr23/egvsr
- Tags: Event
Consistent Direct Time-of-Flight Video Depth Super-Resolution
- Paper: https://arxiv.org/abs/2211.08658
- Code: https://github.com/facebookresearch/DVSR/
- Tags: Depth SR
HyperThumbnail: Real-time 6K Image Rescaling with Rate-distortion Optimization
DINN360: Deformable Invertible Neural Network for Latitude-Aware 360deg Image Rescaling
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Guo_DINN360_Deformable_Invertible_Neural_Network_for_Latitude-Aware_360deg_Image_Rescaling_CVPR_2023_paper.html
- Code: https://github.com/gyc9709/DINN360
Masked Image Training for Generalizable Deep Image Denoising
Spatially Adaptive Self-Supervised Learning for Real-World Image Denoising
- Paper: https://arxiv.org/abs/2303.14934
- Code: https://github.com/nagejacob/SpatiallyAdaptiveSSID
- Tags: Self-Supervised
LG-BPN: Local and Global Blind-Patch Network for Self-Supervised Real-World Denoising
- Paper: https://arxiv.org/abs/2304.00534
- Code: https://github.com/Wang-XIaoDingdd/LGBPN
- Tags: Self-Supervised
Real-time Controllable Denoising for Image and Video
Zero-Shot Noise2Noise: Efficient Image Denoising without any Data
- Paper: https://arxiv.org/abs/2303.11253
- Code: https://colab.research.google.com/drive/1i82nyizTdszyHkaHBuKPbWnTzao8HF9b
- Tags: Zero-Shot
Patch-Craft Self-Supervised Training for Correlated Image Denoising
- Paper: https://arxiv.org/abs/2211.09919
- Tags: Self-Supervised
sRGB Real Noise Synthesizing with Neighboring Correlation-Aware Noise Model
- Paper: https://openaccess.thecvf.com/content/CVPR2023/papers/Fu_sRGB_Real_Noise_Synthesizing_With_Neighboring_Correlation-Aware_Noise_Model_CVPR_2023_paper.pdf
- Code: https://github.com/xuan611/sRGB-Real-Noise-Synthesizing
- Tags: Real Noise Synthesizing
Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising
- Paper: https://arxiv.org/abs/2304.00844
- Code: https://github.com/MyuLi/SERT
- Tags: Hyperspectral
Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations
- Paper: https://arxiv.org/abs/2303.18139
- Tags: 3D
Structure Aggregation for Cross-Spectral Stereo Image Guided Denoising
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Sheng_Structure_Aggregation_for_Cross-Spectral_Stereo_Image_Guided_Denoising_CVPR_2023_paper.html
- Code: https://github.com/lustrouselixir/SANet
- Tags: Stereo Image
Polarized Color Image Denoising
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Li_Polarized_Color_Image_Denoising_CVPR_2023_paper.html
- Code: https://github.com/bandasyou/pcdenoise
- Tags: Polarized Color Image
Structured Kernel Estimation for Photon-Limited Deconvolution
- Paper: https://arxiv.org/abs/2303.03472
- Code: https://github.com/sanghviyashiitb/structured-kernel-cvpr23
Blur Interpolation Transformer for Real-World Motion from Blur
Neumann Network with Recursive Kernels for Single Image Defocus Deblurring
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Quan_Neumann_Network_With_Recursive_Kernels_for_Single_Image_Defocus_Deblurring_CVPR_2023_paper.html
- Code: https://github.com/csZcWu/NRKNet
Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring
Hybrid Neural Rendering for Large-Scale Scenes with Motion Blur
Self-Supervised Non-Uniform Kernel Estimation With Flow-Based Motion Prior for Blind Image Deblurring
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Fang_Self-Supervised_Non-Uniform_Kernel_Estimation_With_Flow-Based_Motion_Prior_for_Blind_CVPR_2023_paper.html
- Code: https://github.com/Fangzhenxuan/UFPDeblur
- Tags: Self-Supervised
Uncertainty-Aware Unsupervised Image Deblurring with Deep Residual Prior
- Paper: https://arxiv.org/abs/2210.05361
- Code: https://github.com/xl-tang01/UAUDeblur
- Tags: Unsupervised
K3DN: Disparity-Aware Kernel Estimation for Dual-Pixel Defocus Deblurring
Self-Supervised Blind Motion Deblurring With Deep Expectation Maximization
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Li_Self-Supervised_Blind_Motion_Deblurring_With_Deep_Expectation_Maximization_CVPR_2023_paper.html
- Tags: Self-Supervised
HyperCUT: Video Sequence from a Single Blurry Image using Unsupervised Ordering
Deep Discriminative Spatial and Temporal Network for Efficient Video Deblurring
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Pan_Deep_Discriminative_Spatial_and_Temporal_Network_for_Efficient_Video_Deblurring_CVPR_2023_paper.html
- Code: https://github.com/xuboming8/DSTNet
Learning A Sparse Transformer Network for Effective Image Deraining
SmartAssign: Learning a Smart Knowledge Assignment Strategy for Deraining and Desnowing
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Wang_SmartAssign_Learning_a_Smart_Knowledge_Assignment_Strategy_for_Deraining_and_CVPR_2023_paper.html
- Code: https://gitee.com/mindspore/models/tree/master/research/cv/SmartAssign
RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors
Curricular Contrastive Regularization for Physics-aware Single Image Dehazing
Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior
SCANet: Self-Paced Semi-Curricular Attention Network for Non-Homogeneous Image Dehazing
- Paper: https://arxiv.org/abs/2304.08444
- Code: https://github.com/gy65896/SCANet
- Tags: [Workshop]
Streamlined Global and Local Features Combinator (SGLC) for High Resolution Image Dehazing
- Paper: https://arxiv.org/abs/2304.13375
- Tags: [Workshop]
Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models
SMAE: Few-shot Learning for HDR Deghosting with Saturation-Aware Masked Autoencoders
A Unified HDR Imaging Method with Pixel and Patch Level
Inverting the Imaging Process by Learning an Implicit Camera Model
- Paper: https://arxiv.org/abs/2304.12748
- Code: https://github.com/xhuangcv/neucam
- Tags: generating all-in-focus photos & HDR imaging
Joint HDR Denoising and Fusion: A Real-World Mobile HDR Image Dataset
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Liu_Joint_HDR_Denoising_and_Fusion_A_Real-World_Mobile_HDR_Image_CVPR_2023_paper.html
- Code: https://github.com/shuaizhengliu/Joint-HDRDN
HDR Imaging with Spatially Varying Signal-to-Noise Ratios
1000 FPS HDR Video with a Spike-RGB Hybrid Camera
Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
A Unified Pyramid Recurrent Network for Video Frame Interpolation
BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
Exploring Discontinuity for Video Frame Interpolation
- Paper: https://arxiv.org/abs/2202.07291
- Code: https://github.com/pandatimo/Exploring-Discontinuity-for-VFI
Frame Interpolation Transformer and Uncertainty Guidance
Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation
Range-Nullspace Video Frame Interpolation With Focalized Motion Estimation
Event-based Video Frame Interpolation with Cross-Modal Asymmetric Bidirectional Motion Fields
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Kim_Event-Based_Video_Frame_Interpolation_With_Cross-Modal_Asymmetric_Bidirectional_Motion_Fields_CVPR_2023_paper.html
- Code: https://github.com/intelpro/CBMNet
- Tags: Event-based
Event-based Blurry Frame Interpolation under Blind Exposure
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Weng_Event-Based_Blurry_Frame_Interpolation_Under_Blind_Exposure_CVPR_2023_paper.html
- Code: https://github.com/WarranWeng/EBFI-BE
- Tags: Event-based
Event-Based Frame Interpolation with Ad-hoc Deblurring
- Paper: https://arxiv.org/abs/2301.05191
- Tags: Event-based
Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time
- Paper: https://arxiv.org/abs/2303.15043
- Code: https://github.com/shangwei5/VIDUE
- Tags: Frame Interpolation and Deblurring
Realistic Saliency Guided Image Enhancement
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Miangoleh_Realistic_Saliency_Guided_Image_Enhancement_CVPR_2023_paper.html
- Code: https://github.com/compphoto/RealisticImageEnhancement
Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement
- Paper: https://arxiv.org/abs/2304.07039
- Code: https://github.com/langmanbusi/Semantic-Aware-Low-Light-Image-Enhancement
Visibility Constrained Wide-band Illumination Spectrum Design for Seeing-in-the-Dark
- Paper: https://arxiv.org/abs/2303.11642
- Code: https://github.com/MyNiuuu/VCSD
- Tags: NIR2RGB
DNF: Decouple and Feedback Network for Seeing in the Dark
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Jin_DNF_Decouple_and_Feedback_Network_for_Seeing_in_the_Dark_CVPR_2023_paper.html
- Code: https://github.com/Srameo/DNF
You Do Not Need Additional Priors or Regularizers in Retinex-Based Low-Light Image Enhancement
Low-Light Image Enhancement via Structure Modeling and Guidance
Learning a Simple Low-light Image Enhancer from Paired Low-light Instances
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Fu_Learning_a_Simple_Low-Light_Image_Enhancer_From_Paired_Low-Light_Instances_CVPR_2023_paper.html
- Code: https://github.com/zhenqifu/pairlie
LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization
Semi-supervised Parametric Real-world Image Harmonization
- Paper: https://arxiv.org/abs/2303.00157
- Project: https://kewang0622.github.io/sprih/
PCT-Net: Full Resolution Image Harmonization Using Pixel-Wise Color Transformations
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Guerreiro_PCT-Net_Full_Resolution_Image_Harmonization_Using_Pixel-Wise_Color_Transformations_CVPR_2023_paper.html
- Code: https://github.com/rakutentech/PCT-Net-Image-Harmonization/
ObjectStitch: Object Compositing With Diffusion Model
NUWA-LIP: Language-Guided Image Inpainting With Defect-Free VQGAN
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Ni_NUWA-LIP_Language-Guided_Image_Inpainting_With_Defect-Free_VQGAN_CVPR_2023_paper.html
- Code: https://github.com/kodenii/NUWA-LIP
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
Semi-Supervised Video Inpainting with Cycle Consistency Constraints
Deep Stereo Video Inpainting
Referring Image Matting
Adaptive Human Matting for Dynamic Videos
Mask-Guided Matting in the Wild
End-to-End Video Matting With Trimap Propagation
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Huang_End-to-End_Video_Matting_With_Trimap_Propagation_CVPR_2023_paper.html
- Code: https://github.com/csvt32745/FTP-VM
Ultrahigh Resolution Image/Video Matting With Spatio-Temporal Sparsity
Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger
Context-based Trit-Plane Coding for Progressive Image Compression
Learned Image Compression with Mixed Transformer-CNN Architectures
NVTC: Nonlinear Vector Transform Coding
Multi-Realism Image Compression with a Conditional Generator
LVQAC: Lattice Vector Quantization Coupled with Spatially Adaptive Companding for Efficient Learned Image Compression
Neural Video Compression with Diverse Contexts
Video Compression With Entropy-Constrained Neural Representations
Complexity-Guided Slimmable Decoder for Efficient Deep Video Compression
MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding
Motion Information Propagation for Neural Video Compression
Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Alexandre_Hierarchical_B-Frame_Video_Coding_Using_Two-Layer_CANF_Without_Motion_Coding_CVPR_2023_paper.html
- Code: https://github.com/nycu-clab/tlzmc-cvpr
Quality-aware Pre-trained Models for Blind Image Quality Assessment
Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method
Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild
An Image Quality Assessment Dataset for Portraits
MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Zhang_MD-VQA_Multi-Dimensional_Quality_Assessment_for_UGC_Live_Videos_CVPR_2023_paper.html
- Code: https://github.com/zzc-1998/MD-VQA
CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Boutros_CR-FIQA_Face_Image_Quality_Assessment_by_Learning_Sample_Relative_Classifiability_CVPR_2023_paper.html
- Code: https://github.com/fdbtrs/CR-FIQA
SB-VQA: A Stack-Based Video Quality Assessment Framework for Video Enhancement
- Paper: https://arxiv.org/abs/2305.08408
- Tags: [Workshop]
Fix the Noise: Disentangling Source Feature for Controllable Domain Translation
Neural Preset for Color Style Transfer
CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer
StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer
- Paper: https://arxiv.org/abs/2304.02744
- Project: https://stylegan-salon.github.io/
Modernizing Old Photos Using Multiple References via Photorealistic Style Transfer
- Paper: https://arxiv.org/abs/2304.04461
- Project: https://kaist-viclab.github.io/old-photo-modernization/
QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity
Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer
Learning Dynamic Style Kernels for Artistic Style Transfer
Inversion-Based Style Transfer with Diffusion Models
Imagic: Text-Based Real Image Editing with Diffusion Models
SINE: SINgle Image Editing with Text-to-Image Diffusion Models
CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing
SIEDOB: Semantic Image Editing by Disentangling Object and Background
DiffusionRig: Learning Personalized Priors for Facial Appearance Editing
Paint by Example: Exemplar-based Image Editing with Diffusion Models
StyleRes: Transforming the Residuals for Real Image Editing With StyleGAN
Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint
InstructPix2Pix: Learning to Follow Image Editing Instructions
Deep Curvilinear Editing: Commutative and Nonlinear Image Manipulation for Pretrained Deep Generative Model
Null-text Inversion for Editing Real Images using Guided Diffusion Models
- Paper: https://arxiv.org/abs/2211.09794
- Code: https://github.com/google/prompt-to-prompt/#null-text-inversion-for-editing-real-images
DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation
Text-Guided Unsupervised Latent Transformation for Multi-Attribute Image Manipulation
EDICT: Exact Diffusion Inversion via Coupled Transformations
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
- Paper: https://arxiv.org/abs/2212.02802
- Code: https://github.com/man805/Diffusion-Video-Autoencoders
Shape-aware Text-driven Layered Video Editing
- Paper: https://arxiv.org/abs/2301.13173
- Project: https://text-video-edit.github.io/#
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Scaling up GANs for Text-to-Image Synthesis
Variational Distribution Learning for Unsupervised Text-to-Image Generation
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation
Shifted Diffusion for Text-to-image Generation
ReCo: Region-Controlled Text-to-Image Generation
RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation With Natural Prompts
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Liu_RIATIG_Reliable_and_Imperceptible_Adversarial_Text-to-Image_Generation_With_Natural_Prompts_CVPR_2023_paper.html
- Code: https://github.com/WUSTL-CSPL/RIATIG
GLIGEN: Open-Set Grounded Text-to-Image Generation
Multi-Concept Customization of Text-to-Image Diffusion
ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model With Knowledge-Enhanced Mixture-of-Denoising-Experts
Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
- Paper: https://arxiv.org/abs/2212.08698
- Code: https://github.com/UCSB-NLP-Chang/DiffusionDisentanglement
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models To Learn Any Unseen Style
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Lu_Specialist_Diffusion_Plug-and-Play_Sample-Efficient_Fine-Tuning_of_Text-to-Image_Diffusion_Models_To_CVPR_2023_paper.html
- Code: https://github.com/Picsart-AI-Research/Specialist-Diffusion
MAGVLT: Masked Generative Vision-and-Language Transformer
Freestyle Layout-to-Image Synthesis
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
- Paper: https://arxiv.org/abs/2303.17490
- Project: https://sound2scene.github.io/
Collaborative Diffusion for Multi-Modal Face Generation and Editing
SpaText: Spatio-Textual Representation for Controllable Image Generation
Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data
High-Fidelity Guided Image Synthesis with Latent Diffusion Models
- Paper: https://arxiv.org/abs/2211.17084
- Code: https://github.com/1jsingh/GradOP-Guided-Image-Synthesis
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
Person Image Synthesis via Denoising Diffusion Model
Picture that Sketch: Photorealistic Image Generation from Abstract Sketches
Fine-Grained Face Swapping via Regional GAN Inversion
Masked and Adaptive Transformer for Exemplar Based Image Translation
Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
- Paper: https://arxiv.org/abs/2304.03119
- Code: https://github.com/Picsart-AI-Research/IPL-Zero-Shot-Generative-Model-Adaptation
StyleGene: Crossover and Mutation of Region-Level Facial Genes for Kinship Face Synthesis
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Li_StyleGene_Crossover_and_Mutation_of_Region-Level_Facial_Genes_for_Kinship_CVPR_2023_paper.html
- Code: https://github.com/CVI-SZU/StyleGene
Unpaired Image-to-Image Translation With Shortest Path Regularization
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Xie_Unpaired_Image-to-Image_Translation_With_Shortest_Path_Regularization_CVPR_2023_paper.html
- Code: https://github.com/Mid-Push/santa
BBDM: Image-to-image Translation with Brownian Bridge Diffusion Models
MaskSketch: Unpaired Structure-guided Masked Image Generation
AdaptiveMix: Improving GAN Training via Feature Space Shrinkage
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Liu_AdaptiveMix_Improving_GAN_Training_via_Feature_Space_Shrinkage_CVPR_2023_paper.html
- Code: https://github.com/WentianZhang-ML/AdaptiveMix
MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
Regularized Vector Quantization for Tokenized Image Synthesis
Exploring Incompatible Knowledge Transfer in Few-shot Image Generation
Post-training Quantization on Diffusion Models
LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation
DiffCollage: Parallel Generation of Large Content with Diffusion Models
Few-shot Semantic Image Synthesis with Class Affinity Transfer
NoisyTwins: Class-Consistent and Diverse Image Generation through StyleGANs
DCFace: Synthetic Face Generation with Dual Condition Diffusion Model
Class-Balancing Diffusion Models
Spider GAN: Leveraging Friendly Neighbors to Accelerate GAN Training
Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization
- Paper: https://arxiv.org/abs/2305.11718
- Code: https://github.com/CrossmodalGroup/DynamicVectorQuantization
Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation
- Paper: https://arxiv.org/abs/2305.13607
- Code: https://github.com/CrossmodalGroup/MaskedVectorQuantization
Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis
Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis
GLeaD: Improving GANs with A Generator-Leading Task
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Bai_GLeaD_Improving_GANs_With_a_Generator-Leading_Task_CVPR_2023_paper.html
- Code: https://github.com/EzioBy/glead
Where Is My Spot? Few-Shot Image Generation via Latent Subspace Optimization
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Zheng_Where_Is_My_Spot_Few-Shot_Image_Generation_via_Latent_Subspace_CVPR_2023_paper.html
- Code: https://github.com/chansey0529/LSO
KD-DLGAN: Data Limited Image Generation via Knowledge Distillation
Private Image Generation With Dual-Purpose Auxiliary Classifier
SceneComposer: Any-Level Semantic Image Synthesis
Exploring Intra-Class Variation Factors With Learnable Cluster Prompts for Semi-Supervised Image Synthesis
Re-GAN: Data-Efficient GANs Training via Architectural Reconfiguration
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Saxena_Re-GAN_Data-Efficient_GANs_Training_via_Architectural_Reconfiguration_CVPR_2023_paper.html
- Code: https://github.com/IntellicentAI-Lab/Re-GAN
Discriminator-Cooperated Feature Map Distillation for GAN Compression
Wavelet Diffusion Models are fast and scalable Image Generators
On Distillation of Guided Diffusion Models
Binary Latent Diffusion
All are Worth Words: A ViT Backbone for Diffusion Models
Towards Practical Plug-and-Play Diffusion Models
Lookahead Diffusion Probabilistic Models for Refining Mean Estimation
Diffusion Probabilistic Model Made Slim
Self-Guided Diffusion Models
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
Video Probabilistic Diffusion Models in Projected Latent Space
Decomposed Diffusion Models for High-Quality Video Generation
MoStGAN-V: Video Generation with Temporal Motion Styles
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Ruan_MM-Diffusion_Learning_Multi-Modal_Diffusion_Models_for_Joint_Audio_and_Video_CVPR_2023_paper.html
- Code: https://github.com/researchmm/MM-Diffusion
Dimensionality-Varying Diffusion Process
Perspective Fields for Single Image Camera Calibration
DC2: Dual-Camera Defocus Control by Learning to Refocus
- Paper: https://arxiv.org/abs/2304.03285
- Project: https://defocus-control.github.io/
Images Speak in Images: A Generalist Painter for In-Context Visual Learning
Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models
LightPainter: Interactive Portrait Relighting with Freehand Scribble
- Paper: https://arxiv.org/abs/2303.12950
- Tags: Portrait Relighting
Neural Texture Synthesis with Guided Correspondence
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Zhou_Neural_Texture_Synthesis_With_Guided_Correspondence_CVPR_2023_paper.html
- Code: https://github.com/EliotChenKJ/Guided-Correspondence-Loss
- Tags: Texture Synthesis
Uncurated Image-Text Datasets: Shedding Light on Demographic Bias
Large-capacity and Flexible Video Steganography via Invertible Neural Network
- Paper: https://arxiv.org/abs/2304.12300
- Code: https://github.com/MC-E/LF-VSN
- Tags: Steganography
Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
- Paper: https://arxiv.org/abs/2304.14406
- Code: https://github.com/adobe-research/affordance-insertion
Controllable Light Diffusion for Portraits
- Paper: https://arxiv.org/abs/2305.04745
- Tags: Relighting
Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert
High-Fidelity and Freely Controllable Talking Head Video Generation
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Identity-Preserving Talking Face Generation with Landmark and Appearance Priors
LipFormer: High-Fidelity and Generalizable Talking Face Generation With a Pre-Learned Facial Codebook
High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning
DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation
GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning
Linking Garment With Person via Semantically Associated Landmarks for Virtual Try-On
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Yan_Linking_Garment_With_Person_via_Semantically_Associated_Landmarks_for_Virtual_CVPR_2023_paper.html
- Code: https://modelscope.cn/datasets/damo/SAL-HG/summary
TryOnDiffusion: A Tale of Two UNets
CF-Font: Content Fusion for Few-shot Font Generation
- Paper: https://arxiv.org/abs/2303.14017
- Code: https://github.com/wangchi95/CF-Font
- Tags: Font Generation
Neural Transformation Fields for Arbitrary-Styled Font Generation
- Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Fu_Neural_Transformation_Fields_for_Arbitrary-Styled_Font_Generation_CVPR_2023_paper.html
- Code: https://github.com/fubinfb/NTF
DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality
Handwritten Text Generation from Visual Archetypes
- Paper: https://arxiv.org/abs/2303.15269
- Tags: Handwriting Generation
Disentangling Writer and Character Styles for Handwriting Generation
- Paper: https://arxiv.org/abs/2303.14736
- Code: https://github.com/dailenson/SDT
- Tags: Handwriting Generation
Conditional Text Image Generation With Diffusion Models
Unifying Layout Generation with a Decoupled Diffusion Model
Unsupervised Domain Adaption with Pixel-level Discriminator for Image-aware Layout Generation
PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout
- Paper: https://arxiv.org/abs/2303.15937
- Code: https://github.com/PKU-ICST-MIPL/PosterLayout-CVPR2023
LayoutDM: Discrete Diffusion Model for Controllable Layout Generation
LayoutDM: Transformer-based Diffusion Model for Layout Generation