

Low-level and High-level tasks

Low-level tasks: Common ones include super-resolution, denoising, deblurring, dehazing, low-light enhancement, artifact removal, and so on. Put simply, the goal is to restore an image with a specific degradation into a good-looking one. Today these ill-posed problems are mostly solved by training end-to-end models. The main objective metrics are PSNR and SSIM, and published methods already score very high on them. The field currently faces the following problems:

  • Generalization is poor: on a different dataset, performance on the same task degrades.
  • There is a gap between objective metrics and subjective perception.
  • Deployment is hard: SOTA models are computationally heavy (hundreds of GFLOPs), which makes them impractical to use as-is.
  • The field is oriented toward practical, user-facing problems; for example, the night-scene and beautification modes in mobile phones rely on these algorithms.
  • Companies working on low-level vision are mostly mobile phone manufacturers (Huawei, Xiaomi, OPPO, vivo), security companies (Hikvision, Dahua), camera makers (DJI, ISP manufacturers), drone makers (DJI), and video websites (Bilibili, Kuaishou, etc.). In general, any scenario involving image or video enhancement is a low-level vision problem.
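
As a concrete reference for the objective metrics mentioned above, the sketch below computes PSNR from its standard definition, 10 * log10(MAX^2 / MSE). It is a minimal, dependency-free illustration: the function name `psnr`, the flat-list image representation, and the sample pixel values are assumptions made for this example and are not taken from any listed paper's code.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are passed as flat sequences of pixel values. This layout is an
    assumption made to keep the sketch free of third-party dependencies.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-pixel example: a clean signal and a slightly noisy copy.
clean = [52, 55, 61, 59, 79, 61, 76, 61]
noisy = [54, 53, 60, 62, 77, 63, 75, 60]
print(f"PSNR: {psnr(clean, noisy):.2f} dB")
```

A higher PSNR means a smaller pixel-wise error, which is exactly why it can disagree with subjective perception: it does not model how visible an error is to a human viewer.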

High-level tasks: classification, detection, segmentation, and so on. Public training data generally consists of high-quality images. When degraded images are fed in, performance drops, even if the network was trained with heavy data augmentation (transformations of shape, brightness, chroma, etc.). Real application scenarios are never as clean as the training set: various degradations arise during image capture, so low-level and high-level methods need to be combined. Broadly, there are three ways to combine them:

  • Fine-tune the high-level model directly on degraded images.
  • Pass the input through a low-level enhancement network first, then feed the result to the high-level model, with the two trained separately.
  • Train the enhancement network and the high-level model (such as a classifier) jointly.
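
The second strategy (enhance first, then recognize) can be illustrated with a toy pipeline. Everything here is hypothetical and chosen only for illustration: `degrade` simulates a low-light capture, `enhance` stands in for a low-level enhancement network, and `classify` stands in for a high-level model. A real system would use trained networks for both stages.

```python
def degrade(image, gain=0.3):
    """Toy degradation: simulate a low-light capture by dimming every pixel."""
    return [p * gain for p in image]


def enhance(image, gain=1.0 / 0.3):
    """Stand-in for a low-level enhancement network.

    Assumes the degradation is a known global gain, which real enhancement
    networks cannot assume; values are clipped to the 8-bit range.
    """
    return [min(255.0, p * gain) for p in image]


def classify(image, threshold=150.0):
    """Stand-in for a high-level model tuned on clean, well-exposed images.

    Reports "object" if any pixel is brighter than the threshold.
    """
    return "object" if max(image) > threshold else "background"


clean = [30, 40, 200, 220, 35]  # hypothetical scene with one bright object

print(classify(clean))                    # clean input
print(classify(degrade(clean)))           # degraded input, no enhancement
print(classify(enhance(degrade(clean))))  # enhance first, then classify
```

The classifier fails on the dimmed input because its decision rule was fixed on clean data, and it recovers once the enhancement stage runs first. The other two strategies (fine-tuning on degraded data and joint training) require a training loop and are not shown.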

Table of contents


Image Restoration

Efficient and Explicit Modeling of Image Hierarchies for Image Restoration

Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective

Generative Diffusion Prior for Unified Image Restoration and Enhancement

Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank

Nighttime Smartphone Reflective Flare Removal Using Optical Center Symmetry Prior

Image Reconstruction

Raw Image Reconstruction with Learned Compact Metadata

High-resolution image reconstruction with latent diffusion models from human brain activity

DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration

Burst Restoration

Burstormer: Burst Image Restoration and Enhancement Transformer

Video Restoration

Blind Video Deflickering by Neural Filtering with a Flawed Atlas

Super Resolution

Image Super Resolution

Activating More Pixels in Image Super-Resolution Transformer

N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution

Omni Aggregation Networks for Lightweight Image Super-Resolution

OPE-SR: Orthogonal Position Encoding for Designing a Parameter-free Upsampling Module in Arbitrary-scale Image Super-Resolution

Local Implicit Normalizing Flow for Arbitrary-Scale Image Super-Resolution

Super-Resolution Neural Operator

Human Guided Ground-truth Generation for Realistic Image Super-resolution

Implicit Diffusion Models for Continuous Super-Resolution

Zero-Shot Dual-Lens Super-Resolution

Learning Generative Structure Prior for Blind Text Image Super-resolution

Guided Depth Super-Resolution by Deep Anisotropic Diffusion

Video Super Resolution

Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting

Structured Sparsity Learning for Efficient Video Super-Resolution

Image Rescaling

HyperThumbnail: Real-time 6K Image Rescaling with Rate-distortion Optimization

Denoising

Image Denoising

Masked Image Training for Generalizable Deep Image Denoising

Spatially Adaptive Self-Supervised Learning for Real-World Image Denoising

LG-BPN: Local and Global Blind-Patch Network for Self-Supervised Real-World Denoising

Real-time Controllable Denoising for Image and Video

Deblurring

Image Deblurring

Structured Kernel Estimation for Photon-Limited Deconvolution

Blur Interpolation Transformer for Real-World Motion from Blur

Neumann Network with Recursive Kernels for Single Image Defocus Deblurring

Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring

Deraining

Learning A Sparse Transformer Network for Effective Image Deraining

Dehazing

RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors

Curricular Contrastive Regularization for Physics-aware Single Image Dehazing

Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior

HDR Imaging / Multi-Exposure Image Fusion

Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models

Frame Interpolation

Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation

A Unified Pyramid Recurrent Network for Video Frame Interpolation

BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation

Event-based Video Frame Interpolation with Cross-Modal Asymmetric Bidirectional Motion Fields

Event-based Blurry Frame Interpolation under Blind Exposure

Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time

Image Enhancement

Low-Light Image Enhancement

Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement

Visibility Constrained Wide-band Illumination Spectrum Design for Seeing-in-the-Dark

Image Matting

Referring Image Matting

Shadow Removal

ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal

Image Compression

Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger

Context-based Trit-Plane Coding for Progressive Image Compression

Learned Image Compression with Mixed Transformer-CNN Architectures

Video Compression

Neural Video Compression with Diverse Contexts

Image Quality Assessment

Quality-aware Pre-trained Models for Blind Image Quality Assessment

Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method

Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild

Style Transfer

Fix the Noise: Disentangling Source Feature for Controllable Domain Translation

Neural Preset for Color Style Transfer

CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer

StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer

Image Editing

Imagic: Text-Based Real Image Editing with Diffusion Models

SINE: SINgle Image Editing with Text-to-Image Diffusion Models

CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing

DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation

SIEDOB: Semantic Image Editing by Disentangling Object and Background

Image Generation/Synthesis / Image-to-Image Translation

Text-to-Image / Text Guided / Multi-Modal

Multi-Concept Customization of Text-to-Image Diffusion

GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis

Scaling up GANs for Text-to-Image Synthesis

MAGVLT: Masked Generative Vision-and-Language Transformer

Freestyle Layout-to-Image Synthesis

Variational Distribution Learning for Unsupervised Text-to-Image Generation

Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment

Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation

Image-to-Image / Image Guided

LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data

Person Image Synthesis via Denoising Diffusion Model

Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

Fine-Grained Face Swapping via Regional GAN Inversion

Masked and Adaptive Transformer for Exemplar Based Image Translation

Zero-shot Generative Model Adaptation via Image-specific Prompt Learning

Others for image generation

AdaptiveMix: Robust Feature Representation via Shrinking Feature Space

MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

Regularized Vector Quantization for Tokenized Image Synthesis

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization

Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation

Exploring Incompatible Knowledge Transfer in Few-shot Image Generation

Post-training Quantization on Diffusion Models

LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation

DiffCollage: Parallel Generation of Large Content with Diffusion Models

Few-shot Semantic Image Synthesis with Class Affinity Transfer

Video Generation

Conditional Image-to-Video Generation with Latent Flow Diffusion Models

Video Probabilistic Diffusion Models in Projected Latent Space

DPE: Disentanglement of Pose and Expression for General Video Portrait Editing

Decomposed Diffusion Models for High-Quality Video Generation

Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding

MoStGAN: Video Generation with Temporal Motion Styles


DC2: Dual-Camera Defocus Control by Learning to Refocus

Images Speak in Images: A Generalist Painter for In-Context Visual Learning

Unifying Layout Generation with a Decoupled Diffusion Model

Unsupervised Domain Adaptation with Pixel-level Discriminator for Image-aware Layout Generation

PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout

LayoutDM: Discrete Diffusion Model for Controllable Layout Generation

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models

LightPainter: Interactive Portrait Relighting with Freehand Scribble

Neural Texture Synthesis with Guided Correspondence

CF-Font: Content Fusion for Few-shot Font Generation

DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality

Handwritten Text Generation from Visual Archetypes

Disentangling Writer and Character Styles for Handwriting Generation

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

Uncurated Image-Text Datasets: Shedding Light on Demographic Bias


Image Restoration

Restormer: Efficient Transformer for High-Resolution Image Restoration

Uformer: A General U-Shaped Transformer for Image Restoration

MAXIM: Multi-Axis MLP for Image Processing

All-In-One Image Restoration for Unknown Corruption

Fourier Document Restoration for Robust Document Dewarping and Recognition

Exploring and Evaluating Image Restoration Potential in Dynamic Scenes

ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior

Deep Generalized Unfolding Networks for Image Restoration

Attentive Fine-Grained Structured Sparsity for Image Restoration

Self-Supervised Deep Image Restoration via Adaptive Stochastic Gradient Langevin Dynamics

KNN Local Attention for Image Restoration

GIQE: Generic Image Quality Enhancement via Nth Order Iterative Degradation

TransWeather: Transformer-based Restoration of Images Degraded by Adverse Weather Conditions

Learning Multiple Adverse Weather Removal via Two-stage Knowledge Learning and Multi-contrastive Regularization: Toward a Unified Model

Rethinking Deep Face Restoration

RestoreFormer: High-Quality Blind Face Restoration From Undegraded Key-Value Pairs

Blind Face Restoration via Integrating Face Shape and Generative Priors

End-to-End Rubbing Restoration Using Generative Adversarial Networks

GenISP: Neural ISP for Low-Light Machine Cognition

Burst Restoration

A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift

Burst Image Restoration and Enhancement

Video Restoration

Revisiting Temporal Alignment for Video Restoration

Neural Compression-Based Feature Learning for Video Restoration

Bringing Old Films Back to Life

Neural Global Shutter: Learn to Restore Video from a Rolling Shutter Camera with Global Reset Feature

Context-Aware Video Reconstruction for Rolling Shutter Cameras

E2V-SDE: From Asynchronous Events to Fast and Continuous Video Reconstruction via Neural Stochastic Differential Equations

Hyperspectral Image Reconstruction

Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction

HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging

Super Resolution

Image Super Resolution

Reflash Dropout in Image Super-Resolution

Residual Local Feature Network for Efficient Super-Resolution

Learning the Degradation Distribution for Blind Image Super-Resolution

Deep Constrained Least Squares for Blind Image Super-Resolution

Blind Image Super-resolution with Elaborate Degradation Modeling on Noise and Kernel

Details or Artifacts: A Locally Discriminative Learning Approach to Realistic Image Super-Resolution

Dual Adversarial Adaptation for Cross-Device Real-World Image Super-Resolution

LAR-SR: A Local Autoregressive Model for Image Super-Resolution

Texture-Based Error Analysis for Image Super-Resolution

Learning to Zoom Inside Camera Imaging Pipeline

Task Decoupled Framework for Reference-Based Super-Resolution

GCFSR: a Generative and Controllable Face Super Resolution Method Without Facial and GAN Priors

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution

Learning Graph Regularization for Guided Super-Resolution

Transformer-empowered Multi-scale Contextual Matching and Aggregation for Multi-contrast MRI Super-resolution

Discrete Cosine Transform Network for Guided Depth Map Super-Resolution

SphereSR: 360° Image Super-Resolution With Arbitrary Projection via Continuous Spherical Image Representation

IMDeception: Grouped Information Distilling Super-Resolution Network

A Closer Look at Blind Super-Resolution: Degradation Models, Baselines, and Performance Upper Bounds

Burst/Multi-frame Super Resolution

Self-Supervised Super-Resolution for Multi-Exposure Push-Frame Satellites

Video Super Resolution

BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment

Learning Trajectory-Aware Transformer for Video Super-Resolution

Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling

Investigating Tradeoffs in Real-World Video Super-Resolution

Memory-Augmented Non-Local Attention for Video Super-Resolution

Stable Long-Term Recurrent Video Super-Resolution

Reference-based Video Super-Resolution Using Multi-Camera Video Triplets

A New Dataset and Transformer for Stereoscopic Video Super-Resolution

Image Rescaling

Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence

Faithful Extreme Rescaling via Generative Prior Reciprocated Invertible Representations

Denoising

Image Denoising

Self-Supervised Image Denoising via Iterative Data Refinement

Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots

AP-BSN: Self-Supervised Denoising for Real-World Images via Asymmetric PD and Blind-Spot Network

CVF-SID: Cyclic multi-Variate Function for Self-Supervised Image Denoising by Disentangling Noise from Image

Noise Distribution Adaptive Self-Supervised Image Denoising Using Tweedie Distribution and Score Matching

Noise2NoiseFlow: Realistic Camera Noise Modeling without Clean Images

Modeling sRGB Camera Noise with Normalizing Flows

Estimating Fine-Grained Noise Model via Contrastive Learning

Multiple Degradation and Reconstruction Network for Single Image Denoising via Knowledge Distillation

Burst Denoising

NAN: Noise-Aware NeRFs for Burst-Denoising

Video Denoising

Dancing under the stars: video denoising in starlight

Deblurring

Image Deblurring

Learning to Deblur using Light Field Generated and Real Defocus Images

Pixel Screening Based Intermediate Correction for Blind Deblurring

Deblurring via Stochastic Refinement

XYDeblur: Divide and Conquer for Single Image Deblurring

Unifying Motion Deblurring and Frame Interpolation with Events

E-CIR: Event-Enhanced Continuous Intensity Recovery

Video Deblurring

Multi-Scale Memory-Based Video Deblurring

Deraining

Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond

Unpaired Deep Image Deraining Using Dual Contrastive Learning

Unsupervised Deraining: Where Contrastive Learning Meets Self-similarity

Dreaming To Prune Image Deraining Networks

Dehazing

Self-augmented Unpaired Image Dehazing via Density and Depth Decomposition

Towards Multi-Domain Single Image Dehazing via Test-Time Training

Image Dehazing Transformer With Transmission-Aware 3D Position Embedding

Physically Disentangled Intra- and Inter-Domain Adaptation for Varicolored Haze Removal

Demoireing

Video Demoireing with Relation-Based Temporal Consistency

Frame Interpolation

ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation

Long-term Video Frame Interpolation via Feature Propagation

Many-to-many Splatting for Efficient Video Frame Interpolation

Video Frame Interpolation with Transformer

Video Frame Interpolation Transformer

IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation

TimeReplayer: Unlocking the Potential of Event Cameras for Video Interpolation

Time Lens++: Event-based Frame Interpolation with Parametric Non-linear Flow and Multi-scale Fusion

Unifying Motion Deblurring and Frame Interpolation with Events

Multi-encoder Network for Parameter Reduction of a Kernel-based Interpolation Architecture

Spatial-Temporal Video Super-Resolution

RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution

Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video Super-Resolution via Cycle-Projected Mutual Learning

VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution

Image Enhancement

AdaInt: Learning Adaptive Intervals for 3D Lookup Tables on Real-time Image Enhancement

Exposure Correction Model to Enhance Image Quality

Low-Light Image Enhancement

Abandoning the Bayer-Filter to See in the Dark

Toward Fast, Flexible, and Robust Low-Light Image Enhancement

Deep Color Consistent Network for Low-Light Image Enhancement

SNR-Aware Low-Light Image Enhancement

URetinex-Net: Retinex-Based Deep Unfolding Network for Low-Light Image Enhancement

Image Harmonization

High-Resolution Image Harmonization via Collaborative Dual Transformations

SCS-Co: Self-Consistent Style Contrastive Learning for Image Harmonization

Deep Image-based Illumination Harmonization

Image Completion/Inpainting

Bridging Global Context Interactions for High-Fidelity Image Completion

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding

MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting

MAT: Mask-Aware Transformer for Large Hole Image Inpainting

Reduce Information Loss in Transformers for Pluralistic Image Inpainting

RePaint: Inpainting using Denoising Diffusion Probabilistic Models

Dual-Path Image Inpainting With Auxiliary GAN Inversion

SaiNet: Stereo aware inpainting behind objects with generative networks

Video Inpainting

Towards An End-to-End Framework for Flow-Guided Video Inpainting

The DEVIL Is in the Details: A Diagnostic Evaluation Benchmark for Video Inpainting

DLFormer: Discrete Latent Transformer for Video Inpainting

Inertia-Guided Flow Completion and Style Fusion for Video Inpainting

Image Matting

MatteFormer: Transformer-Based Image Matting via Prior-Tokens

Human Instance Matting via Mutual Guidance and Multi-Instance Refinement

Boosting Robustness of Image Matting with Context Assembly and Strong Data Augmentation

Shadow Removal

Bijective Mapping Network for Shadow Removal


Face Relighting with Geometrically Consistent Shadows

SIMBAR: Single Image-Based Scene Relighting For Effective Data Augmentation For Automated Driving Vision Tasks

Image Stitching

Deep Rectangling for Image Stitching: A Learning Baseline

Automatic Color Image Stitching Using Quaternion Rank-1 Alignment

Geometric Structure Preserving Warp for Natural Image Stitching

Image Compression

Neural Data-Dependent Transform for Learned Image Compression

The Devil Is in the Details: Window-based Attention for Image Compression

ELIC: Efficient Learned Image Compression with Unevenly Grouped Space-Channel Contextual Adaptive Coding

Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression

DPICT: Deep Progressive Image Compression Using Trit-Planes

Joint Global and Local Hierarchical Priors for Learned Image Compression

LC-FDNet: Learned Lossless Image Compression With Frequency Decomposition Network

Practical Learned Lossless JPEG Recompression with Multi-Level Cross-Channel Entropy Model in the DCT Domain

SASIC: Stereo Image Compression With Latent Shifts and Stereo Attention

Deep Stereo Image Compression via Bi-Directional Coding

Learning Based Multi-Modality Image and Video Compression

PO-ELIC: Perception-Oriented Efficient Learned Image Coding

Video Compression

Coarse-to-fine Deep Video Coding with Hyperprior-guided Mode Prediction

LSVC: A Learning-Based Stereo Video Compression Framework

Enhancing VVC with Deep Learning based Multi-Frame Post-Processing

Image Quality Assessment

Personalized Image Aesthetics Assessment with Rich Attributes

Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment

SwinIQA: Learned Swin Distance for Compressed Image Quality Assessment

Image Decomposition

PIE-Net: Photometric Invariant Edge Guided Network for Intrinsic Image Decomposition

Deformable Sprites for Unsupervised Video Decomposition

Style Transfer

CLIPstyler: Image Style Transfer with a Single Text Condition

Style-ERD: Responsive and Coherent Online Motion Style Transfer

Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization

Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer

Industrial Style Transfer with Large-scale Geometric Warping and Content Preservation

StyTr2: Image Style Transfer With Transformers

PCA-Based Knowledge Distillation Towards Lightweight and Content-Style Balanced Photorealistic Style Transfer Models

Image Editing

High-Fidelity GAN Inversion for Image Attribute Editing

Style Transformer for Image Inversion and Editing

HairCLIP: Design Your Hair by Text and Reference Image

HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing

Blended Diffusion for Text-driven Editing of Natural Images

FlexIT: Towards Flexible Semantic Image Translation

SemanticStyleGAN: Learning Compositional Generative Priors for Controllable Image Synthesis and Editing

SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches

TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing

HyperInverter: Improving StyleGAN Inversion via Hypernetwork

Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing

Brain-Supervised Image Editing

SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Color Editing

M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformers

Image Generation/Synthesis / Image-to-Image Translation

Text-to-Image / Text Guided / Multi-Modal

Text to Image Generation with Semantic-Spatial Aware GAN

LAFITE: Towards Language-Free Training for Text-to-Image Generation

DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis

StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis

DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model

Sound-Guided Semantic Image Manipulation

ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation

Text-to-Image Synthesis Based on Object-Guided Joint-Decoding Transformer

Vector Quantized Diffusion Model for Text-to-Image Synthesis

AnyFace: Free-style Text-to-Face Synthesis and Manipulation

Image-to-Image / Image Guided

Maximum Spatial Perturbation Consistency for Unpaired Image-to-Image Translation

A Style-aware Discriminator for Controllable Image Translation

QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation

InstaFormer: Instance-Aware Image-to-Image Translation with Transformer

Marginal Contrastive Correspondence for Guided Image Generation

Unsupervised Image-to-Image Translation with Generative Prior

Exploring Patch-wise Semantic Relation for Contrastive Learning in Image-to-Image Translation Tasks

Neural Texture Extraction and Distribution for Controllable Person Image Synthesis

Unpaired Cartoon Image Synthesis via Gated Cycle Mapping

Day-to-Night Image Synthesis for Training Nighttime Neural ISPs

Alleviating Semantics Distortion in Unsupervised Low-Level Image-to-Image Translation via Structure Consistency Constraint

Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation

Self-Supervised Dense Consistency Regularization for Image-to-Image Translation

Drop the GAN: In Defense of Patches Nearest Neighbors as Single Image Generative Model

HairMapper: Removing Hair From Portraits Using GANs

Others for image generation

Attribute Group Editing for Reliable Few-shot Image Generation

Modulated Contrast for Versatile Image Synthesis

Interactive Image Synthesis with Panoptic Layout Generation

Autoregressive Image Generation using Residual Quantization

Dynamic Dual-Output Diffusion Models

Exploring Dual-task Correlation for Pose Guided Person Image Generation

StyleSwin: Transformer-based GAN for High-resolution Image Generation

Semantic-shape Adaptive Feature Modulation for Semantic Image Synthesis

Arbitrary-Scale Image Synthesis

InsetGAN for Full-Body Image Generation

HairMapper: Removing Hair from Portraits Using GANs

OSSGAN: Open-Set Semi-Supervised Image Generation

Retrieval-based Spatially Adaptive Normalization for Semantic Image Synthesis

A Closer Look at Few-shot Image Generation

Ensembling Off-the-shelf Models for GAN Training

Few-Shot Font Generation by Learning Fine-Grained Local Styles

Modeling Image Composition for Complex Scene Generation

Global Context With Discrete Diffusion in Vector Quantized Modeling for Image Generation

Self-supervised Correlation Mining Network for Person Image Generation

Learning To Memorize Feature Hallucination for One-Shot Image Generation

Local Attention Pyramid for Scene Image Generation

High-Resolution Image Synthesis with Latent Diffusion Models

Cluster-guided Image Synthesis with Unconditional Models

SphericGAN: Semi-Supervised Hyper-Spherical Generative Adversarial Networks for Fine-Grained Image Synthesis

DPGEN: Differentially Private Generative Energy-Guided Network for Natural Image Synthesis

DO-GAN: A Double Oracle Framework for Generative Adversarial Networks

Improving GAN Equilibrium by Raising Spatial Awareness

Polymorphic-GAN: Generating Aligned Samples Across Multiple Domains With Learned Morph Maps

Manifold Learning Benefits GANs

Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data

On Conditioning the Input Noise for Controlled Image Generation with Diffusion Models

Generate and Edit Your Own Character in a Canonical View

StyLandGAN: A StyleGAN based Landscape Image Synthesis using Depth-map

Overparameterization Improves StyleGAN Inversion

Video Generation/Synthesis

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

Playable Environments: Video Manipulation in Space and Time

StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2

Thin-Plate Spline Motion Model for Image Animation

Make It Move: Controllable Image-to-Video Generation with Text Descriptions

Diverse Video Generation from a Single Video


GAN-Supervised Dense Visual Alignment

ClothFormer: Taming Video Virtual Try-on in All Module

Iterative Deep Homography Estimation

Style-Structure Disentangled Features and Normalizing Flows for Diverse Icon Colorization

Unsupervised Homography Estimation with Coplanarity-Aware GAN

Diverse Image Outpainting via GAN Inversion

On Aliased Resizing and Surprising Subtleties in GAN Evaluation

Patch-wise Contrastive Style Learning for Instagram Filter Removal


New Trends in Image Restoration and Enhancement workshop and challenges on image and video processing.

Spectral Reconstruction from RGB

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

Perceptual Image Quality Assessment: Track 1 Full-Reference / Track 2 No-Reference

MANIQA: Multi-dimension Attention Network for No-Reference Image Quality Assessment

Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

MSTRIQ: No Reference Image Quality Assessment Based on Swin Transformer with Multi-Stage Fusion

Conformer and Blind Noisy Students for Improved Image Quality Assessment

Inpainting: Track 1 Unsupervised / Track 2 Semantic

GLaMa: Joint Spatial and Frequency Loss for General Image Inpainting

Efficient Super-Resolution

ShuffleMixer: An Efficient ConvNet for Image Super-Resolution

Edge-enhanced Feature Distillation Network for Efficient Super-Resolution

Fast and Memory-Efficient Network Towards Efficient Image Super-Resolution

Blueprint Separable Residual Network for Efficient Image Super-Resolution

Night Photography Rendering

Rendering Nighttime Image Via Cascaded Color and Brightness Compensation

Super-Resolution and Quality Enhancement of Compressed Video: Track1 (Quality enhancement) / Track2 (Quality enhancement and x2 SR) / Track3 (Quality enhancement and x4 SR)

Progressive Training of A Two-Stage Framework for Video Restoration

High Dynamic Range (HDR): Track 1 Low-complexity (fidelity constraint) / Track 2 Fidelity (low-complexity constraint)

Efficient Progressive High Dynamic Range Image Restoration via Attention and Alignment Network

Stereo Super-Resolution

Parallel Interactive Transformer

Burst Super-Resolution: Track 2 Real

BSRT: Improving Burst Super-Resolution with Swin Transformer and Flow-Guided Deformable Alignment


Image Restoration

Simple Baselines for Image Restoration

D2HNet: Joint Denoising and Deblurring with Hierarchical Network for Robust Night Image Restoration

Seeing Far in the Dark with Patterned Flash

BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks

Improving Image Restoration by Revisiting Global Information Aggregation

Fast Two-step Blind Optical Aberration Correction

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

RAWtoBit: A Fully End-to-end Camera ISP Network

Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild

Single Frame Atmospheric Turbulence Mitigation: A Benchmark Study and A New Physics-Inspired Transformer Model

Modeling Mask Uncertainty in Hyperspectral Image Reconstruction

TAPE: Task-Agnostic Prior Embedding for Image Restoration

DRCNet: Dynamic Image Restoration Contrastive Network

ART-SS: An Adaptive Rejection Technique for Semi-Supervised Restoration for Adverse Weather-Affected Images

Spectrum-Aware and Transferable Architecture Search for Hyperspectral Image Restoration

Seeing through a Black Box: Toward High-Quality Terahertz Imaging via Subspace-and-Attention Guided Restoration

JPEG Artifacts Removal via Contrastive Representation Learning

Zero-Shot Learning for Reflection Removal of Single 360-Degree Image

Overexposure Mask Fusion: Generalizable Reverse ISP Multi-Step Refinement

Video Restoration

Video Restoration Framework and Its Meta-Adaptations to Data-Poor Conditions

Super Resolution – super resolution

Image Super Resolution

ARM: Any-Time Super-Resolution Method

Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks

CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution

Image Super-Resolution with Deep Dictionary

Perception-Distortion Balanced ADMM Optimization for Single-Image Super-Resolution

Adaptive Patch Exiting for Scalable Single Image Super-Resolution

Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution

MuLUT: Cooperating Multiple Look-Up Tables for Efficient Image Super-Resolution

Efficient Long-Range Attention Network for Image Super-resolution

Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution

Restore Globally, Refine Locally: A Mask-Guided Scheme to Accelerate Super-Resolution Networks

Learning Mutual Modulation for Self-Supervised Cross-Modal Super-Resolution

Self-Supervised Learning for Real-World Super-Resolution from Dual Zoomed Observations

Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution

D2C-SR: A Divergence to Convergence Approach for Real-World Image Super-Resolution

MM-RealSR: Metric Learning based Interactive Modulation for Real-World Super-Resolution

KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution

From Face to Natural Image: Learning Real Degradation for Blind Image Super-Resolution

Unfolded Deep Kernel Estimation for Blind Image Super-Resolution

Uncertainty Learning in Kernel Estimation for Multi-stage Blind Image Super-Resolution

Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images

Reference-based Image Super-Resolution with Deformable Attention Transformer

RRSR: Reciprocal Reference-Based Image Super-Resolution with Progressive Feature Alignment and Selection

Boosting Event Stream Super-Resolution with a Recurrent Neural Network

HST: Hierarchical Swin Transformer for Compressed Image Super-resolution

Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration

Fast Nearest Convolution for Real-Time Efficient Image Super-Resolution

Video Super Resolution

Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution

A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution

Real-RawVSR: Real-World Raw Video Super-Resolution with a Benchmark Dataset

Denoising – denoising

Image Denoising

Deep Semantic Statistics Matching (D2SM) Denoising Network

Fast and High Quality Image Denoising via Malleable Convolution

Video Denoising

Unidirectional Video Denoising by Mimicking Backward Recurrent Modules with Look-ahead Forward Ones

TempFormer: Temporally Consistent Transformer for Video Denoising

Deblurring – deblurring

Image Deblurring

Learning Degradation Representations for Image Deblurring

Stripformer: Strip Transformer for Fast Image Deblurring

Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance

United Defocus Blur Detection and Deblurring via Adversarial Promotion Learning

Realistic Blur Synthesis for Learning Image Deblurring

Event-based Fusion for Motion Deblurring with Cross-modal Attention

Event-Guided Deblurring of Unknown Exposure Time Videos

Video Deblurring

Spatio-Temporal Deformable Attention Network for Video Deblurring

Efficient Video Deblurring Guided by Motion Magnitude

ERDN: Equivalent Receptive Field Deformable Network for Video Deblurring

DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting

Towards Real-World Video Deblurring by Exploring Blur Formation Process

Image Decomposition

Blind Image Decomposition

Deraining – deraining

Not Just Streaks: Towards Ground Truth for Single Image Deraining

Rethinking Video Rain Streak Removal: A New Synthesis Model and a Deraining Network with Video Rain Prior

Dehazing – dehazing

Frequency and Spatial Dual Guidance for Image Dehazing

Perceiving and Modeling Density for Image Dehazing

Boosting Supervised Dehazing Methods via Bi-Level Patch Reweighting

Unpaired Deep Image Dehazing Using Contrastive Disentanglement Learning

Demoireing – moiré removal

Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing

HDR Imaging / Multi-Exposure Image Fusion – HDR image generation / multi-exposure image fusion

Exposure-Aware Dynamic Weighted Learning for Single-Shot HDR Imaging

Ghost-free High Dynamic Range Imaging with Context-aware Transformer

Selective TransHDR: Transformer-Based Selective HDR Imaging Using Ghost Region Mask

HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields

Towards Real-World HDRTV Reconstruction: A Data Synthesis-Based Approach

Image Fusion

FusionVAE: A Deep Hierarchical Variational Autoencoder for RGB Image Fusion

Recurrent Correction Network for Fast and Efficient Multi-modality Image Fusion

Neural Image Representations for Multi-Image Fusion and Layer Separation

Fusion from Decomposition: A Self-Supervised Decomposition Approach for Image Fusion

Frame Interpolation – frame interpolation

Real-Time Intermediate Flow Estimation for Video Frame Interpolation

FILM: Frame Interpolation for Large Motion

Video Interpolation by Event-driven Anisotropic Adjustment of Optical Flow

Learning Cross-Video Neural Representations for High-Quality Frame Interpolation

Deep Bayesian Video Frame Interpolation

A Perceptual Quality Metric for Video Frame Interpolation

DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting

Spatial-Temporal Video Super-Resolution

Towards Interpretable Video Super-Resolution via Alternating Optimization

Image Enhancement – image enhancement

Local Color Distributions Prior for Image Enhancement

SepLUT: Separable Image-adaptive Lookup Tables for Real-time Image Enhancement

Neural Color Operators for Sequential Image Retouching

Deep Fourier-Based Exposure Correction Network with Spatial-Frequency Interaction

Uncertainty Inspired Underwater Image Enhancement

NEST: Neural Event Stack for Event-Based Image Enhancement

Low-Light Image Enhancement

LEDNet: Joint Low-light Enhancement and Deblurring in the Dark

Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression

Image Harmonization – image harmonization

Harmonizer: Learning to Perform White-Box Image and Video Harmonization

DCCF: Deep Comprehensive Color Filter Learning Framework for High-Resolution Image Harmonization

Semantic-Guided Multi-Mask Image Harmonization

Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization

Image Completion/Inpainting – image completion/inpainting

Learning Prior Feature and Attention Enhanced Image Inpainting

Perceptual Artifacts Localization for Inpainting

High-Fidelity Image Inpainting with GAN Inversion

Unbiased Multi-Modality Guidance for Image Inpainting

Image Inpainting with Cascaded Modulation GAN and Object-Aware Training

Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation

Diverse Image Inpainting with Normalizing Flow

Hourglass Attention Network for Image Inpainting

Don't Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context

The Surprisingly Straightforward Scene Text Removal Method with Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis

Video Inpainting

Error Compensation Framework for Flow-Guided Video Inpainting

Flow-Guided Transformer for Video Inpainting

Image Colorization – image colorization

Eliminating Gradient Conflict in Reference-based Line-art Colorization

Bridging the Domain Gap towards Generalization in Automatic Colorization

CT2: Colorization Transformer via Color Tokens

PalGAN: Image Colorization with Palette Generative Adversarial Networks

BigColor: Colorization using a Generative Color Prior for Natural Images

Semantic-Sparse Colorization Network for Deep Exemplar-Based Colorization

ColorFormer: Image Colorization via Color Memory Assisted Hybrid-Attention Transformer

L-CoDer: Language-Based Colorization with Color-Object Decoupling Transformer

Colorization for In Situ Marine Plankton Images

Image Matting – image matting

TransMatting: Enhancing Transparent Objects Matting with Transformers

One-Trimap Video Matting

Shadow Removal – shadow removal

Style-Guided Shadow Removal

Image Compression – image compression

Optimizing Image Compression via Joint Learning with Denoising

Implicit Neural Representations for Image Compression

Expanded Adaptive Scaling Normalization for End to End Image Compression

Content-Oriented Learned Image Compression

Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression

Content Adaptive Latents and Decoder for Neural Image Compression

Video Compression

AlphaVC: High-Performance and Efficient Learned Video Compression

CANF-VC: Conditional Augmented Normalizing Flows for Video Compression

Neural Video Compression Using GANs for Detail Synthesis and Propagation

Image Quality Assessment – image quality assessment

FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling

Shift-tolerant Perceptual Similarity Metric

Telepresence Video Quality Assessment

A Perceptual Quality Metric for Video Frame Interpolation

Relighting/Delighting


Deep Portrait Delighting

Geometry-Aware Single-Image Full-Body Human Relighting

NeRF for Outdoor Scene Relighting

Physically-Based Editing of Indoor Scene Lighting from a Single Image

Style Transfer – style transfer

CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer

Image-Based CLIP-Guided Essence Transfer

Learning Graph Neural Networks for Image Style Transfer

WISE: Whitebox Image Stylization by Example-based Learning

Language-Driven Artistic Style Transfer

MoDA: Map Style Transfer for Self-Supervised Domain Adaptation of Embodied Agents

JoJoGAN: One Shot Face Stylization

EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer

RamGAN: Region Attentive Morphing GAN for Region-Level Makeup Transfer

Image Editing – image editing

Context-Consistent Semantic Image Editing with Style-Preserved Modulation

GAN with Multivariate Disentangling for Controllable Hair Editing

Paint2Pix: Interactive Painting based Progressive Image Synthesis and Editing

High-fidelity GAN Inversion with Padding Space

Text2LIVE: Text-Driven Layered Image and Video Editing

IntereStyle: Encoding an Interest Region for Robust StyleGAN Inversion

Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment

HairNet: Hairstyle Transfer with Pose Changes

End-to-End Visual Editing with a Generatively Pre-trained Artist

The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing

Scraping Textures from Natural Images for Synthesis and Editing

VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance

Editing Out-of-Domain GAN Inversion via Differential Activations

ChunkyGAN: Real Image Inversion via Segments

FairStyle: Debiasing StyleGAN2 with Style Channel Manipulations

A Style-Based GAN Encoder for High Fidelity Reconstruction of Images and Videos

Rayleigh EigenDirections (REDs): Nonlinear GAN latent space traversals for multidimensional features

Image Generation/Synthesis / Image-to-Image Translation – Image Generation/Synthesis/Translation

Text-to-Image / Text Guided / Multi-Modal

TIPS: Text-Induced Pose Synthesis

TISE: A Toolbox for Text-to-Image Synthesis Evaluation

Learning Visual Styles from Audio-Visual Associations

Multimodal Conditional Image Synthesis with Product-of-Experts GANs

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion

Make-a-Scene: Scene-Based Text-to-Image Generation with Human Priors

Trace Controlled Text to Image Generation

Audio-Driven Stylized Gesture Generation with Flow-Based Model

No Token Left Behind: Explainability-Aided Image Classification and Generation

Image-to-Image / Image Guided

End-to-end Graph-constrained Vectorized Floorplan Generation with Panoptic Refinement

ManiFest: Manifold Deformation for Few-shot Image Translation

VecGAN: Image-to-Image Translation with Interpretable Latent Directions

DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation

Cross Attention Based Style Distribution for Controllable Person Image Synthesis

Vector Quantized Image-to-Image Translation

URUST: Ultra-high-resolution unpaired stain transformation via Kernelized Instance Normalization

General Object Pose Transformation Network from Unpaired Data

Unpaired Image Translation via Vector Symbolic Architectures

Supervised Attribute Information Removal and Reconstruction for Image Manipulation

Bi-Level Feature Alignment for Versatile Image Translation and Manipulation

Multi-Curve Translator for High-Resolution Photorealistic Image Translation

CoGS: Controllable Generation and Search from Sketch and Style

AgeTransGAN for Facial Age Transformation with Rectified Performance Metrics

Others for image generation

StyleLight: HDR Panorama Generation for Lighting Estimation and Editing

Accelerating Score-based Generative Models with Preconditioned Diffusion Sampling

GAN Cocktail: mixing GANs without dataset access

Compositional Visual Generation with Composable Diffusion Models


StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pretrained StyleGAN

WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs

Auto-regressive Image Synthesis with Integrated Quantization

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation

DeltaGAN: Towards Diverse Few-shot Image Generation with Sample-Specific Delta

Generator Knows What Discriminator Should Learn in Unconditional GANs

Hierarchical Semantic Regularization of Latent Spaces in StyleGANs

FurryGAN: High Quality Foreground-aware Image Synthesis

Improving GANs for Long-Tailed Data through Group Spectral Regularization

Exploring Gradient-based Multi-directional Controls in GANs

Improved Masked Image Generation with Token-Critic

Weakly-Supervised Stitching Network for Real-World Panoramic Image Generation

Any-Resolution Training for High-Resolution Image Synthesis

BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-Aided Adversarial Learning

Few-Shot Image Generation with Mixup-Based Distance Learning

StyleGAN-Human: A Data-Centric Odyssey of Human Generation

StyleFace: Towards Identity-Disentangled Face Generation on Megapixels

Contrastive Learning for Diverse Disentangled Foreground Generation

BLT: Bidirectional Layout Transformer for Controllable Layout Generation

Entropy-Driven Sampling and Training Scheme for Conditional Diffusion Generation

Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes

DuelGAN: A Duel between Two Discriminators Stabilizes the GAN Training

Video Generation

Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer

Controllable Video Generation through Global and Local Motion Dynamics

Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis

Synthesizing Light Field Video from Monocular Video

StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation

Motion Transformer for Unsupervised Image Animation

Sound-Guided Semantic Video Generation

Layered Controllable Video Generation

Diverse Generation from a Single Video Made Possible

Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation

EAGAN: Efficient Two-Stage Evolutionary Architecture Search for GANs

BlobGAN: Spatially Disentangled Scene Representations


Learning Local Implicit Fourier Representation for Image Warping

Dress Code: High-Resolution Multi-Category Virtual Try-On

High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions

Single Stage Virtual Try-on via Deformable Attention Flows

Outpainting by Queries

Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal

Efficient Meta-Tuning for Content-aware Neural Video Delivery

Human-centric Image Cropping with Partition-aware and Content-preserving Features

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset

Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis

Responsive Listening Head Generation: A Benchmark Dataset and Baseline

Contrastive Monotonic Pixel-Level Modulation

AutoTransition: Learning to Recommend Video Transition Effects

Bringing Rolling Shutter Images Alive with Dual Reversed Distortion

Learning Object Placement via Dual-path Graph Completion

DeepMCBM: A Deep Moving-camera Background Model

Mind the Gap in Distilling StyleGANs

StyleSwap: Style-Based Generator Empowers Robust Face Swapping

Geometric Representation Learning for Document Image Rectification

Studying Bias in GANs through the Lens of Race

On the Robustness of Quality Measures for GANs

TREND: Truncated Generalized Normal Density Estimation of Inception Embeddings for GAN Evaluation


Image Restoration

Unsupervised Underwater Image Restoration: From a Homology Perspective

Panini-Net: GAN Prior based Degradation-Aware Feature Interpolation for Face Restoration

Burst Restoration

Zero-Shot Multi-Frame Image Restoration with Pre-Trained Siamese Transformers

Video Restoration

Transcoded Video Restoration by Temporal Spatial Auxiliary Network

Super Resolution

Image Super Resolution

SCSNet: An Efficient Paradigm for Learning Simultaneously Image Colorization and Super-Resolution

Efficient Non-Local Contrastive Attention for Image Super-Resolution

Best-Buddy GANs for Highly Detailed Image Super-Resolution

Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution

Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation for Reference-Based Super-Resolution

Detail-Preserving Transformer for Light Field Image Super-Resolution

Denoising – denoising

Image Denoising

Generative Adaptive Convolutions for Real-World Noisy Image Denoising

Video Denoising

ReMoNet: Recurrent Multi-Output Network for Efficient Video Denoising

Deblurring – deblurring

Video Deblurring

Deep Recurrent Neural Network with Multi-Scale Bi-Directional Propagation for Video Deblurring

Deraining – deraining

Online-Updated High-Order Collaborative Networks for Single Image Deraining

Close the Loop: A Unified Bottom-up and Top-down Paradigm for Joint Image Deraining and Segmentation

Dehazing – dehazing

Uncertainty-Driven Dehazing Network

Demosaicing – demosaicing

Deep Spatial Adaptive Network for Real Image Demosaicing

HDR Imaging / Multi-Exposure Image Fusion – HDR image generation / multi-exposure image fusion

TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework Using Self-Supervised Multi-Task Learning

Image Enhancement – image enhancement

Low-Light Image Enhancement

Low-Light Image Enhancement with Normalizing Flow

Degrade is Upgrade: Learning Degradation for Low-light Image Enhancement

Semantically Contrastive Learning for Low-Light Image Enhancement

Image Matting – image matting

MODNet: Trimap-Free Portrait Matting in Real Time

Shadow Removal – shadow removal

Efficient Model-Driven Network for Shadow Removal

Image Compression – image compression

Towards End-to-End Image Compression and Analysis with Transformers

OoDHDR-Codec: Out-of-Distribution Generalization for HDR Image Compression

Two-Stage Octave Residual Network for End-to-End Image Compression

Image Quality Assessment – image quality assessment

Content-Variant Reference Image Quality Assessment via Knowledge Distillation

Perceptual Quality Assessment of Omnidirectional Images

Style Transfer – style transfer

Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization

Deep Translation Prior: Test-Time Training for Photorealistic Style Transfer

Image Editing – image editing

Image Generation/Synthesis / Image-to-Image Translation – Image Generation/Synthesis/Translation

SSAT: A Symmetric Semantic-Aware Transformer Network for Makeup Transfer and Removal

Assessing a Single Image in Reference-Guided Image Synthesis

Interactive Image Generation with Natural-Language Feedback

PetsGAN: Rethinking Priors for Single Image Generation

Pose Guided Image Generation from Misaligned Sources via Residual Flow Based Correction

Hierarchical Image Generation via Transformer-Based Sequential Patch Selection

Style-Guided and Disentangled Representation for Robust Image-to-Image Translation

OA-FSUI2IT: A Novel Few-Shot Cross Domain Object Detection Framework with Object-Aware Few-shot Unsupervised Image-to-Image Translation

Video Generation

Learning Temporally and Semantically Consistent Unpaired Video-to-Video Translation through Pseudo-Supervision from Synthetic Optical Flow


What are low-level and high-level tasks – WTHunt's Blog (CSDN Blog)

What is the prospect of low-level vision in the CV field? – Zhihu (zhihu.com)

GitHub – DarrenPan/Awesome-CVPR2023-Low-Level-Vision: A Collection of Papers and Codes in CVPR2023/2022 about low level vision