/AI-Papers

AI paper reviews in Korean

Primary Language: Python

Vision
To learn image super-resolution, use a GAN to learn how to do image degradation first
Feature Perceptual Loss for Variational Autoencoder - Autoencoder
- Loss function
Context Encoder Context Encoders: Feature Learning by Inpainting - Self-supervised vision representation learning
- Image inpainting
Fixing the train-test resolution discrepancy
GANs Generative Adversarial Nets - GANs
ImageGPT Generative Pretraining from Pixels - Self-supervised vision representation learning
Deformable ConvNets v2: More Deformable, Better Results - CNN
Deformable Convolutional Networks - CNN
2023 ControlNet Adding Conditional Control to Text-to-Image Diffusion Models - Transformer
- Diffusion
BEiT BEiT: BERT Pre-Training of Image Transformers - Self-supervised vision representation learning
Diffusion Illusion Diffusion Illusions: Hiding Images in Plain Sight - Diffusion
- Illusion
LVDM Latent Video Diffusion Models for High-Fidelity Long Video Generation - VSR
- Diffusion
Understanding Deformable Alignment in Video Super-Resolution - VSR
- Deformable convolution
Towards Accurate Generative Models of Video: A New Metric & Challenges - Metric
Vision Image Generation
VQ-VAE-2 Generating Diverse High-Fidelity Images with VQ-VAE-2 - Image generation
- GANs
VQGAN Taming Transformers for High-Resolution Image Synthesis - Image generation
- GANs
CDM Cascaded Diffusion Models for High Fidelity Image Generation - Image generation
Consistency Models - Image generation
DiffiT DiffiT: Diffusion Vision Transformers for Image Generation - Image generation
- Transformer
Emu Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack - Image generation
Vision SR
Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach - ISR
Video LDM Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models - Video generation
SwinIR: Image Restoration Using Swin Transformer - ISR
- Transformer
Blind Super-Resolution Kernel Estimation using an Internal-GAN - ISR
- GANs
BasicVSR BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond - VSR
BasicVSR++ BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment - VSR
SR3 Image Super-Resolution via Iterative Refinement - ISR
- Diffusion
SR3+ Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild - ISR
- Diffusion
Designing a Practical Degradation Model for Deep Blind Image Super-Resolution - BISR
DiffBIR DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior - BISR
MoESR Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach - ISR
- Diffusion
LIIF Learning Continuous Image Representation with Local Implicit Image Function - Continuous super-resolution
Implicit Diffusion Models for Continuous Super-Resolution - Continuous super-resolution
Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder - Continuous super-resolution
Vision-Language
2022 Flamingo Flamingo: a Visual Language Model for Few-Shot Learning - Transformer
VideoGPT VideoGPT: Video Generation using VQ-VAE and Transformers
Video Diffusion Models
Vision-Language Text-to-Image Generation
DALL-E 3 Improving Image Generation with Better Captions - Text-to-image generation
FIFO FIFO-Diffusion: Generating Infinite Videos from Text without Training - Text-to-image generation