Papers

Group Normalization -Kaiming He, et al, arxiv2018
Graph Convolutional Network -Xiaolong Wang, Yufei Ye, Abhinav Gupta, CVPR2018
DetNAS: Backbone Search for Object Detection
Mixup

light network

UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning -ICLR2022, code
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications -arxiv2022, code
EfficientFormer: Vision Transformers at MobileNet Speed -snap, arxiv2022, code
UNeXt: MLP-based Rapid Medical Image Segmentation Network -arxiv2022, code
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation -tencent, CVPR2022, code
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer -apple, ICLR2022, code
TinyNet: Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets -huawei, NeurIPS2020
GhostNet: More Features from Cheap Operations -huawei, CVPR2020
EfficientNet
SqueezeNet
MobileNets -google, arxiv2017
MobileNet-V2 -google, CVPR2018, caffe-code
MobileNetV3
NASNet-A: Learning Transferable Architectures for Scalable Image Recognition -google brain, CoRR2017
ShuffleNet -megvii, CoRR2017
ShuffleNetV2
ThunderNet
DarkNet/Tiny YOLOv3/Tiny YOLOv2/Yolo-Nano/SlimYOLO/YOLO-LITE/Gaussian YOLOv3
LightweightNet: Toward fast and lightweight convolutional neural networks via architecture distillation -XuTingbin, PR2019
MobileFaceNets
EXTD: Extremely Tiny Face Detector via Iterative Filter Reuse
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution
HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs
Joint Architecture and Knowledge Distillation in Convolutional Neural Network for Offline Handwritten Chinese Text Recognition -dujun, arxiv2019
Compressing CNN-DBLSTM models for OCR with teacher-student learning and Tucker decomposition -huoqiang, PR2019
VoVNet: An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection -CVPRW2019, http://openaccess.thecvf.com/content_CVPRW_2019/papers/CEFRL/Lee_An_Energy_and_GPU-Computation_Efficient_Backbone_Network_for_Real-Time_Object_CVPRW_2019_paper.pdf

network

Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios -bytedance, arxiv2022
TRT-ViT: TensorRT-oriented Vision Transformer -bytedance, arxiv2022

model compression

Distillation: teacher-student/mutual-learning/self-distillation
Tensor decomposition: low-rank/SVD-decomposition/Tucker-decomposition/CP-decomposition
Pruning
Quantization
Coding

InformationExtraction

knowledge distillation

Decoupled Knowledge Distillation -megvii, CVPR2022, code
Efficient knowledge distillation for rnn-transducer models -google/facebook, ICASSP2021
Investigation of Sequence-level Knowledge Distillation Methods for CTC Acoustic Models -NICT japan, ICASSP2019
Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation -IBM, Interspeech2019
Explaining sequence-level knowledge distillation as data-augmentation for neural machine translation -arxiv2019
Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion -microsoft, Interspeech2019
Knowledge Distillation for Sequence Model -AISpeech, Interspeech2018
Improved knowledge distillation from bi-directional to uni-directional LSTM CTC for end-to-end speech recognition -IBM, SLT2018
An Investigation of a Knowledge Distillation Method for CTC Acoustic Models -NICT japan, ICASSP2018
Sequence-Level Knowledge Distillation -Yoon Kim, EMNLP2016

Document Rectification

Fourier Document Restoration for Robust Document Dewarping and Recognition -bai song, CVPR2022, database
Document Dewarping with Control Points -ICDAR2021, code&dataset
Document Rectification and Illumination Correction using a Patch-based CNN -SIGGRAPH2019, code

Graph

Joint stroke classification and text line grouping in online handwritten documents with edge pooling attention networks -PR2021
A Comprehensive Survey on Graph Neural Networks -TNNLS2020
Contextual Stroke Classification in Online Handwritten Documents with Edge Graph Attention Networks -SNCS2020
DeepGCNs: Can GCNs Go as Deep as CNNs? -ICCV2019
Heterogeneous graph attention network -WWW2019
Contextual Stroke Classification in Online Handwritten Documents with Graph Attention Networks -ICDAR2019
Graph Convolutional Networks for Text Classification -AAAI2019
Graph Attention Networks -ICLR2018
Semi-Supervised Classification with Graph Convolutional Networks -ICLR2017

super resolution

Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data -tencent, ICCV2021, code
Edge-oriented Convolution Block for Real-time Super Resolution on Mobile Devices -alibaba, ACMMM2021, code
SplitSR: An End-to-End Approach to Super-Resolution on Mobile Devices -arxiv2021, code
Extremely Lightweight Quantization Robust Real-Time Single-Image Super Resolution for Mobile Devices -CVPR2021, code

deblur

Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing -TCL, ECCV2022, database/code
Global-Local Stepwise Generative Network for Ultra High-Resolution Image Restoration -arxiv2022
A Survey on Deep learning based Document Image Enhancement -arxiv2021
NTIRE 2021 challenge for defocus deblurring using dual-pixel images: Methods and results -CVPR2021, code
Multi-Stage Progressive Image Restoration -google, CVPR2021, code
Learning frequency domain priors for image demoireing -PAMI2021, code
Moiré Attack (MA): A New Potential Risk of Screen Photos -NeurIPS2021, code
Image demoireing with learnable bandpass filters -CVPR2020, code
WDNet: Watermark-Decomposition Network for Visible Watermark Removal -baixiang, WACV2021, database/code
High Resolution Demoire Network -ICIP2020, code
BEDSR-Net: A Deep Shadow Removal Network From a Single Document Image -CVPR2020, code

Rechargeablezz/papers

layout

asr

Contextual Biasing

table detection & recognition

mathematical expression recognition

word_vector

Seq2Seq

ReID

PoseEstimation

EdgeDetection

line segmentation

video_classification

dnn_base