Convolutional Neural Networks (CNNs): |
|
LeNet |
✅ |
AlexNet |
✅ |
VGG (VGG16, VGG19) |
|
GoogLeNet (Inception) |
|
ResNet (ResNet50, ResNet101) |
|
Xception |
|
DenseNet |
|
EfficientNet |
|
MobileNet |
|
ShuffleNet |
|
SENet (Squeeze-and-Excitation Network) |
|
SqueezeNet |
|
RegNet |
|
ConvNeXt |
|
MixNet |
|
|
|
Object Detection: |
|
YOLO (You Only Look Once) |
|
SSD (Single Shot MultiBox Detector) |
|
R-CNN (Regions with CNN features) |
|
Faster R-CNN |
|
Mask R-CNN |
|
RetinaNet |
|
CenterNet |
|
EfficientDet |
|
YOLOv4, YOLOv5 |
|
DETR (Detection Transformer) |
|
FCOS (Fully Convolutional One-Stage Object Detection) |
|
|
|
Pose Estimation: |
|
2D Pose Estimation (Heatmap-based methods) |
|
Part-based models |
|
Pictorial structures models |
|
3D Pose Estimation (PoseNet) |
|
TPOT (Tree-based Pipeline Optimization Tool, an AutoML library) |
|
OpenPose |
|
AlphaPose |
|
DensePose |
|
HRNet (High-Resolution Network for Pose Estimation) |
|
|
|
Image Segmentation: |
|
U-Net |
|
DeepLab |
|
DeepLabV3+ |
|
Fully Convolutional Networks (FCNs) |
|
Mask R-CNN |
|
PSPNet (Pyramid Scene Parsing Network) |
|
SegNet |
|
BiSeNet (Bilateral Segmentation Network) |
|
HRNet for Segmentation |
|
|
|
Generative Algorithms and Network Architectures: |
|
Variational Autoencoders (VAEs) |
|
Pixel Recurrent Neural Networks (PixelRNN) |
|
PixelCNN |
|
BigGAN |
|
StyleGAN2 |
|
CycleGAN |
|
SRGAN (Super-Resolution GAN) |
|
|
|
Generative Adversarial Networks (GANs): |
|
Conditional GANs (CGANs) |
|
Wasserstein GANs (WGANs) |
|
StyleGAN |
|
ProGAN (Progressive Growing of GANs) |
|
StarGAN (Multi-Domain Image-to-Image Translation) |
|
Pix2Pix |
|
|
|
Image Classification: |
|
Capsule Networks |
|
ResNeXt |
|
NASNet (Neural Architecture Search Network) |
|
Vision Transformers (ViT) |
|
Swin Transformer |
|
MLP-Mixer |
|
ConvMixer |
|
Hybrid Vision Transformers (Swin + CNNs) |
|
|
|
Image Denoising: |
|
Non-Local Means Denoising |
|
Denoising Autoencoders (DAE) |
|
BM3D (Block-Matching and 3D Filtering) |
|
|
|
Image Super-Resolution: |
|
Generative Adversarial Networks (GANs) |
|
SRResNet (Super-Resolution ResNet) |
|
Real-ESRGAN (Enhanced Super-Resolution GAN for real-world images) |
|
|
|
Deep Reinforcement Learning for Vision Tasks: |
|
Deep Q-Networks (DQNs) |
|
|
|
Optical Flow Estimation: |
|
Farnebäck algorithm |
|
PWC-Net (Pyramid, Warping, and Cost volume Network) |
|
LiteFlowNet |
|
RAFT (Recurrent All-Pairs Field Transforms) |
|
|
|
Visual Object Tracking (VOT): |
|
Kernelized Correlation Filters (KCF) |
|
GOTURN (Generic Object Tracking Using Regression Networks) |
|
SiamRPN (Siamese Region Proposal Network) |
|
ByteTrack |
|
|
|
Action Recognition: |
|
Two-Stream Networks |
|
3D Convolutional Neural Networks (3D CNNs) |
|
I3D (Inflated 3D ConvNet) |
|
SlowFast Networks |
|
TSM (Temporal Shift Module) |
|
|
|
3D Vision: |
|
Structure from Motion (SfM) |
|
Multi-View Stereo (MVS) |
|
NeRF (Neural Radiance Fields) |
|
PointNet |
|
MeshCNN |
|
Volumetric CNNs |
|
|
|
Other Frameworks and Tools: |
|
Darknet |
|
VGGFace2 |
|
OpenVINO |
|
Detectron2 |
|
MMDetection |
|
MMDetection3D |
|
OpenPose |
|