computerVision

CV Models/Algorithms Implementations

Primary Language: Jupyter Notebook

Image Processing Library

Functionality Done
Image I/O:
Read image from file
Write image to file
Image Data Operations:
Copy image data
Binarize image data using a specified threshold
Iteratively binarize image data
Increase brightness of an image
Decrease brightness of an image
Histogram Operations:
Compute image histogram
Compute and save image histogram to a file
Equalize image histogram
Affine Transformations:
Image Rotation
Image Grey-Level Inversion
Image Scaling
Image Reflection
Image Shearing
Filters:
Image Mean-Blur Filter
Image Sepia Filter
Image Line Detection
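
To make a few of the operations listed above concrete, here is a minimal sketch of thresholding, brightness adjustment, and histogram equalization. It is not taken from this repository: the function names are hypothetical, and it assumes an 8-bit greyscale image stored as a list of rows of integers in 0..255 so that it runs without any dependencies. The iterative binarization is assumed to be the common "average of the two class means" scheme; the library may use a different rule.

```python
def binarize(image, threshold):
    """Map pixels above `threshold` to 255 and all others to 0."""
    return [[255 if p > threshold else 0 for p in row] for row in image]


def iterative_threshold(image, tol=0.5, max_iter=100):
    """Estimate a threshold by repeatedly averaging the two class means."""
    pixels = [p for row in image for p in row]
    t = sum(pixels) / len(pixels)  # start from the global mean
    for _ in range(max_iter):
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            break  # flat image: nothing to separate
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < tol:
            return new_t
        t = new_t
    return t


def adjust_brightness(image, delta):
    """Add `delta` to every pixel (negative to darken), clamped to 0..255."""
    return [[min(255, max(0, p + delta)) for p in row] for row in image]


def histogram(image):
    """Count how many pixels fall on each of the 256 grey levels."""
    counts = [0] * 256
    for row in image:
        for p in row:
            counts[p] += 1
    return counts


def equalize(image):
    """Spread grey levels using the cumulative histogram as a lookup table."""
    counts = histogram(image)
    total = sum(counts)
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in image]  # only one grey level present
    lut = [max(0, round((c - cdf_min) * 255 / (total - cdf_min))) for c in cdf]
    return [[lut[p] for p in row] for row in image]


image = [[10, 20], [200, 250]]
t = iterative_threshold(image)          # 120.0 for this image
binary = binarize(image, t)             # [[0, 0], [255, 255]]
brighter = adjust_brightness(image, 10) # [[20, 30], [210, 255]]
equalized = equalize(image)             # [[0, 85], [170, 255]]
```

Plain lists keep the sketch self-contained; with real images a NumPy array would normally replace the nested loops.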

Models Implementations

Model Name Done
Convolutional Neural Networks (CNNs):
LeNet
AlexNet
VGG (VGG16, VGG19)
GoogLeNet (Inception)
ResNet (ResNet50, ResNet101)
Xception
DenseNet
EfficientNet
MobileNet
ShuffleNet
SENet (Squeeze and Excitation Network)
SqueezeNet
RegNet
ConvNeXt
MixNet
Object Detection:
YOLO (You Only Look Once)
SSD (Single Shot MultiBox Detector)
R-CNN (Regions with CNN features)
Faster R-CNN
Mask R-CNN
RetinaNet
CenterNet
EfficientDet
YOLOv4, YOLOv5
DETR (Detection Transformer)
FCOS (Fully Convolutional One-Stage Object Detection)
Pose Estimation:
2D Pose Estimation (Heatmap-based methods)
Part-based models
Pictorial structures models
3D Pose Estimation (PoseNet)
TPOT (Tree-based Pipeline Optimization Tool)
OpenPose
AlphaPose
DensePose
HRNet (High-Resolution Network for Pose Estimation)
Image Segmentation:
U-Net
DeepLab
DeepLabV3+
Fully Convolutional Networks (FCNs)
Mask R-CNN
PSPNet (Pyramid Scene Parsing Network)
SegNet
BiSeNet (Bilateral Segmentation Network)
HRNet for Segmentation
Generative Algorithms and Network Architectures:
Variational Autoencoders (VAEs)
Pixel Recurrent Neural Networks (PixelRNN)
PixelCNN
BigGAN
StyleGAN2
CycleGAN
SRGAN (Super-Resolution GAN)
Generative Adversarial Networks (GANs):
Conditional GANs (CGANs)
Wasserstein GANs (WGANs)
StyleGAN
ProGAN (Progressive Growing of GANs)
StarGAN (Multi-Domain Image-to-Image Translation)
Pix2Pix
Image Classification:
Capsule Networks
ResNeXt
Neural Architecture Search (NASNet)
Vision Transformers (ViT)
Swin Transformer
MLP-Mixer
ConvMixer
Hybrid Vision Transformers (Swin + CNNs)
Image Denoising:
Non-Local Means Denoising
Denoising Autoencoders (DAE)
BM3D (Block-Matching and 3D Filtering)
Image Super-Resolution:
Generative Adversarial Networks (GANs)
SRResNet (Super-Resolution ResNet)
Real-ESRGAN (Enhanced Super-Resolution GAN)
Deep Reinforcement Learning for Vision Tasks:
Deep Q-Networks (DQNs)
Optical Flow Estimation:
Farneback algorithm
PWC-Net (Pyramid, Warping, and Cost volume Network)
LiteFlowNet
RAFT (Recurrent All-Pairs Field Transforms)
Visual Object Tracking (VOT):
Kernelized Correlation Filters (KCF)
GOTURN (Generic Object Tracking Using Regression Networks)
SiamRPN (Siamese Region Proposal Network)
ByteTrack
Action Recognition:
Two-Stream Networks
3D Convolutional Neural Networks (3D CNNs)
I3D (Inflated 3D ConvNet)
SlowFast Networks
TSM (Temporal Shift Module)
3D Vision:
Structure from Motion (SfM)
Multi-View Stereo (MVS)
NeRF (Neural Radiance Fields)
PointNet
MeshCNN
Volumetric CNNs
Other Frameworks and Tools:
Darknet
VGGFace2
OpenVINO
Detectron2
MMDetection
MMDetection3D
OpenPose