Course materials for COMP 4423 - Computer Vision for Beginners at the Hong Kong Polytechnic University

Computer Vision for Beginners - COMP4423 @ PolyU HK

The Lectures and Tutorials


L1 Introduction to Computer Vision

  • What is Computer Vision?
  • Applications (object detection, semantic segmentation, style transfer, etc.)
  • A brief history of Computer Vision
  • Play with FPV Recognition

Lecture Slides: L1-Introduction.pdf

Video Link: https://youtu.be/sWwWroRpqkM?si=V3FSwlet643YTDSU

Tutorial Environment Setup: T1-Get Environment Ready


L2 Image Processing I: Let's play with the images

  • How humans and computers see images
  • Display the images
  • Play with the images (colors, sizes, rotations)
  • Examples from IMHere
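
As a rough illustration of the color and rotation manipulations listed above, here is a minimal sketch that assumes an image stored as nested Python lists of (R, G, B) tuples; the tutorial notebooks may well use NumPy or OpenCV arrays instead. The helper names `to_grayscale` and `rotate90_clockwise` are hypothetical and not taken from the course materials.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of gray values using BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def rotate90_clockwise(image):
    """Rotate a 2D image (list of rows) by 90 degrees clockwise."""
    # Reversing the row order and then transposing is a clockwise quarter turn.
    return [list(row) for row in zip(*image[::-1])]


# A tiny 2x2 illustrative image: white, red / black, white.
rgb = [
    [(255, 255, 255), (255, 0, 0)],
    [(0, 0, 0), (255, 255, 255)],
]
gray = to_grayscale(rgb)
rotated = rotate90_clockwise([[1, 2], [3, 4]])
```

Treating the image as a plain grid of numbers makes it easier to see that a rotation only rearranges pixels, while a color conversion changes each pixel value independently.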

Lecture Slides: L2-Image.Processing.I.pdf

Video Link: https://youtu.be/scrAoh-L7KU?si=w2AmQ0Pl4AAgBoJd

Tutorial Tasks (Google Colab): T2-Play.with.images-tasks.ipynb

Tutorial Answers (Google Colab): T2-Play.with.images-answers.ipynb

Image Lenna: T2-lenna.png


L3 Image Processing II: Let's play with the content

  • Filters and convolutions
  • Edge Filters
  • Noise Reduction
  • Morphological Operations
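
The filtering topics above can be sketched with a small, dependency-free 2D convolution. This is an illustrative sketch only: the function name `convolve2d`, the "valid" border handling, and the toy image are assumptions, and the tutorial notebooks may rely on a library routine instead.

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution of a grayscale image (list of rows) with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    # True convolution flips the kernel on both axes; without the flip
    # the same loop would compute cross-correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for y in range(out_h):
        out_row = []
        for x in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[y + i][x + j] * flipped[i][j]
            out_row.append(acc)
        output.append(out_row)
    return output


# Sobel kernel that responds to horizontal intensity changes (vertical edges).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

# Toy image: dark left half, bright right half.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]
edges = convolve2d(image, SOBEL_X)
```

The response is zero in the flat regions and nonzero only where the window straddles the dark-to-bright boundary. Swapping `SOBEL_X` for a kernel of equal weights that sum to one would turn the same loop into a simple smoothing (noise reduction) filter.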

Lecture Slides: L3-Image.Processing.II.pdf

Video Link: https://youtu.be/UVGG4ZFQWrw?si=DkQj4y8ppGYacYxO

Tutorial Tasks (Google Colab): T3-Play.with.content-tasks.ipynb

Tutorial Answers (Google Colab): T3-Play.with.content-answers.ipynb

Challenge Tasks (Google Colab): T3-Play.with.content-challenge.ipynb

Virus Image: T1-coronvirus-mask.png

Image Lenna: T2-lenna.png


L4 Feature Extraction

  • Feature vectors
  • Feature Space
  • Quantization
  • Metrics (Distance and Similarity)
  • Global and Local Features (Color Histograms, LBP, SIFT)
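
To make the ideas of feature vectors, quantization, and metrics concrete, the sketch below builds a coarse intensity histogram (a simple global feature) and compares two histograms with a distance and a similarity. It is a minimal sketch under assumptions: the function names, the choice of 4 bins, and the pixel values are all illustrative and not taken from the course materials.

```python
import math


def gray_histogram(pixels, bins=4, levels=256):
    """Quantize intensities in [0, levels) into equal-width bins; return a normalized histogram."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // levels] += 1  # integer quantization to a bin index
    total = len(pixels)
    return [count / total for count in hist]


def euclidean_distance(u, v):
    """Distance metric: smaller means more alike."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Similarity metric: closer to 1 means more alike."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Hypothetical flattened grayscale images: one mostly dark, one mostly bright.
dark = [10, 20, 30, 40, 200, 50, 15, 60]
bright = [250, 240, 200, 220, 30, 210, 255, 230]

h_dark = gray_histogram(dark)
h_bright = gray_histogram(bright)
distance = euclidean_distance(h_dark, h_bright)
similarity = cosine_similarity(h_dark, h_bright)
```

Each histogram is a point in a 4-dimensional feature space. Using more bins preserves more detail but makes the feature more sensitive to small intensity shifts.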

Lecture Slides: L4-Feature.Extraction.pdf

Video Link: https://youtu.be/7UUWyQiCtfU?si=mbCBjrJLwoi6kXhO

Demo: Keypoint extraction and tracking

Demo 2: Keypoint extraction and tracking

Tutorial Tasks (Google Colab): T4-Feature_extraction_task

Tutorial Answers (Google Colab): T4-Feature_extraction_answers


L5 Image Retrieval Fundamentals

  • Clustering
  • K-Means
  • Content-based image retrieval (CBIR)
  • Bag of Visual Words (BoVW)
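
The clustering step listed above can be sketched as a plain implementation of Lloyd's k-means algorithm. This is a minimal sketch, assuming 2D points and caller-supplied initial centroids; the function name `kmeans` and the sample data are hypothetical. In a Bag of Visual Words pipeline the points would typically be local descriptors and the resulting centroids would serve as the visual vocabulary.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2D points, starting from the given initial centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance is enough for ranking).
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return centroids, labels


# Two well-separated illustrative groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
```

The result depends on the initial centroids, which is why practical implementations usually try several initializations.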

Lecture Slides: L5-Image.Retrieval.pdf

Video Link: https://youtu.be/VtCf9HCqAEw?si=a-7A9YHesKOWu49g

Tutorial Tasks (Google Colab): T5-Image.retrieval-tasks.ipynb

Sample Code for tone modifier challenge:


L6 Image Classification Fundamentals

  • Classification
  • Supervised learning
  • K-nearest neighbors (k-NN)
  • Bayesian classifiers
  • Support vector machines (SVM)
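
Of the classifiers listed above, k-NN is the simplest to write out in full. The sketch below is illustrative only: the function name `knn_predict`, the 2D feature vectors, and the labels `"a"` and `"b"` are assumptions, and a real image classifier would use features extracted from images rather than hand-picked points.

```python
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training samples."""
    # Squared Euclidean distance preserves the neighbor ranking.
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(feature, query)), label)
        for feature, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled training set (supervised learning needs labels).
train_features = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(train_features, train_labels, (2.0, 2.0))
prediction_near_b = knn_predict(train_features, train_labels, (8.0, 9.0))
```

k-NN has no training phase: all the work happens at query time, so prediction cost grows with the size of the training set.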

Lecture Slides: L6-Image.Classification.pdf

Video Link: https://youtu.be/bUwGY5sqZHU?si=GSxOPDWWQaSr0dw9

Paper Rock Scissors Game Demo: https://youtu.be/dGwou6Khvqo?si=zoMzRBObLU9FUXZr

Tutorial Tasks: T6-Image-Classification

Challenges: T6-Challenges


L7 Traditional Machine Learning to Deep Learning

  • Traditional machine learning vs. deep learning
  • Gradient descent
  • Neural networks
  • Deep neural networks
  • Convolutional neural networks (CNN)
  • Layers, pooling, and activations
  • AlexNet, VGG, and ResNet
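
Gradient descent, the optimization idea behind the networks listed above, can be sketched on a one-dimensional toy problem. This is a minimal sketch: the function name `gradient_descent`, the learning rate, and the quadratic loss f(w) = (w - 3)^2 are illustrative assumptions, whereas real networks apply the same update to many parameters with gradients computed by backpropagation.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient: w <- w - learning_rate * grad(w)."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w


# Toy loss f(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

With this loss and learning rate, each step shrinks the distance to the minimum by a constant factor; a learning rate that is too large would make the iterates overshoot and diverge.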

Lecture Slides: L7-Machine.learning.Deep.learning.pdf

Video Link: https://youtu.be/xc5MKb8LNBo?si=MlCAFszzgy001A3e

Tutorial Tasks (Google Colab): T7-Machine.learning.Deep.learning-tasks.ipynb

Tutorial Data: T7-data.zip


L8 Deep Image Retrieval

  • Deep image retrieval
  • Feature aggregation/embedding/fusion
  • Fine tuning (Siamese/Triplet networks)
  • R-MAC, VLAD, BoVW
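
The objective commonly used to fine-tune triplet networks can be sketched as a hinge on embedding distances. This is a minimal sketch, assuming plain Euclidean distances and a margin of 1.0; some formulations use squared distances instead, and the function names and embedding values here are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is farther from the anchor than the positive by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2D embeddings.
anchor = (0.0, 0.0)
similar = (0.0, 1.0)     # distance 1 from the anchor
dissimilar = (3.0, 4.0)  # distance 5 from the anchor

satisfied = triplet_loss(anchor, similar, dissimilar)  # triplet already well separated
violated = triplet_loss(anchor, dissimilar, similar)   # roles swapped: large loss
```

Minimizing this loss pulls matching images together and pushes non-matching images apart in the embedding space, which is the property a retrieval system needs.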

Lecture Slides: L8-Deep.image.retrieval.pdf

Video Link: https://youtu.be/klu6SHHoC2E?si=5vCc6-mbt-VzCOlN

Tutorial Answers (Google Colab): T8-Deep.image.retrieval-answers.ipynb

Tutorial Data: T8-data.zip

PyTorch - Quick Start: T8-Pytorch-Quick-Start.ipynb


L9 CAM, Attention, and Transformers

  • Class Activation Mapping (CAM)
  • Attention
  • Self-attention and Transformers
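
Scaled dot-product self-attention, the core operation of a Transformer, can be sketched without any libraries. This is a simplified sketch: a real Transformer derives queries, keys, and values from learned linear projections of the input, while this example assumes identity projections and passes the same token vectors as all three. The function names and token values are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Return (outputs, weights) for scaled dot-product attention."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(wi * v[j] for wi, v in zip(w, values)) for j in range(len(values[0]))]
        )
    return outputs, weights


# Three hypothetical 2D token embeddings; identity projections are assumed.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to one and shows how much one token attends to every other token, so every output vector is a mixture of the whole sequence.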

Lecture Slides: L9-CAM.Attention.Transformer.pdf

Video Link: https://youtu.be/Ypi4F7nt2u4?si=9FDTkpZw3UIjwdvz

Tutorial Answers: T9-CAM and ViT


L10 Detection & Segmentation

  • Object detection and image segmentation
  • YOLO
  • U-Net
  • R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN

Lecture Slides: L10-Detection.Segmentation.pdf

Video Link: https://youtu.be/gdDDQtcttZA?si=LgCJqo5hs1vuT7Bg

Tutorial Answers (Google Colab): T10-Detection.Segmentation-answers.ipynb

Tutorial Data: T10-Images


L11 Learning Paradigms

  • Multi-task learning
  • N-shot learning (Few-shot, Zero-shot)
  • Transfer learning, Metric learning, Meta-learning
  • Generative networks (VAE, GAN)
  • Reinforcement learning

Lecture Slides: L11-Learning.Paradigms.pdf

Video Link: https://youtu.be/_jyfvaiB4g4

Tutorial RNN: T11-RNN.ipynb

Tutorial Slides: T11-RNN-and-Network-Debug


L12 Large Models

  • RNN and Image Captioning
  • Transformers
  • Large Language Models

Lecture Slides: L12-Large.Models.pdf

Appendix: Image-Synthesis