Advanced Computer Vision

This repository collects the projects and theoretical material I used to learn Computer Vision (CV) topics in a practical, efficient way.

Index:

What are the topics covered?
Courses
Projects
Relevant/Extra repositories

Some of the topics covered:

Image Classification
Object Detection/Localization
Image Segmentation
Transfer Learning
Visualization and Interpretability
Vision Transformers & Vision Language Models (VLMs)
Generative Models & Synthetic Data
3D Computer Vision
Zero-shot Computer Vision

Find my notes: Part I, Part II

Courses:

AI for Medical Diagnosis (DeepLearning.AI)

AI is transforming the practice of medicine. It’s helping doctors diagnose patients more accurately, make predictions about patients’ future health, and recommend better treatments. But how can AI be applied to medical imaging to diagnose diseases?

This course offered:

Nuances of working with both 2D and 3D medical image data, for multi-class classification and image segmentation.
Practical/theoretical material on how to classify diseases in X-ray images and segment tumors in 3D MRI brain images.
How to properly evaluate the performance of your models.
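
As a rough illustration of the evaluation point above, here is a minimal sketch of the metrics commonly reported for a binary diagnostic classifier (accuracy, sensitivity, specificity, and positive predictive value). The function names and the toy labels are hypothetical and are not taken from the course material.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = disease present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def diagnostic_metrics(y_true, y_pred):
    """Return common diagnostic metrics as a dict of floats."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        # Avoid ZeroDivisionError when a class is absent from the data.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value (precision)
    }


# Toy data (hypothetical): 3 diseased and 5 healthy cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced medical datasets, accuracy alone can look good while sensitivity is poor, which is why the per-class rates are usually reported alongside it.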

Course Certificate: Link ; More Info

Advanced Computer Vision with TensorFlow (DeepLearning.AI & TensorFlow)

Explore image classification, image segmentation, object localization, and object detection. Apply transfer learning to object localization and detection.
Apply object detection models such as regional-CNN (R-CNN) and ResNet-50, customize existing models, and build your own models to detect, localize, and label images.
Implement image segmentation using variations of the fully convolutional network (FCN), including U-Net and Mask R-CNN, to identify and detect numbers, pets, zombies, and more.
Identify which parts of an image are being used by your model to make its predictions using class activation maps and saliency maps and apply these ML interpretation methods to inspect and improve the design of a famous network, AlexNet.
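
Object localization and detection models like the ones listed above are typically scored by how well a predicted box overlaps the ground-truth box. A minimal sketch of Intersection over Union (IoU), assuming boxes are given as `(x_min, y_min, x_max, y_max)` tuples (this box format is an assumption, not something the course description specifies):

```python
def iou(box_a, box_b):
    """Intersection over Union for two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max).
    """
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width/height are clamped at zero when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes that share a 1x1 corner region.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
print(overlap)
```

A prediction is often counted as correct when its IoU with the ground truth exceeds a chosen threshold such as 0.5.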

Course Certificate: Link ; More Info

Prompt Vision Models (DeepLearning.AI & Comet)

Image Generation: Generate images from text prompts using Stable Diffusion, adjusting hyperparameters (strength & guidance scale) for precise control over the outputs.
Image Segmentation: Use Meta’s Segment Anything Model (SAM), prompting with coordinates and bounding boxes to accurately identify and separate objects within images.
Object Detection: Use OWL-ViT for zero-shot object detection, prompting with natural language to detect specific objects and generate bounding boxes for precise isolation.
In-painting: Combine image generation, segmentation, and detection techniques to replace or add objects within images seamlessly, ensuring smooth integration with existing content.
Personalization (with fine-tuning): Use DreamBooth to fine-tune diffusion models, associating text labels with specific objects to generate custom images based on provided pictures.
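
To make the guidance scale mentioned under Image Generation more concrete: in classifier-free guidance, the model's unconditional noise prediction is pushed toward the text-conditioned one, and the guidance scale controls how hard. The sketch below shows only that combination step on plain Python lists; the function name and toy values are hypothetical, and a real pipeline applies this to tensors inside the denoising loop.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Combine unconditional and text-conditioned noise predictions.

    guided = uncond + scale * (text - uncond)
    scale = 0 ignores the prompt, scale = 1 uses the text prediction as-is,
    and larger values follow the prompt more strongly.
    """
    return [
        u + guidance_scale * (t - u)
        for u, t in zip(noise_uncond, noise_text)
    ]


# Toy two-element "noise predictions" (hypothetical values).
noise_uncond = [0.0, 1.0]
noise_text = [1.0, 3.0]

for scale in (0.0, 1.0, 2.0):
    print(scale, apply_guidance(noise_uncond, noise_text, scale))
```

Higher scales tend to produce images that match the prompt more literally at the cost of diversity, which is why the value is usually tuned per use case.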

Course Certificate: Link ; More Info

Computer Vision Course (by Hugging Face)

This course delves into the fundamentals of computer vision, covering essential topics such as image processing, convolutional neural networks, and vision transformers.
It explores advanced concepts like multimodal models, vision-language models, and generative models, with a focus on both 2D and 3D computer vision tasks.
It also addresses emerging topics like model optimization, synthetic data, and zero-shot computer vision.
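
As a rough sketch of the idea behind zero-shot classification with vision-language models: an image embedding is compared against the embeddings of candidate text prompts, and the closest prompt wins, with no training on those labels. The embeddings below are made-up three-dimensional vectors for illustration only; in practice they would come from a model's image and text encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_embedding, label_embeddings):
    """Return the best-matching label and the score for every label."""
    scores = {
        label: cosine_similarity(image_embedding, embedding)
        for label, embedding in label_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical embeddings standing in for real encoder outputs.
label_embeddings = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.3, 0.1]

predicted_label, scores = zero_shot_classify(image_embedding, label_embeddings)
print(predicted_label)
```

Adding a new class only requires adding a new text prompt, which is what makes the approach "zero-shot".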

Course: More Info

Disclaimer

Copyright of all materials in these courses belongs to DeepLearning.AI, TensorFlow and Hugging Face, and they can only be used or distributed for educational purposes. You may not use or distribute them for commercial purposes.

Projects:

Here are links to my Computer Vision & Image Processing projects:

Face Img/Video based drowsiness recognition
ML/DL model for detecting drowsiness from facial images/video.
Automated Cell Counting
Automate cell counting in microscopy images.
ML (company) assessment
(Private repo) Automate reading the total value of receipts with OCR: automatically select and extract the region containing the total from all candidate numbers/regions, then return the result in the correct format.
Brain Tumor Diagnosis App
Brain tumor diagnostic app developed with Gradio. ViT fine-tuned for binary classification of brain scans.

Relevant/Extra repositories

To deepen my knowledge of the topics that interest me most, which are more complex and take more study to understand and master, I created additional repositories with notes and passion projects.

Generative_AI
