Official submission to SRM Research Day 2022: Vision Dashboard - A One-Stop CV Learning Tool

Primary language: Python. License: MIT.

VisionDash - A One-Stop CV Learning Tool

Computer Vision (CV) is a growing field that attracts many beginners to Machine Learning. According to research, students map visual information more readily and retain it for longer. However, traditional teaching relies on text-based explanations and audio to convey theoretical concepts. As a result, most students cannot visualize or understand the significant CV techniques and are unsure how to approach CV as a field. In addition, CV models tend to be computationally heavy, expensive, and difficult for a beginner to run, which discourages students from pursuing the field seriously. This paper presents a method of demonstrating CV algorithms using a Vision Dashboard designed with these issues in mind.

Our approach lets students run various CV methods, compiled on a single dashboard, on any image. This helps students efficiently visualize techniques like Object Detection, Instance Segmentation, Semantic Segmentation, Style Transfer, Image Classification, Super Resolution, Denoising, Image Generation using GANs, and Face Detection, making the dashboard an effective teaching tool.

Implementation Details

Our project, VisionDash, is a dashboard that lets users get an overview of various CV tasks, learn about different CV techniques from the resources provided on the dashboard, and supplement their knowledge with SOTA implementations of each algorithm.

The algorithms provided are divided into broad categories: Image Classification, Detection, Segmentation, Denoising, Generative Adversarial Networks, and Style Transfer. These categories are further divided into the specific tasks of Image Classification, Object Detection and Face Detection, Instance Segmentation and Semantic Segmentation, Noise2Noise, Super Resolution GAN, and Fast Style Transfer.
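The two-level grouping described above can be written down as a plain mapping. This is a minimal sketch for illustration: the names `TASKS_BY_CATEGORY`, `tasks_for`, and `all_tasks` are hypothetical and are not taken from the repository.

```python
# Hypothetical sketch: category -> specific tasks, as listed in the text above.
# The repository may organize this differently.
TASKS_BY_CATEGORY = {
    "Image Classification": ["Image Classification"],
    "Detection": ["Object Detection", "Face Detection"],
    "Segmentation": ["Instance Segmentation", "Semantic Segmentation"],
    "Denoising": ["Noise2Noise"],
    "Generative Adversarial Networks": ["Super Resolution GAN"],
    "Style Transfer": ["Fast Style Transfer"],
}


def tasks_for(category):
    """Return the tasks in a category, or an empty list for an unknown one."""
    return list(TASKS_BY_CATEGORY.get(category, []))


def all_tasks():
    """Flatten the mapping into a single list, e.g. for a menu."""
    return [task for tasks in TASKS_BY_CATEGORY.values() for task in tasks]
```

A menu widget could be populated from `all_tasks()`, while `tasks_for` supports a category-first navigation.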

| Task | Model | Source |
|---|---|---|
| Image Classification | EfficientNet-B7 | Torchvision |
| Face Detection | MTCNN | Facenet-Pytorch |
| Object Detection | RetinaNet | Torchvision |
| Super Resolution | SRGAN | Open source |
| Denoising | Noise2Noise | Open source |
| Semantic Segmentation | DeepLabV3 | Torchvision |
| Instance Segmentation | Mask R-CNN | Torchvision |
| Style Transfer | Fast Style Transfer | Open source |
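For the four models sourced from Torchvision, loading pretrained weights could look roughly like the sketch below. It is not the repository's code. The backbone variants (`resnet50`, `resnet101`) are assumptions, since the table names only the model family, and `TORCHVISION_BUILDERS` and `load_model` are hypothetical names. The string form `weights="DEFAULT"` assumes torchvision 0.13 or newer.

```python
import importlib

# Task -> (module, builder function). Backbone choices are assumptions;
# the table above only names the model family.
TORCHVISION_BUILDERS = {
    "Image Classification": ("torchvision.models", "efficientnet_b7"),
    "Object Detection": ("torchvision.models.detection", "retinanet_resnet50_fpn"),
    "Semantic Segmentation": ("torchvision.models.segmentation", "deeplabv3_resnet101"),
    "Instance Segmentation": ("torchvision.models.detection", "maskrcnn_resnet50_fpn"),
}


def load_model(task, pretrained=True):
    """Build the torchvision model for `task` and switch it to eval mode.

    Raises KeyError for tasks that are not served by torchvision.
    Pretrained weights are downloaded on first use.
    """
    module_name, builder_name = TORCHVISION_BUILDERS[task]
    module = importlib.import_module(module_name)  # requires torchvision
    builder = getattr(module, builder_name)
    weights = "DEFAULT" if pretrained else None  # torchvision >= 0.13
    return builder(weights=weights).eval()
```

Importing torchvision lazily inside `load_model` keeps the registry itself cheap to import; the heavy dependency is only touched when a model is actually requested.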

These models are integrated into a single dashboard built with Streamlit, which turns data scripts into shareable web apps. Different widgets make the user interface as interactive and visually engaging as possible. As part of VisionDash, we provide resources that serve as a self-study tool for each CV technique implemented. The FAQs provided for each section further supplement this knowledge.
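To give a feel for how a Streamlit script combines widgets into such a dashboard, here is a minimal sketch. It is an illustration only and not the repository's `main.py`: `TASKS`, `run_task`, and `main` are hypothetical names, and `run_task` is a placeholder that stands in for real model inference.

```python
# Hypothetical single-page dashboard sketch; not the repository's main.py.
TASKS = [
    "Image Classification",
    "Face Detection",
    "Object Detection",
    "Super Resolution",
    "Denoising",
    "Semantic Segmentation",
    "Instance Segmentation",
    "Style Transfer",
]


def run_task(task, image_bytes):
    """Placeholder dispatcher: a real app would run the model for `task`."""
    return f"{task}: received {len(image_bytes)} bytes"


def main():
    # Imported lazily so the helpers above can be used without Streamlit.
    import streamlit as st

    st.title("VisionDash")
    task = st.sidebar.selectbox("Choose a CV task", TASKS)
    uploaded = st.file_uploader("Upload an image", type=["png", "jpg", "jpeg"])
    if uploaded is not None:
        st.image(uploaded, caption="Input image")
        st.write(run_task(task, uploaded.getvalue()))
    with st.expander("FAQs"):
        st.write("Answers for the selected task would go here.")
```

Saved as a script with a final `main()` call, this could be launched with `streamlit run`. Streamlit reruns the script from top to bottom whenever a widget value changes, which is why the selected task and the uploaded file can be read like ordinary variables.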

Usage and Results

This app is deployed on Streamlit. Check out the demo at https://share.streamlit.io/sashrika15/visiondash/main/main.py

Local Installation

The application may be run locally on any compatible platform.

  • Cloning the Repository:

      git clone https://github.com/srijarkoroy/VisionDash.git
    
  • Entering the directory:

      cd VisionDash
    
  • Setting up the Python Environment with Dependencies:

      python -m venv env
      source env/bin/activate
      pip install -r requirements.txt
    
  • Running the Application:

      streamlit run main.py
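
If the application fails to start after the steps above, a quick check that the key packages are importable can narrow down the problem. The sketch below is an assumption-laden helper, not part of the repository: `missing_packages` is a hypothetical name, and the default list only covers libraries named in this README (the import name for Facenet-Pytorch is `facenet_pytorch`).

```python
import importlib.util

# Import names for the libraries mentioned in this README (assumed list;
# requirements.txt in the repository is the authoritative source).
DEFAULT_PACKAGES = ["streamlit", "torch", "torchvision", "facenet_pytorch"]


def missing_packages(names=DEFAULT_PACKAGES):
    """Return the top-level packages from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Running this inside the activated environment reports anything that `pip install -r requirements.txt` did not provide.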
    
| Dashboard Component | Image |
|---|---|
| Home Page | [Image: Home] |
| Resources | [Image: Resources] |
| VisionDash in Action | [Image: Visual] |
| FAQs Section | [Image: FAQs] |
