This repository contains various neural network architectures built using PyTorch to predict slum regions from satellite images captured via Google Earth Pro. The annotation process was conducted using the Make Sense.AI tool, with the annotated data saved in COCO JSON format. The training masks were generated from the annotation files using the `COCO` class from the `pycocotools.coco` module.
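For reference, below is a minimal sketch of how binary training masks can be generated from the COCO JSON annotations with `pycocotools`; the file paths and the choice to merge all polygons of an image into a single mask are assumptions, not the repository's exact script.

```python
import numpy as np
from PIL import Image
from pycocotools.coco import COCO

# Load the COCO-format annotation file exported from Make Sense.AI (hypothetical path)
coco = COCO("annotations/train.json")

for img_id in coco.getImgIds():
    info = coco.loadImgs(img_id)[0]
    anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))

    # Merge all annotated slum polygons of this image into one binary mask
    mask = np.zeros((info["height"], info["width"]), dtype=np.uint8)
    for ann in anns:
        mask = np.maximum(mask, coco.annToMask(ann))

    # Save as a single-channel PNG training mask (0 = background, 255 = slum)
    out_name = info["file_name"].rsplit(".", 1)[0] + ".png"
    Image.fromarray(mask * 255).save("masks/" + out_name)  # hypothetical output folder
```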
The dataset consists of satellite images of the slums listed in Listofslumcluster2015, split as follows:
- Training Dataset: 2,270 RGB images at 1080x1920 resolution, resized to 128x128 due to GPU VRAM constraints (a preprocessing sketch follows this list).
- Validation Dataset: 568 RGB images.
- Test Dataset: 710 RGB images.
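A minimal sketch of the resize step described above, assuming `torchvision` transforms and a hypothetical image path; the repository's actual data pipeline (augmentation, normalization) may differ.

```python
from PIL import Image
from torchvision import transforms

# Downscale the 1080x1920 frames to 128x128 and convert to a tensor
to_input = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),          # HWC uint8 -> CHW float32 in [0, 1]
])

img = Image.open("data/train/images/example.png").convert("RGB")  # hypothetical path
x = to_input(img).unsqueeze(0)      # shape: (1, 3, 128, 128)
```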
- Custom UNet Model: A UNet-based model implemented from scratch.
- Transfer Learning with Pretrained Models: Used pretrained image-classification models as the encoder of the custom UNet, initializing it with their weights to improve performance.
- Fine-Tuned Pretrained Segmentation Models: Experimented with several pretrained semantic segmentation models and fine-tuned their final layers to adapt them to the custom dataset (a fine-tuning sketch follows this list).
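As one illustration of the fine-tuning approach, the sketch below adapts torchvision's DeepLabV3 (ResNet-50 backbone) to a single-channel slum output; the specific model choice, freezing policy, and output head are assumptions rather than the repository's exact setup.

```python
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50, DeepLabV3_ResNet50_Weights

# Load a segmentation model with pretrained weights
model = deeplabv3_resnet50(weights=DeepLabV3_ResNet50_Weights.DEFAULT)

# Freeze the pretrained backbone so only the head is fine-tuned
for p in model.backbone.parameters():
    p.requires_grad = False

# Replace the final classifier convolution with a single-channel (slum / non-slum) output
model.classifier[4] = nn.Conv2d(256, 1, kernel_size=1)

model.eval()
x = torch.randn(1, 3, 128, 128)      # a batch with one resized satellite image
with torch.no_grad():
    logits = model(x)["out"]         # (1, 1, 128, 128)
mask = torch.sigmoid(logits) > 0.5   # binary slum mask
```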
- The architecture and weights of the models can be visualized by uploading the `model.onnx` files into Netron (an export sketch follows this list).
- Evaluation metrics for the models can be found in the evaluation folder.
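A minimal sketch of how a `model.onnx` file for Netron can be produced with `torch.onnx.export`; the tiny stand-in network below is only a placeholder for the actual trained models.

```python
import torch
import torch.nn as nn

# Stand-in network; in practice the trained UNet / fine-tuned model is exported
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),
)
model.eval()

dummy = torch.randn(1, 3, 128, 128)
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["image"], output_names=["mask"],
    dynamic_axes={"image": {0: "batch"}, "mask": {0: "batch"}},
)
# The resulting model.onnx can then be opened in Netron to inspect layers and weights
```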
This web application, built with Django, enables users to upload satellite images of a city region and receive a segmented mask highlighting the slum areas. The mask is generated based on the uploaded image and is available for download in PNG format, preserving the original image's resolution.
You can access the web application via the following link:
Segmentation Web App
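Below is a rough sketch of how such an upload-and-segment view could look in Django; the stand-in model, the `segment` view name, the form field name, and the preprocessing details are assumptions and not the deployed app's actual code.

```python
# views.py -- rough sketch of the upload-and-segment flow
import io

import numpy as np
import torch
import torch.nn as nn
from django.http import HttpResponse
from PIL import Image
from torchvision import transforms

# Stand-in network; the deployed app would load the actual trained weights instead
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 1, 1))
model.eval()

to_input = transforms.Compose([transforms.Resize((128, 128)), transforms.ToTensor()])


def segment(request):
    # Uploaded satellite image from the form field "image" (hypothetical field name)
    img = Image.open(request.FILES["image"]).convert("RGB")

    with torch.no_grad():
        logits = model(to_input(img).unsqueeze(0))          # (1, 1, 128, 128)
    mask = (torch.sigmoid(logits)[0, 0] > 0.5).numpy().astype(np.uint8) * 255

    # Upsample the mask back to the uploaded image's resolution before download
    mask_img = Image.fromarray(mask).resize(img.size, Image.NEAREST)

    buf = io.BytesIO()
    mask_img.save(buf, format="PNG")
    response = HttpResponse(buf.getvalue(), content_type="image/png")
    response["Content-Disposition"] = 'attachment; filename="slum_mask.png"'
    return response
```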
Training larger models with more parameters on more powerful GPUs, such as the NVIDIA A100, to further improve accuracy.