
Project 2 from the CS-433 Machine Learning course taken at EPFL.


Road Segmentation in Satellite Images

Overview

Road segmentation, the task of identifying and isolating the parts of an image that represent roads, is a challenging problem in digital image processing and computer vision. This project tackles road segmentation in satellite images, leveraging state-of-the-art architectures such as U-Net and DeepLabV3.

Models

DeepLabV3

UNet(s)

Datasets

  1. AIcrowd Dataset: High-resolution satellite images with labeled roads.
  2. Massachusetts Roads Dataset: 1500x1500 pixel images, segmented into smaller patches. See the notebook for how we preprocess this dataset for our needs. Link to original dataset.
  3. Kaggle Dataset: 400x400 pixel images from Los Angeles, filtered for road presence. The original dataset was collected using the Google Maps API. See the notebook for how we preprocess this dataset for our needs. Link to original dataset.
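As an illustration of how a large tile can be split into smaller training patches (the folder name massachusetts_384 suggests 384x384 patches, but the exact size and stride are this sketch's assumptions, and patch_coordinates is a hypothetical helper, not code from this repo):

```python
def patch_coordinates(height, width, patch_size, stride=None):
    """Return the top-left (row, col) corners of patches tiling an image.

    The last patch in each direction is shifted back so it still fits
    inside the image, which lets a 1500-pixel side be covered by
    384-pixel patches without any padding.
    """
    stride = stride or patch_size
    rows = list(range(0, height - patch_size + 1, stride))
    cols = list(range(0, width - patch_size + 1, stride))
    # Make sure the bottom and right borders are covered as well.
    if rows[-1] != height - patch_size:
        rows.append(height - patch_size)
    if cols[-1] != width - patch_size:
        cols.append(width - patch_size)
    return [(r, c) for r in rows for c in cols]
```

For a 1500x1500 Massachusetts tile and 384x384 patches this yields a 4x4 grid of 16 overlapping patches, with the last row and column anchored at pixel 1116.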

Download Data

  • Downloading these datasets is optional: they are not needed for the best submission, but they are required if you want to reproduce the training process. For ease of use, we provide a link to the datasets already preprocessed. Download Data
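The training script fetches the preprocessed datasets on its own, so a manual download is only one option. A minimal sketch of such a fetch-if-missing step, using only the standard library (the function name and URL here are placeholders, not the repo's actual download logic):

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

def ensure_dataset(root, name, url):
    """Download and unpack a dataset archive unless it is already present."""
    target = Path(root) / name
    if target.exists():
        return target  # already downloaded, nothing to do
    archive = Path(root) / f"{name}.zip"
    urlretrieve(url, archive)            # fetch the archive
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)            # unpack into datasets/<name>
    archive.unlink()                     # remove the zip afterwards
    return target
```

Calling this once per optional dataset (kaggle, massachusetts_384) before training would reproduce the "download on first execution" behavior described under Usage.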

Installation

Note: This guide assumes you have Anaconda or Miniconda installed. If you use a different environment manager, such as venv, skip steps 2-4 and create the environment following the appropriate guidelines.

  1. Clone the repository:
    git clone ...
    cd ...
  2. Create an environment using Python 3.8.18:
    conda create --name road_segmentation python=3.8.18
  3. Activate the environment
    conda activate road_segmentation
  4. Install the required packages:
    pip install -r requirements.txt
  5. After installation the folder structure should look like this:
├── config
├── datasets
│   ├── kaggle (optional)
│   ├── massachusetts_384 (optional)
│   ├── test
│   ├── train
│   └── validation
├── examples
│   └── baseline_model
├── models
│   └── checkpoints
├── notebooks
└── predictions

Usage

  • Training: Training the best model takes roughly 12 hours on an NVIDIA GeForce RTX 3050 Ti (laptop version). The additional datasets (preprocessed Kaggle and Massachusetts) are downloaded automatically on the first execution of the script below. To reproduce the best model checkpoint, execute:
    python training_pipeline.py
  • Testing: For ease of use, we provide the best model's checkpoint, which is downloaded from Dropbox when executing the run script. To reproduce the best result, execute:
    python run.py
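For context on what a submission encodes: the AIcrowd challenge scores predictions on 16x16 patches, where a patch counts as road if enough of its pixels are road. The 25% threshold below matches the helper code commonly distributed with this challenge, but the exact value and function used in this repo may differ; this is an illustrative sketch, not the repo's run.py logic:

```python
def patch_labels(mask, patch_size=16, threshold=0.25):
    """Collapse a binary pixel mask (list of lists of 0/1 values)
    into one road/background label per 16x16 patch."""
    h, w = len(mask), len(mask[0])
    labels = {}
    for i in range(0, h, patch_size):
        for j in range(0, w, patch_size):
            patch = [mask[r][c]
                     for r in range(i, min(i + patch_size, h))
                     for c in range(j, min(j + patch_size, w))]
            # Label 1 ("road") if the road-pixel fraction exceeds the threshold.
            labels[(i, j)] = int(sum(patch) / len(patch) > threshold)
    return labels
```

A 400x400 test image therefore reduces to a 25x25 grid of patch labels, which is what the submission file ultimately lists.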
    

Additional

  • Configuration File: Contains settings for model parameters, training settings, and data paths. config.yaml
  • Postprocessing: Contains postprocessing functions. postprocessing.py
  • Utils: Utility functions for training and evaluation. train_utils.py, utils.py
  • Baseline: Modified tf_aerial_images.py, which demonstrates the use of a basic convolutional neural network in TensorFlow for generating a baseline. See tf_aerial_images.py. In order to run this script you need to install tensorflow==2.11.0. To avoid environment conflicts we recommend you to create a new environment and install this dependency separately.