/geoai-immap

Mapping Informal Settlements in Colombia using Machine Learning


Setup | Usage | Code Organization | Data | Acknowledgments

Detection of Rapidly Growing Informal Settlements

This repository accompanies our research on informal settlement detection in northern Colombia.

The goal of this project is to provide a means for faster, cheaper, and more scalable detection of rapidly growing informal settlements using low-resolution satellite images and machine learning.

[Figure: method]

[Figure: results]

Setup

  1. Install Miniconda
  2. Create a conda environment named ee
  3. Create a conda environment named gdal_env, then install gdal inside it
  4. Create a conda environment named repo_env from environment.yml

Notable dependencies include:

  • Ubuntu 16.04
  • Anaconda3-2019.10
  • earthengine-api==0.1.223
  • gdal==3.1.0
  • scikit-learn==0.21.3

Usage

python run.py --area='riohacha' --start 2021 --end 2021

where

  • area = municipality, given as one of the supported area names
  • start = first year of satellite imagery to collect for rollout; the exact start date is set to Jan 1, {year}
  • end = last year of satellite imagery to collect for rollout; the exact end date is set to Dec 31, {year}
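
As a rough illustration of how the three options could be read and how the years expand to exact dates, here is a minimal sketch. It is not the actual contents of run.py; the function names parse_args and year_range_to_dates are hypothetical, and only the flag names and the Jan 1 / Dec 31 rule come from the description above.

```python
import argparse
from datetime import date


def parse_args(argv=None):
    """Parse the three options from the usage line (hypothetical re-creation)."""
    parser = argparse.ArgumentParser(description="Sketch of the run.py options")
    parser.add_argument("--area", required=True, help="municipality name, e.g. riohacha")
    parser.add_argument("--start", type=int, required=True, help="first year of imagery")
    parser.add_argument("--end", type=int, required=True, help="last year of imagery")
    return parser.parse_args(argv)


def year_range_to_dates(start_year, end_year):
    """Expand years to exact dates: Jan 1 of the start year, Dec 31 of the end year."""
    if end_year < start_year:
        raise ValueError("end year must not be earlier than start year")
    return date(start_year, 1, 1), date(end_year, 12, 31)


# Same values as the usage line above, passed in as a list for demonstration.
args = parse_args(["--area=riohacha", "--start", "2021", "--end", "2021"])
start_date, end_date = year_range_to_dates(args.start, args.end)
print(args.area, start_date.isoformat(), end_date.isoformat())
# prints: riohacha 2021-01-01 2021-12-31
```

Rejecting an end year that precedes the start year is a choice made for this sketch, not something the repository is known to do.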

run.py runs three scripts, each of which accepts area as a parameter:

  • download.py - acquires Sentinel-2 images from Google Earth Engine
  • preprocess.py - deflates downloaded images and calculates indices
  • predict.py - generates settlement probability map
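
The three-stage flow can be pictured roughly as follows. This is only a sketch: the functions below are hypothetical stand-ins that return a message instead of doing real work, and the way run.py actually invokes the scripts may differ.

```python
def download(area):
    # Stand-in for download.py (hypothetical body, no Earth Engine call is made).
    return f"downloaded Sentinel-2 images for {area}"


def preprocess(area):
    # Stand-in for preprocess.py (hypothetical body).
    return f"deflated images and calculated indices for {area}"


def predict(area):
    # Stand-in for predict.py (hypothetical body).
    return f"generated settlement probability map for {area}"


# Order matters: each stage depends on the output of the previous one.
STAGES = (download, preprocess, predict)


def run_pipeline(area, stages=STAGES):
    """Call every stage in order with the same area and collect their messages."""
    log = []
    for stage in stages:
        log.append(stage(area))
    return log


for message in run_pipeline("riohacha"):
    print(message)
```

Keeping the stages in a single ordered tuple makes it easy to rerun only part of the flow, e.g. run_pipeline("riohacha", stages=(preprocess, predict)) once the images are already downloaded.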

Code Organization

This repository is divided into three main parts:

  • data/: contains the informal settlement datasets; also the destination for downloaded satellite imagery
  • notebooks/: contains all Jupyter notebooks for data processing and model experimentation
  • utils/: contains utility scripts for geospatial data pre-processing and modeling

We evaluated model performance across different negative sampling parameters (10K, 30K, and 50K); each setting is reflected in one of the three instances of 03_Model_Optimization.ipynb.

Data

Due to privacy concerns, this repo does not include the labelled training data that identifies informal settlements in Colombia. If you need this dataset, please contact Thinking Machines (hello@thinkingmachin.es) or iMMAP (info@immap.org).

To use your own data, please:

  1. Save the informal settlement polygons as a GeoPackage named "{area}_mask.gpkg"
  2. Save the admin boundary for the department/municipality as a GeoPackage named "{area}.gpkg"
  3. Download satellite images using notebooks/00_Data_Download (instructions are inside the notebook)
  4. Process the images using notebooks/01_Data_Preprocessing

Resulting files and their directories should look like the following:

├── data
│   ├── pos_masks
│   │   ├── {area}_mask.gpkg
│   ├── admin_bounds
│   │   ├── {area}.gpkg
│   ├── images <derived>
│   │   ├── {area}_2015-2016.tif
│   │   ├── {area}_2017-2018.tif
│   │   ├── {area}_2019-2020.tif
│   ├── indices <derived>
│   │   ├── indices_{area}_2015-2016.tif
│   │   ├── indices_{area}_2017-2018.tif
│   │   ├── indices_{area}_2019-2020.tif

where area is the name of the area you are evaluating, written as one lowercase word, e.g. Villa del Rosario -> villadelrosario.
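
To check a new area against this layout, something like the sketch below may help. It is not part of the repository: normalize_area, expected_files, and missing_files are hypothetical helpers, and the three periods are simply the ones shown in the listing above, so adjust them to the imagery you actually download.

```python
from pathlib import Path

# Periods taken from the example listing; an assumption for any other area.
PERIODS = ("2015-2016", "2017-2018", "2019-2020")


def normalize_area(name):
    """Lowercase the name and drop whitespace, e.g. 'Villa del Rosario' -> 'villadelrosario'."""
    return "".join(name.split()).lower()


def expected_files(area_name, root="data", periods=PERIODS):
    """List the paths the layout above implies for one area."""
    area = normalize_area(area_name)
    base = Path(root)
    files = [
        base / "pos_masks" / f"{area}_mask.gpkg",
        base / "admin_bounds" / f"{area}.gpkg",
    ]
    for period in periods:
        files.append(base / "images" / f"{area}_{period}.tif")
        files.append(base / "indices" / f"indices_{area}_{period}.tif")
    return files


def missing_files(area_name, root="data", periods=PERIODS):
    """Return the expected paths that do not exist on disk yet."""
    return [p for p in expected_files(area_name, root, periods) if not p.exists()]


print(normalize_area("Villa del Rosario"))
# prints: villadelrosario
for path in expected_files("Villa del Rosario"):
    print(path.as_posix())
```

Running missing_files before and after the download and preprocessing notebooks gives a quick view of which derived files are still to be produced.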

Acknowledgments

This work is supported by iMMAP Colombia.