🚀 Presented at the IEEE International Geoscience and Remote Sensing Symposium (IGARSS) 2023 Conference in Pasadena, California, USA 🚀
Road network predictions on a small region of Las Vegas, using geo-referenced (1300x1300x3) RGB images from the ArcGIS World Imagery map service (ground resolution of ~0.3 m / pixel) via QGIS. The image below is a (9x6) grid of such tiles.
Areal inference speed: ~650 km² / hour / GPU
Paper link: Graph Reasoned Multi-Scale Road Segmentation in Remote Sensing Imagery
Download:
DeepGlobe (Kaggle account required),
MassachusettsRoads (Kaggle account required),
Spacenet (AWS account required).
Once you have downloaded the DeepGlobe or Massachusetts Roads dataset, extract its contents into a "DeepGlobe" or "MassachusettsRoads" folder, respectively, inside the Datasets folder.
For Spacenet, the procedure is a bit more involved...
We need the images in 8-bit format.
After downloading AOIs 2-5 (Vegas, Paris, Shanghai, Khartoum), go to the CRESI repository and select "SpaceNet 5 Baseline Part 1 - Data Prep".
Use create_8bit_masks.py as described in the link. Then use speed_masks.py to create continuous masks. Binarize these masks to {0,1} and place them in /Datasets/Spacenet/trainval_labels/train_masks/
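If you prefer to script the binarization, a minimal sketch is shown below; it assumes the speed masks are single-band GeoTIFFs and uses placeholder paths:

```python
# Hypothetical binarization helper (paths and the single-band assumption are
# ours, not the repo's): turn continuous speed masks into {0,1} road masks.
import glob, os
import numpy as np
from osgeo import gdal

src_dir = "speed_masks"  # assumed output folder of speed_masks.py
dst_dir = "Datasets/Spacenet/trainval_labels/train_masks"
os.makedirs(dst_dir, exist_ok=True)

for path in glob.glob(os.path.join(src_dir, "*.tif")):
    ds = gdal.Open(path)
    arr = ds.ReadAsArray()               # assumes a single-band mask
    binary = (arr > 0).astype(np.uint8)  # any nonzero speed value counts as road
    drv = gdal.GetDriverByName("GTiff")
    out = drv.Create(os.path.join(dst_dir, os.path.basename(path)),
                     ds.RasterXSize, ds.RasterYSize, 1, gdal.GDT_Byte)
    out.SetGeoTransform(ds.GetGeoTransform())  # preserve georeferencing
    out.SetProjection(ds.GetProjection())
    out.GetRasterBand(1).WriteArray(binary)
    out = None  # close and flush to disk
```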
Next, locate the "PS-MS" folder in each corresponding AOI_#_<city> directory. Move all image files in each of these "PS-MS" folders to /Datasets/Spacenet/trainval/.
Likewise, locate the "MUL-PanSharpen" folder in each corresponding AOI_#_<city>_Roads_Test_Public directory and move all of those image files to /Datasets/Spacenet/test/
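Because this means moving many files across several AOI folders, a small helper script may save time. The sketch below assumes the AOI folders sit next to the Datasets folder; adjust the paths to your layout:

```python
# Hypothetical file-gathering helper; folder names follow the patterns above.
import glob, os, shutil

splits = {
    "trainval": ("AOI_{n}_{city}", "PS-MS"),
    "test": ("AOI_{n}_{city}_Roads_Test_Public", "MUL-PanSharpen"),
}
cities = [(2, "Vegas"), (3, "Paris"), (4, "Shanghai"), (5, "Khartoum")]

for split, (aoi_pattern, subfolder) in splits.items():
    dst = os.path.join("Datasets", "Spacenet", split)
    os.makedirs(dst, exist_ok=True)
    for n, city in cities:
        src = os.path.join(aoi_pattern.format(n=n, city=city), subfolder)
        for f in glob.glob(os.path.join(src, "*.tif")):
            shutil.move(f, dst)
```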
Create an environment with anaconda: conda create --name <your_env_name> python=3.9
Next, activate your environment: conda activate <your_env_name>
Install dependencies from pip: pip install -r requirements.txt
Install dependencies from conda:
conda install gdal
conda install pytorch=1.13.0 torchvision=0.14 pytorch-cuda=11.6 -c pytorch -c nvidia
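After installation, a quick sanity check can catch environment problems before training:

```python
# Verify that the key dependencies installed above import and see the GPU.
import torch
import torchvision
from osgeo import gdal

print("torch:", torch.__version__)              # expect 1.13.0
print("torchvision:", torchvision.__version__)  # expect 0.14.x
print("CUDA available:", torch.cuda.is_available())
print("GDAL:", gdal.__version__)
```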
Now we will create the cropped images for each train/val/test split (where applicable) of a chosen dataset.
In the console enter: python setup.py -d Datasets -cs 512 -j <name of dataset>
(-cs is the crop size)
The dataset name should be identical to the ones in the Dataset Instructions section. Expect this step to take roughly 15 minutes.
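For intuition, -cs 512 implies tiling each source image into 512x512 crops, roughly like the sketch below (crop_tiles is a hypothetical helper; the actual setup.py may additionally handle overlap, padding, and label crops):

```python
# Minimal non-overlapping tiling sketch; not the repo's implementation.
import numpy as np

def crop_tiles(image: np.ndarray, crop_size: int = 512):
    """Yield non-overlapping crop_size x crop_size tiles from an HxWxC array."""
    h, w = image.shape[:2]
    for y in range(0, h - crop_size + 1, crop_size):
        for x in range(0, w - crop_size + 1, crop_size):
            yield image[y:y + crop_size, x:x + crop_size]

tiles = list(crop_tiles(np.zeros((1300, 1300, 3), dtype=np.uint8)))
print(len(tiles))  # 4 tiles (a 2x2 grid) from a 1300x1300 image
```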
Cropped Image Disk Space:
DeepGlobe ~= 24.3GB
MassachusettsRoads ~= 9.71GB
Spacenet ~= 25GB
All training was performed on a single NVIDIA GeForce RTX 2080 Ti (11GB VRAM).
See the cfg.json file to ensure that the training settings are appropriate for your rig.
To train the model from scratch, run:
python train.py -m ConvNeXt_UPerNet_DGCN_MTL -d <dataset_name> -e <experiment_name>
Example:
python train.py -m ConvNeXt_UPerNet_DGCN_MTL -d MassachusettsRoads -e MassachusettsRoads
To resume the training of a model:
python train.py -m ConvNeXt_UPerNet_DGCN_MTL -d <dataset_name> -e <experiment_name> -r ./Experiments/<experiment_name>/model_best.pth.tar
To fine-tune a pre-trained model on a new dataset:
python train.py -m ConvNeXt_UPerNet_DGCN_MTL -d <dataset_name> -e <experiment_name> -rd ./Experiments/<experiment_name>/model_best.pth.tar
For example, one can use pre-trained MassachusettsRoads model weights as a starting point for DeepGlobe or Spacenet training to speed up convergence.
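The practical difference between -r and -rd is whether training state is restored along with the weights. A hedged sketch of that distinction (the checkpoint key names here are assumptions, not necessarily the repo's format):

```python
import torch

# Load a checkpoint produced by train.py; key names below are assumed.
ckpt = torch.load("./Experiments/my_exp/model_best.pth.tar", map_location="cpu")

# Resume (-r): restore weights AND training state, continuing the same run.
# model.load_state_dict(ckpt["state_dict"])
# optimizer.load_state_dict(ckpt["optimizer"])
# start_epoch = ckpt["epoch"]

# Fine-tune (-rd): load the weights only and start a fresh run on the new
# dataset; strict=False tolerates layers whose shapes differ across datasets.
# model.load_state_dict(ckpt["state_dict"], strict=False)
```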
Back up your log files (*.txt) in ./Experiments/<experiment_name>/
Once training ends (default: 120 epochs), run the following to evaluate the Precision, Recall, F1, IoU (relaxed), and IoU (accurate) metrics:
python eval.py -m ConvNeXt_UPerNet_DGCN_MTL -d <dataset_name> -e <experiment_name> -r ./Experiments/<experiment_name>/model_best.pth.tar
The evaluation script uses elements from the utils folder of [3].
This will create a ./Experiments/<experiment_name>/images_eval folder, with each file showing (clockwise) the original image, its label, a feature heat-map, and the stitched prediction. Note that for MassachusettsRoads you should use a validation batch_size of 3 (cfg.json) when creating the images.
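For reference, the relaxed variants of these metrics (in the spirit of [3]) count a pixel as correct if it lies within a small buffer of the other mask. A self-contained sketch, where buffer_px is an assumed tolerance rather than the repo's setting:

```python
# Hedged sketch of relaxed precision/recall/F1 for binary road masks.
import numpy as np
from scipy.ndimage import binary_dilation

def relaxed_prf(pred: np.ndarray, gt: np.ndarray, buffer_px: int = 3):
    """pred, gt: binary {0,1} arrays of equal shape."""
    structure = np.ones((2 * buffer_px + 1, 2 * buffer_px + 1), dtype=bool)
    gt_buf = binary_dilation(gt.astype(bool), structure=structure)
    pred_buf = binary_dilation(pred.astype(bool), structure=structure)
    tp_p = np.logical_and(pred.astype(bool), gt_buf).sum()  # pred pixels near GT
    tp_r = np.logical_and(gt.astype(bool), pred_buf).sum()  # GT pixels near pred
    precision = tp_p / max(pred.sum(), 1)
    recall = tp_r / max(gt.sum(), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return precision, recall, f1
```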
To evaluate the APLS metric, refer to this link.
You may also refer to this link for better viewing.
6. REFERENCES
[1] N. Weir et al., “SpaceNet MVOI: A Multi-View Overhead Imagery Dataset”, 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 992-1001, doi: 10.1109/ICCV.2019.00108.
[2] I. Demir et al., “DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images”, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, pp. 172-17209, doi: 10.1109/CVPRW.2018.00031.
[3] A. Batra, S. Singh, G. Pang, S. Basu, C. V. Jawahar and M. Paluri, “Improved Road Connectivity by Joint Learning of Orientation and Segmentation”, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 10377-10385, doi: 10.1109/CVPR.2019.01063.
[4] L. Zhang et al., “Dual Graph Convolutional Network for Semantic Segmentation”, 2019 British Machine Vision Conference (BMVC), 2019, https://doi.org/10.48550/arXiv.1909.06121.
[5] Z. Liu, H. Mao, C.Y. Wu, C. Feichtenhofer, T. Darrell, S. Xie, “A ConvNet for the 2020s”, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 11976-11986.
[6] T. Xiao, Y. Liu, B. Zhou, Y. Jiang, J. Sun, “Unified perceptual parsing for scene understanding”. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11209, pp. 432–448. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01228-1_26
[7] A. Van Etten, D. Lindenbaum, T. M. Bacastow, “SpaceNet: A Remote Sensing Dataset and Challenge Series”, 2018, https://doi.org/10.48550/arXiv.1807.01232.
[8] V. Mnih, “Machine Learning for Aerial Image Labeling”, PhD Dissertation, University of Toronto, 2013.
[9] W.G.C. Bandara, J.M.J. Valanarasu, V.M. Patel, “SPIN Road Mapper: Extracting Roads from Aerial Images via Spatial and Interaction Space Graph Reasoning for Autonomous Driving”, arXiv preprint arXiv:2109.07701, 2021.