Code for the paper "MSMatch: Semi-Supervised Multispectral Scene Classification with Few Labels"

MSMatch

Semi-Supervised Multispectral Scene Classification with Few Labels

Table of Contents
  1. About The Project
  2. Getting Started
  3. Content of Repository
  4. Usage
  5. Roadmap
  6. Contributing
  7. License
  8. FAQ
  9. Contact

About The Project

This is the code for the paper "MSMatch: Semi-Supervised Multispectral Scene Classification with Few Labels" by Pablo Gómez and Gabriele Meoni, which aims to apply state-of-the-art semi-supervised learning techniques to land-use and land-cover classification problems. Currently, the repository includes an implementation of FixMatch for training EfficientNet convolutional neural networks. The code builds on and extends the PyTorch-based FixMatch-pytorch implementation. Compared to the original repository, this repository adds code to work with both the RGB and the multispectral (MS) versions of the EuroSAT dataset, as well as the UC Merced Land Use (UCM) dataset.

Getting Started

This is a brief example of setting up MSMatch.

Prerequisites

We recommend using conda to set up your environment. This also automatically sets up CUDA and the cudatoolkit for you, enabling the use of GPUs for training, which is recommended.

  • conda, which will take care of all requirements for you. For a detailed list of required packages, please refer to the conda environment file.

Installation

  1. Get miniconda or similar
  2. Clone the repo
    git clone https://github.com/gomezzz/MSMatch.git
  3. Set up the environment. This creates a conda environment called torchmatch
    conda env create -f environment.yml
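
Before starting a long training run, it can help to confirm that the torchmatch environment created above is the active one. The snippet below is a minimal sketch, not part of the repository: it relies on the CONDA_DEFAULT_ENV variable that conda sets when an environment is activated, and the function names are hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is set."""
    environ = os.environ if environ is None else environ
    # conda exports CONDA_DEFAULT_ENV when an environment is activated
    return environ.get("CONDA_DEFAULT_ENV") or None


def is_torchmatch_active(environ=None, expected="torchmatch"):
    """Check whether the active conda environment has the expected name."""
    return active_conda_env(environ) == expected
```

Calling is_torchmatch_active() without arguments inspects the current process environment; passing a dictionary makes the check easy to try out in isolation.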

Set up datasets

To launch training on EuroSAT (RGB or MS), you first need to download the corresponding dataset. Then adjust the root_dir variable in the corresponding file, datasets/eurosat_dataset.py or datasets/eurosat_rgb_dataset.py, to match the dataset path.
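
A wrong root_dir is an easy mistake to make, so a quick sanity check of the path can save a failed run. The helper below is a sketch under one assumption that is not stated in this README: that the downloaded dataset is laid out with one sub-directory per class. The function name and the list of file extensions are hypothetical.

```python
from pathlib import Path


def count_images_per_class(root_dir, extensions=(".jpg", ".jpeg", ".png", ".tif", ".tiff")):
    """Count image files in each class sub-directory of root_dir.

    Assumes (hypothetically) one sub-directory per class directly under root_dir.
    """
    root = Path(root_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset root not found: {root}")
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Only count regular files whose suffix looks like an image
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in extensions
        )
    return counts
```

An empty result or unexpectedly small counts usually mean that root_dir points one level too high or too low in the extracted archive.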

Content of Repository

The repository is structured as follows:

  • datasets: contains the semi-supervised learning datasets usable for training, together with the augmentation code. To add a new dataset, add a new class similar to the one in, e.g., eurosat_rgb.py.
  • external/visualizations: contains tools to create visualizations of trained models. We utilized the code from the src directory of pytorch-cnn-visualizations repository and slightly adapted it.
  • models: contains the neural networks models used for training.
  • notebooks: contains Jupyter notebooks used to create the paper figures, collect training results, show augmentation effects on images, and provide additional functionality. To use the notebooks, you additionally need to install Jupyter.
  • runscripts: includes bash scripts used to train the networks.
  • utils.py: some utility functions.
  • train_utils.py: utility functions for training.
  • train.py: main train script.
  • eval.py: main script for evaluating a trained network.
  • environment.yml: conda environment file describing dependencies.
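
As a rough illustration of what adding a new dataset class to datasets involves, the sketch below shows the generic map-style dataset protocol (__len__ and __getitem__) that PyTorch data loaders consume. It is not the repository's code: the class name, constructor arguments, and transform hook are hypothetical, and the interface actually expected by train.py is defined by the existing dataset classes, which should be used as the template.

```python
class SceneDataset:
    """Hypothetical map-style dataset holding samples and integer class labels."""

    def __init__(self, samples, labels, transform=None):
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self.samples = list(samples)
        self.labels = list(labels)
        # Optional callable applied to each sample on access (e.g. an augmentation)
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        sample = self.samples[index]
        if self.transform is not None:
            sample = self.transform(sample)
        return sample, self.labels[index]
```

A real dataset class would load image arrays instead of holding arbitrary Python objects, but the indexing contract stays the same.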

Usage

Train a model

To train a model on EuroSAT RGB by using EfficientNet B0 from scratch, you can use:

python train.py --dataset eurosat_rgb --net efficientnet-b0

--net specifies the EfficientNet model, whilst --dataset specifies the dataset. Use eurosat_rgb for EuroSAT RGB, eurosat_ms for EuroSAT MS, and ucm for the UCM dataset.

Instead of starting the training from scratch, it is possible to exploit a model pretrained on ImageNet. To do so, you can use:

python train.py --dataset eurosat_rgb --net efficientnet-b0 --pretrained

Information on additional flags can be obtained by typing:

python train.py --help

For additional information on training, including the use of single/multiple GPUs, please refer to FixMatch-pytorch.
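
When several training runs are needed (for example one per dataset), the commands above can be assembled programmatically. The helper below is a minimal sketch and not part of the repository; it uses only the flags documented in this section (--dataset, --net, --pretrained), and the function name is hypothetical.

```python
import sys

# Dataset identifiers documented in this README
DATASETS = ("eurosat_rgb", "eurosat_ms", "ucm")


def build_train_command(dataset, net="efficientnet-b0", pretrained=False, python=sys.executable):
    """Build the argument list for one train.py invocation."""
    if dataset not in DATASETS:
        raise ValueError(f"Unknown dataset {dataset!r}; expected one of {DATASETS}")
    command = [python, "train.py", "--dataset", dataset, "--net", net]
    if pretrained:
        # Start from ImageNet weights instead of training from scratch
        command.append("--pretrained")
    return command
```

The returned list can be passed to subprocess.run(command, check=True) from the repository root, which avoids shell quoting issues.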

Evaluate a model

To evaluate a trained model on a target dataset, you can use:

python eval.py --load_path [LOAD_PATH] --dataset [DATASET] --net [NET]

where LOAD_PATH is the path of the trained model (.pth file), DATASET is the target dataset, and NET is the network model used during training.
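
To evaluate several trained models in one go, the evaluation command can be generated for every checkpoint in a directory. This is a sketch under assumptions, not repository code: it assumes checkpoints are stored as .pth files somewhere below a directory of your choice, and the function name is hypothetical. Only the flags shown above are used.

```python
from pathlib import Path


def build_eval_commands(checkpoint_dir, dataset, net, python="python"):
    """Build one eval.py argument list per .pth file found below checkpoint_dir."""
    checkpoints = sorted(Path(checkpoint_dir).rglob("*.pth"))
    return [
        [python, "eval.py", "--load_path", str(ckpt), "--dataset", dataset, "--net", net]
        for ckpt in checkpoints
    ]
```

Each returned list can be run with subprocess.run(command, check=True) from the repository root; the dataset and network must match those used when the checkpoint was trained.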

Roadmap

See the open issues for a list of proposed features (and known issues).

Contributing

The project is open to community contributions. Feel free to open an issue or write us an email if you would like to discuss a problem or idea first.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the GPL-3.0 License. See LICENSE for more information.

Contact

Created by ESA's Advanced Concepts Team

  • Pablo Gómez - pablo.gomez at esa.int
  • Gabriele Meoni - gabriele.meoni at esa.int

Project Link: https://www.esa.int/gsp/ACT/projects/semisupervised/