Rapid map creation with machine learning and earth observation data.

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

OpenMapFlow 🌍

Examples: Cropland, Buildings, Maize

Tutorial

Colab notebook tutorial demonstrating data exploration, model training, and inference over a small region.

Prerequisites:

  • GitHub account
  • GitHub access token (obtained here)
  • Forked OpenMapFlow repository
  • Basic Python knowledge

Generating a project

Inside a GitHub repository, run:

pip install openmapflow
openmapflow generate

This generates a project for: Adding data ➞ Training a model ➞ Creating a map

Adding data

Move raw labels into the project:

export RAW_LABEL_DIR=$(openmapflow datapath RAW_LABELS)
mkdir $RAW_LABEL_DIR/<my dataset name>
cp -r <path to my raw data files> $RAW_LABEL_DIR/<my dataset name>
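The same staging step can be scripted in Python, for example from a notebook. This is a minimal sketch using only the standard library. `stage_raw_labels` is a hypothetical helper, not part of openmapflow, and the raw label directory is supplied by the caller (for instance, the path printed by `openmapflow datapath RAW_LABELS`).

```python
import shutil
from pathlib import Path


def stage_raw_labels(raw_label_dir, dataset_name, source):
    """Copy a raw data file or directory into <raw_label_dir>/<dataset_name>.

    Hypothetical helper mirroring the mkdir and cp -r commands above.
    """
    target = Path(raw_label_dir) / dataset_name
    target.mkdir(parents=True, exist_ok=True)
    source = Path(source)
    if source.is_dir():
        # Like `cp -r <dir> <target>`: the directory lands inside the target.
        shutil.copytree(source, target / source.name, dirs_exist_ok=True)
    else:
        # Single file: copy it and keep its metadata.
        shutil.copy2(source, target / source.name)
    return target
```

The function returns the dataset directory so the caller can confirm where the files ended up.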

Add a reference to the data using a LabeledDataset object in datasets.py:

datasets = [
    LabeledDataset(
        dataset="example_dataset",
        country="Togo",
        raw_labels=(
            RawLabels(
                filename="Togo_2019.csv",
                longitude_col="longitude",
                latitude_col="latitude",
                class_prob=lambda df: df["crop"],
                start_year=2019,
                x_y_from_centroid=False,
            ),
        ),
    ),
    ...
]
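The column names in the example above (`longitude`, `latitude`, and the `crop` column read by `class_prob`) refer to columns in the CSV file, so a quick header check before feature creation can catch a mismatch early. The sketch below uses only the standard library. `missing_columns` and `REQUIRED_COLUMNS` are hypothetical names for illustration and are not part of openmapflow.

```python
import csv

# Columns referenced by the example RawLabels entry above (assumed for this sketch).
REQUIRED_COLUMNS = ("longitude", "latitude", "crop")


def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    with open(csv_path, newline="") as f:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(f), [])
    return [name for name in required if name not in header]
```

An empty list means every referenced column is present; otherwise the returned names show which arguments in datasets.py need adjusting.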

Run feature creation:

earthengine authenticate    # For getting new earth observation data
gcloud auth login           # For getting cached earth observation data

openmapflow create-features # Initiates or checks progress of feature creation
# May take a long time depending on the number of labels in the dataset
# TODO make the end more obvious

openmapflow datasets        # Shows the status of datasets

dvc commit && dvc push      # Push new data to data version control

git add .
git commit -m 'Created new features'
git push

Training a model

# Pull in latest data
dvc pull    
tar -xzf $(openmapflow datapath COMPRESSED_FEATURES) -C data

export MODEL_NAME=<model_name>              # Set model name
python train.py --model_name $MODEL_NAME    # Train a model
python evaluate.py --model_name $MODEL_NAME # Record test metrics

dvc commit && dvc push  # Push new models to data version control

git checkout -b "$MODEL_NAME"
git add .
git commit -m "$MODEL_NAME"
git push --set-upstream origin "$MODEL_NAME"
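Because the commands above reuse `MODEL_NAME` as the Git branch name, the value has to be acceptable to Git as well as to train.py. The sketch below applies a deliberately conservative check that is stricter than Git's actual ref-name rules. `is_safe_model_name` is a hypothetical helper, not part of openmapflow.

```python
import re

# Conservative subset of Git's ref-name rules: starts with a letter or digit,
# then letters, digits, '.', '_' or '-' only (no spaces, slashes, or symbols).
_SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def is_safe_model_name(name):
    """Return True if name passes this sketch's check for use as a branch name."""
    if not _SAFE_NAME.fullmatch(name):
        return False
    # Git also rejects names containing '..' or ending in '.lock' or '.'.
    return ".." not in name and not name.endswith((".lock", "."))
```

Running the check before `export MODEL_NAME=...` avoids training a model and then failing at `git checkout -b`.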

Creating a map

Map creation is only available through Colab. The cloud architecture must be deployed using the deploy.yaml GitHub Action.