Buteo - Geospatial Analysis Meets AI

Primary language: Python. License: MIT.

Buteo is a toolbox designed to simplify the process of working with geospatial data in machine learning. It includes tools for reading, writing, and processing geospatial data, as well as tools for creating labels from vector data and generating patches from geospatial data. Buteo makes it easy to ingest data, create training data, and perform inference on geospatial data.

Please note that Buteo is under active development, and its API may not be entirely stable. Feel free to report any bugs or suggest improvements.

When you depend on Buteo in a project, please pin the version you are using to avoid breaking changes.

For documentation, visit: https://casperfibaek.github.io/buteo/

DOI

Fibaek, Casper. (2024). Buteo: Geospatial Data Analysis Framework for AI/EO. Zenodo. https://doi.org/10.5281/zenodo.7936577

Dependencies
numba (https://numba.pydata.org/)
gdal (https://gdal.org/)

Installation
Using pip:

pip install gdal
pip install buteo

Using conda:

conda install gdal
pip install buteo

Quickstart

Reproject to a reference, which can be a vector or a raster (other functions accept references as well)

import buteo as beo

OUTDIR = "path/to/output/dir"

vector_file_correct_projection = "path/to/vector/file.gpkg"
raster_files_wrong_projection = "path/to/raster/files/*.tif:glob"

# Reproject every matched raster to the projection of the reference vector
paths_to_reprojected_rasters = beo.reproject_raster(
    raster_files_wrong_projection,
    vector_file_correct_projection,
    out_path=OUTDIR,
)

paths_to_reprojected_rasters
>>> [path/to/output/dir/file1.tif, path/to/output/dir/file2.tif, ...]

Align, stack, and make patches from rasters

import buteo as beo

SRCDIR = "path/to/src/dir/"

paths_to_aligned_rasters_in_memory = beo.align_rasters(
    SRCDIR + "*.tif:glob",
)

stacked_numpy_arrays = beo.raster_to_array(
    paths_to_aligned_rasters_in_memory,
)

patches = beo.array_to_patches(
    stacked_numpy_arrays,
    256,
    offsets_y=1, # one overlap, offset by half the patch size (128)
    offsets_x=1, # one overlap, offset by half the patch size (128)
)

# Shape: (number of patches, height, width, channels)
patches
>>> np.ndarray([10000, 256, 256, 9])
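
To make the idea of half-patch offsets concrete, here is a minimal NumPy-only sketch. It is not Buteo's implementation: `extract_patches` and its `offset` parameter are hypothetical names used for illustration, and the sketch shifts the grid in both directions at once, which may differ from how `offsets_x` and `offsets_y` are combined in Buteo.

```python
import numpy as np


def extract_patches(array: np.ndarray, tile_size: int, offset: int = 0) -> np.ndarray:
    """Cut non-overlapping tiles from a (height, width, channels) array.

    The tile grid starts `offset` pixels in from the top-left corner.
    Pixels that do not fill a whole tile are dropped.
    """
    height, width, channels = array.shape
    rows = (height - offset) // tile_size
    cols = (width - offset) // tile_size
    cropped = array[offset:offset + rows * tile_size, offset:offset + cols * tile_size]
    tiles = cropped.reshape(rows, tile_size, cols, tile_size, channels)
    return tiles.transpose(0, 2, 1, 3, 4).reshape(-1, tile_size, tile_size, channels)


# Small stand-in for a stacked raster: 8 x 8 pixels, 2 channels
image = np.arange(8 * 8 * 2, dtype=np.float32).reshape(8, 8, 2)

base = extract_patches(image, tile_size=4)               # grid anchored at (0, 0)
shifted = extract_patches(image, tile_size=4, offset=2)  # grid shifted by half a tile
patches = np.concatenate([base, shifted], axis=0)
```

The shifted grid yields fewer patches than the base grid because tiles that would run past the edge of the array are dropped.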

Predict a raster using a model

import buteo as beo

RASTER_PATH = "path/to/raster/raster.tif"
RASTER_OUT_PATH = "path/to/raster/raster_pred.tif"

array = beo.raster_to_array(RASTER_PATH)

# `model` is your own trained model (PyTorch, Keras, etc.), defined elsewhere
callback = model.predict

# Predict the raster tile by tile, using overlaps and borders.
# Overlapping predictions can be merged with different methods (median, mad, mean, mode, ...).
predicted = beo.predict_array(
    array,
    callback,
    tile_size=256,
)

# Write the predicted raster to disk
beo.array_to_raster(
    predicted,
    reference=RASTER_PATH,
    out_path=RASTER_OUT_PATH,
)
# Path to the predicted raster
>>> "path/to/raster/raster_pred.tif"
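
The overlap-and-merge idea can be sketched in plain NumPy as follows. This is an illustration under stated assumptions, not Buteo's implementation: `predict_tiled` and its `offsets` parameter are hypothetical, the callback is a stand-in that maps one (tile, tile, channels) block to a (tile, tile) prediction, and the sketch assumes the array dimensions are multiples of `tile_size` and that offset 0 is included so every pixel is predicted at least once.

```python
from typing import Callable, Sequence

import numpy as np


def predict_tiled(
    array: np.ndarray,
    callback: Callable[[np.ndarray], np.ndarray],
    tile_size: int,
    offsets: Sequence[int] = (0,),
) -> np.ndarray:
    """Predict a (height, width, channels) array tile by tile.

    One prediction layer is built per grid offset; pixels a grid does not
    cover stay NaN. The layers are merged with a NaN-aware median.
    """
    height, width, _ = array.shape
    layers = []
    for offset in offsets:
        layer = np.full((height, width), np.nan, dtype=np.float32)
        for y in range(offset, height - tile_size + 1, tile_size):
            for x in range(offset, width - tile_size + 1, tile_size):
                tile = array[y:y + tile_size, x:x + tile_size]
                layer[y:y + tile_size, x:x + tile_size] = callback(tile)
        layers.append(layer)
    return np.nanmedian(np.stack(layers), axis=0)


# Stand-in "model": the per-pixel mean over channels
image = np.arange(8 * 8 * 2, dtype=np.float32).reshape(8, 8, 2)
predicted = predict_tiled(image, lambda tile: tile.mean(axis=-1), tile_size=4, offsets=(0, 2))
```

With a real model, predictions near tile borders tend to be less reliable, which is why merging several shifted grids can help; swapping `np.nanmedian` for `np.nanmean` gives a mean merge instead.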

Example Colabs
Create labels from OpenStreetMap data
Scheduled cleaning of geospatial data
Clip and remove noise from rasters
Sharpen nightlights data
Filters and morphological operations

The toolbox is being developed by ESA-Philab, NIRAS, and Aalborg University.

Optional dependencies
orfeo-toolbox
esa-snap

Design principles

Functions only return one type
Use type hinting
Keep the internal representation to a minimum
Reduce the amount of dependencies
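
As a small illustration of the first two principles, the sketch below shows a type-hinted helper that accepts either one path or several but always returns the same type. `to_path_list` is a hypothetical function written for this example and is not part of Buteo's API.

```python
from typing import List, Union


def to_path_list(paths: Union[str, List[str]]) -> List[str]:
    """Return a list of paths, whether given a single path or a list of them."""
    if isinstance(paths, str):
        return [paths]
    return list(paths)


single = to_path_list("a.tif")
several = to_path_list(["a.tif", "b.tif"])
```

Because the return type never depends on the input, callers can loop over the result without checking what they got back.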