GeoDataProcessor is a fast Python library that helps ML engineers work with large satellite images.
It crops large satellite image files into small tiles that can be passed to a model. The library takes as input a directory
with geo images and a directory with labels (as shown in the usage example), clips the input data, and saves the clipped tiles
to saving_folder/images and saving_folder/labels, respectively.
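To make the idea of clipping into tiles concrete, here is a minimal sketch of how a tile grid over one image could be computed. It is an illustration only, not GeoDataProcessor's actual implementation: the function name `tile_windows` is hypothetical, and keeping smaller partial tiles at the right and bottom edges is an assumption (the library may pad or drop them instead).

```python
from typing import Iterator, Tuple

# Hypothetical helper for illustration; not part of GeoDataProcessor.
def tile_windows(width: int, height: int, tile_size: int) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (col_offset, row_offset, tile_width, tile_height) for each tile."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    for row_off in range(0, height, tile_size):
        for col_off in range(0, width, tile_size):
            # Assumption: edge tiles are kept and are simply smaller.
            yield (
                col_off,
                row_off,
                min(tile_size, width - col_off),
                min(tile_size, height - row_off),
            )

# A 1200 x 1024 image with 512-sized tiles gives 3 columns x 2 rows.
windows = list(tile_windows(width=1200, height=1024, tile_size=512))
print(len(windows))  # 6
```

Each tuple describes one rectangular window of the source image; a real pipeline would read that window from the raster and clip the labels to the same extent.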
- examples/preprocess_example.py
from pathlib import Path

from geodataset.fileutils.parse_directory import create_empty_folder
from geodataset.datasets import GeoImageDataset


def preprocess(images, labels, tile_size, saving_folder):
    # Build a dataset from the image directory and the label directory.
    dataset = GeoImageDataset(image_dataset=images,
                              shp_dataset=labels)
    # Clip the dataset into tiles and save them under saving_folder.
    dataset.clip_dataset(tile_size, output_directory=saving_folder)


def main():
    images = Path("data/images")
    labels = Path("data/labels")
    saving_folder = Path("buildings_train/")
    create_empty_folder(saving_folder)

    tile_size = 512
    preprocess(images, labels, tile_size, saving_folder)


if __name__ == '__main__':
    main()
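After running the example, a quick check of the output can be useful. The sketch below counts the files in the images and labels subfolders of the saving folder, which is where the library is described as writing tiles. The helper name `count_tiles` is hypothetical, and no assumption is made about tile file extensions or about whether the two counts should match.

```python
from pathlib import Path

# Hypothetical helper for illustration; not part of GeoDataProcessor.
def count_tiles(saving_folder: Path) -> dict:
    """Count regular files in saving_folder/images and saving_folder/labels."""
    counts = {}
    for name in ("images", "labels"):
        folder = saving_folder / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            # A missing subfolder is reported as zero tiles.
            counts[name] = 0
    return counts

if __name__ == "__main__":
    print(count_tiles(Path("buildings_train/")))
```

A count of zero for either subfolder suggests that clipping did not produce output for that part of the dataset.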
Input image with labels | Cropped image with labels |
---|---|
- install locally from source:
git clone https://github.com/homomorfism/GeoDataProcessor
cd GeoDataProcessor
pip install .
- install from PyPI:
pip install GeoDataProcessor==1.0
To run the tests, run $ pytest
The building segmentation data was taken from the aeronet tutorials.