How to use this repository: if you know exactly what you are looking for (e.g. you have the paper name) you can Control+F
to search for it in this page (or search in the raw markdown).
- awesome-satellite-imagery-datasets
- Awesome_Satellite_Benchmark_Datasets
- awesome-remote-sensing-change-detection -> dedicated to change detection
- Callisto-Dataset-Collection -> datasets that use Copernicus/sentinel data
- geospatial-data-catalogs -> A list of open geospatial datasets available on AWS, Earth Engine, Planetary Computer, and STAC Index
- BED4RS
- Satellite-Image-Time-Series-Datasets
- Radiant MLHub -> both datasets and models
- Registry of Open Data on AWS
- Microsoft Planetary Computer data catalog
- Google Earth Engine Data Catalog
As part of the EU Copernicus program, multiple Sentinel satellites are capturing imagery -> see wikipedia
- awesome-sentinel -> a curated list of awesome tools, tutorials and APIs related to data from the Copernicus Sentinel Satellites.
- Sentinel-2 Cloud-Optimized GeoTIFFs and Sentinel-2 L2A 120m Mosaic
- Open access data on GCP
- Paid access to Sentinel & Landsat data via sentinel-hub and python-api
- Example loading sentinel data in a notebook
- so2sat on Tensorflow datasets - So2Sat LCZ42 is a dataset consisting of co-registered synthetic aperture radar and multispectral optical image patches acquired by the Sentinel-1 and Sentinel-2 remote sensing satellites, and the corresponding local climate zones (LCZ) label. The dataset is distributed over 42 cities across different continents and cultural regions of the world.
- BigEarthNet - The BigEarthNet is a new large-scale Sentinel-2 benchmark archive, consisting of 590,326 Sentinel-2 image patches. The image patch size on the ground is 1.2 x 1.2 km with variable image size depending on the channel resolution. This is a multi-label dataset with 43 imbalanced labels. Also available in torchgeo
- Jupyter Notebooks for working with Sentinel-5P Level 2 data stored on S3. The data can be browsed here
- Sentinel NetCDF data
- Analyzing Sentinel-2 satellite data in Python with Keras
- Xarray backend to Copernicus Sentinel-1 satellite data products
- SEN2VENµS -> a dataset for the training of Sentinel-2 super-resolution algorithms
- SEN12MS -> A Curated Dataset of Georeferenced Multi-spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion. Check out the SEN12MS toolbox and many referenced uses on paperswithcode.com
- Sen4AgriNet -> A Sentinel-2 multi-year, multi-country benchmark dataset for crop classification and segmentation with deep learning, with website and models
- earthspy -> Monitor and study any place on Earth and in Near Real-Time (NRT) using the Sentinel Hub services developed by the EO research team at Sinergise
- Space2Ground -> dataset with Space (Sentinel-1/2) and Ground (street-level images) components, annotated with crop-type labels for agriculture monitoring.
- sentinel2tools -> downloading & basic processing of Sentinel-2 imagery. Read Sentinel2tools: simple lib for downloading Sentinel-2 satellite images
- open-sentinel-map -> The OpenSentinelMap dataset contains Sentinel-2 imagery and per-pixel semantic label masks derived from OpenStreetMap
- MSCDUnet -> change detection datasets containing VHR, multispectral (Sentinel-2) and SAR (Sentinel-1)
- OMBRIA -> Sentinel-1 & 2 dataset for addressing the flood mapping problem
- Canadian-cropland-dataset -> a novel patch-based dataset compiled using optical satellite images of Canadian agricultural croplands retrieved from Sentinel-2
- Sentinel-2 Cloud Cover Segmentation Dataset on Radiant mlhub
- The Azavea Cloud Dataset which is used to train this cloud-model
- fMoW-Sentinel -> The Functional Map of the World - Sentinel-2 corresponding images (fMoW-Sentinel) dataset consists of image time series collected by the Sentinel-2 satellite, corresponding to locations from the Functional Map of the World (fMoW) dataset across several different times. Used in SatMAE
- Earth Surface Water Dataset -> a dataset for deep learning of surface water features on Sentinel-2 satellite images. See this ref using it in torchgeo
- Ship-S2-AIS dataset -> 13k tiles extracted from 29 free Sentinel-2 products. 2k images showing ships in Denmark sovereign waters: one may detect cargos, fishing, or container ships
- Amazon Rainforest dataset for semantic segmentation -> Sentinel 2 images. Used in An attention-based U-Net for detecting deforestation within satellite sensor imagery
- Amazon and Atlantic Forest image datasets for semantic segmentation -> Sentinel 2 images. Used in An attention-based U-Net for detecting deforestation within satellite sensor imagery
- Mining and clandestine airstrips datasets
- Satellite Burned Area Dataset -> segmentation dataset containing several satellite acquisitions related to past forest wildfires. It contains 73 acquisitions from Sentinel-2 and Sentinel-1 (Copernicus).
- mmflood -> Flood delineation from Sentinel-1 SAR imagery, with paper
- MATTER -> a Sentinel 2 dataset for Self-Supervised Training
- Industrial Smoke Plumes
- MARIDA: Marine Debris Archive
- S2GLC -> High resolution Land Cover Map of Europe
- Generating Imperviousness Maps from Multispectral Sentinel-2 Satellite Imagery
- Sentinel-2 Water Edges Dataset (SWED)
- Sentinel-1 for Science Amazonas -> forest loss time series dataset
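
The BigEarthNet entry above notes that a 1.2 x 1.2 km patch has a variable image size depending on the channel resolution. A minimal sketch of that arithmetic, assuming the usual Sentinel-2 band resolutions of 10, 20 and 60 m (the function name is illustrative, not part of any library):

```python
PATCH_SIDE_M = 1200  # 1.2 km patch side on the ground

def patch_side_pixels(resolution_m: int) -> int:
    """Pixels along one side of a patch at the given ground resolution."""
    return PATCH_SIDE_M // resolution_m

# Assumed Sentinel-2 band resolutions in metres
for res in (10, 20, 60):
    side = patch_side_pixels(res)
    print(f"{res} m bands: {side} x {side} px")
```

When stacking bands into a single array for a model, the coarser bands therefore need resampling to a common grid first.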
Long running US program -> see Wikipedia
- 8 bands, 15 to 60 meters, 185km swath, the temporal resolution is 16 days
- Landsat 4, 5, 7, and 8 imagery on Google, see the GCP bucket here, with Landsat 8 imagery in COG format analysed in this notebook
- Landsat 8 imagery on AWS, with many tutorials and tools listed
- https://github.com/kylebarron/landsat-mosaic-latest -> Auto-updating cloudless Landsat 8 mosaic from AWS SNS notifications
- Visualise landsat imagery using Datashader
- Landsat-mosaic-tiler -> This repo hosts all the code for landsatlive.live website and APIs.
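
The figures quoted in the first bullet above (185 km swath, 16-day temporal resolution, 15 to 60 m pixels) can be turned into scene width in pixels and revisits per year. This is a rough sketch of the arithmetic only; it ignores leap years, cloud cover and overlap between adjacent paths, and the function names are illustrative:

```python
SWATH_WIDTH_M = 185_000  # 185 km swath
REVISIT_DAYS = 16        # temporal resolution in days

def pixels_across_swath(resolution_m: float) -> float:
    """Approximate number of pixels spanning the swath at a given resolution."""
    return SWATH_WIDTH_M / resolution_m

def revisits_per_year(days_per_year: int = 365) -> float:
    """Approximate number of acquisitions of the same location per year."""
    return days_per_year / REVISIT_DAYS

for res in (15, 60):
    print(f"{res} m pixels: ~{pixels_across_swath(res):.0f} pixels across the swath")
print(f"~{revisits_per_year():.1f} revisits per year")
```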
Satellites owned by Maxar (formerly DigitalGlobe) include GeoEye-1, WorldView-2, 3 & 4
- Maxar Open Data Program provides pre and post-event high-resolution satellite imagery in support of emergency planning, response, damage assessment, and recovery
- WorldView-2 European Cities -> dataset covering the most populated areas in Europe at 40 cm resolution
- Planet’s high-resolution, analysis-ready mosaics of the world’s tropics, supported through Norway’s International Climate & Forests Initiative. BBC coverage
- Planet have made imagery available via kaggle competitions
Land use classification dataset with 21 classes and 100 RGB TIFF images for each class. Each image measures 256x256 pixels with a pixel resolution of 1 foot
- http://weegee.vision.ucmerced.edu/datasets/landuse.html
- Available as a Tensorflow dataset -> https://www.tensorflow.org/datasets/catalog/uc_merced
- Also available as a multi-label dataset
- Read Vision Transformers for Remote Sensing Image Classification where a Vision Transformer classifier achieves 98.49% classification accuracy on Merced
Land use classification dataset of Sentinel-2 satellite images covering 13 spectral bands and consisting of 10 classes with 27000 labeled and geo-referenced samples. Available in RGB and 13 band versions
- EuroSAT: Land Use and Land Cover Classification with Sentinel-2 -> publication where a CNN achieves a classification accuracy of 98.57%
- Repos using fastai here and here
- evolved_channel_selection -> explores the trade off between mixed resolutions and whether to use a channel at all, with repo
- RGB version available as a dataset in pytorch, with the 13 band version in torchgeo. Check out the tutorial on data augmentation with this dataset
- RGB and 13 band versions in tensorflow
Land use classification dataset with 38 classes and 800 RGB JPG images for each class
- https://sites.google.com/view/zhouwx/dataset?authuser=0
- Publication: PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval
- https://captain-whu.github.io/GID/
- a large-scale classification set and a fine land-cover classification set
A large-scale benchmark dataset containing a million instances for RS scene classification, with 51 scene categories organized hierarchically
- https://captain-whu.github.io/DiRS/
- Pretrained models
- Also see AID, AID-Multilabel-Dataset & DFC15-multilabel-dataset
A large-scale benchmark dataset for object detection in optical remote sensing images, which consists of 23,463 images and 192,518 object instances annotated with horizontal bounding boxes
- https://gcheng-nwpu.github.io/
- https://arxiv.org/abs/1909.00133
- ors-detection -> Object Detection on the DIOR dataset using YOLOv3
- dior_detect -> benchmarks for object detection on DIOR dataset
- Tools -> for dealing with the DIOR
MultiScene dataset aims at two tasks: Developing algorithms for multi-scene recognition & Network learning with noisy labels
A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery
- arXiv paper
- Download at gaofen-challenge.com
- 2020Gaofen -> 2020 Gaofen Challenge data, baselines, and metrics
A Large-Scale Benchmark and Challenges for Object Detection in Aerial Images. Segmentation annotations available in iSAID dataset
- https://captain-whu.github.io/DOTA/index.html
- DOTA_devkit for loading dataset
- Arxiv paper
- Pretrained models in mmrotate
- DOTA2VOCtools -> dataset split and transform to voc format
- dotatron -> 2021 Learning to Understand Aerial Images Challenge on DOTA dataset
A Large-scale Dataset for Instance Segmentation in Aerial Images
- https://captain-whu.github.io/iSAID/dataset.html
- Uses images from the DOTA dataset
- Object Detection in Aerial Imagery -> shows the performance of two-stage, one-stage and attention based object detectors on the iSAID dataset
- https://www.kaggle.com/datasets/guofeng/hrsc2016
- Pretrained models in mmrotate
- Rotation-RetinaNet-PyTorch
A dataset for tiny ship detection under medium-resolution remote sensing images. Annotations in bounding box format
- Hosted on Nucleus
2966 non-overlapped 224×224 slices are collected with 7835 aircraft targets
A fine-grained object detection dataset with 60 object classes along an ontology of 8 class types. Over 1,000,000 objects across over 1,400 km^2 of 0.3m resolution imagery. Annotations in bounding box format
- Official website
- arXiv paper.
- paperswithcode
- Satellite_Imagery_Detection_YOLOV7 -> YOLOV7 applied to xView1
Annotated high-resolution satellite imagery for building damage assessment, precise segmentation masks and damage labels on a four-level spectrum, 0.3m resolution imagery
- Official website
- arXiv paper
- paperswithcode
- xView2_baseline -> baseline solution in tensorflow
- metadamagenet -> pytorch solution
- U-Net models from michal2409
- DAHiTra -> code for 2022 paper: Large-scale Building Damage Assessment using a Novel Hierarchical Transformer Architecture on Satellite Images. Uses xView2 xBD dataset
Detecting dark vessels engaged in illegal, unreported, and unregulated (IUU) fishing activities on synthetic aperture radar (SAR) imagery. With human and algorithm annotated instances of vessels and fixed infrastructure across 43,200,000 km^2 of Sentinel-1 imagery, this multi-modal dataset enables algorithms to detect and classify dark vessels
- Official website
- arXiv paper
- Github -> all reference code, dataset processing utilities, and winning model codes + weights
- paperswithcode
- xview3_ship_detection
Vehicle Detection in Aerial Imagery. Bounding box annotations
Large set of annotated cars from overhead. Established baseline for object detection and counting tasks. Annotations in bounding box format
- http://gdo152.ucllnl.org/cowc/
- https://github.com/LLNL/cowc
- Detecting cars from aerial imagery for the NATO Innovation Challenge
The mean size of objects in AI-TOD is about 12.8 pixels, which is much smaller than other datasets. Annotations in bounding box format
- https://github.com/jwwangchn/AI-TOD
- NWD -> code for 2021 paper: A Normalized Gaussian Wasserstein Distance for Tiny Object Detection. Uses AI-TOD dataset
- AI-TOD-v2 -> a meticulous relabelling of the v1 dataset
- RarePlanes -> incorporates both real and synthetically generated satellite imagery including aircraft. Read the arxiv paper and check out this repo. Note the dataset is available through the AWS Open-Data Program for free download
- Understanding the RarePlanes Dataset and Building an Aircraft Detection Model -> blog post
- Read this article from NVIDIA which discusses fine tuning a model pre-trained on synthetic data (Rareplanes) with 10% real data, then pruning the model to reduce its size, before quantizing the model to improve inference speed
- yoltv4 includes examples on the RarePlanes dataset
- rareplanes-yolov5 -> using YOLOv5 and the RarePlanes dataset to detect and classify sub-characteristics of aircraft, with article
A Large-scale Dataset for Remote Sensing Object Counting and A Benchmark Method
Public dataset for roof segmentation from very-high-resolution aerial imagery (7.5cm). Covers almost the full area of Christchurch, the largest city in the South Island of New Zealand.
- On Kaggle
- Rooftop-Instance-Segmentation -> VGG-16, Instance Segmentation, uses the Airs dataset
RGB GeoTIFF at spatial resolution of 0.3 m. Data covering Austin, Chicago, Kitsap County, Western & Eastern Tyrol, Innsbruck, San Francisco & Vienna
- https://project.inria.fr/aerialimagelabeling/contest/
- SemSegBuildings -> Project using fast.ai framework for semantic segmentation on Inria building segmentation dataset
- UNet_keras_for_RSimage -> keras code for binary semantic segmentation
300x300 pixel RGB images with annotations in COCO format. Imagery appears to be global but with significant fraction from North America
- Dataset released as part of the mapping-challenge
- Winning solution published by neptune.ai here, achieved precision 0.943 and recall 0.954 using Unet with Resnet.
- mappingchallenge -> YOLOv5 applied to the AICrowd Mapping Challenge dataset
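
The winning solution above is reported as a precision and recall pair. For comparing it with results that quote a single number, the two can be combined into an F1 score (the harmonic mean). A minimal sketch, where `f1_score` is a local helper written for this example rather than a library call:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall quoted for the winning solution
f1 = f1_score(0.943, 0.954)
print(f"F1 = {f1:.3f}")
```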
BONAI (Buildings in Off-Nadir Aerial Images) is a dataset for building footprint extraction (BFE) in off-nadir aerial images
- https://justchenhao.github.io/LEVIR/
- FCCDN_pytorch -> pytorch implementation of FCCDN for the change detection task
- RSICC -> the Remote Sensing Image Change Captioning dataset uses LEVIR-CD imagery
- Onera Satellite Change Detection Dataset comprises 24 pairs of multispectral images taken from the Sentinel-2 satellites between 2015 and 2018
- Website
- change_detection_onera_baselines -> Siamese version of U-Net baseline model
- Urban Change Detection for Multispectral Earth Observation Using Convolutional Neural Networks -> with paper
- DS_UNet -> code for 2021 paper: Sentinel-1 and Sentinel-2 Data Fusion for Urban Change Detection using a Dual Stream U-Net, uses Onera Satellite Change Detection dataset
- ChangeDetection_wOnera
- https://captain-whu.github.io/SCD/
- Change detection at the pixel level
Semantic segmentation dataset. 38 patches of 6000x6000 pixels, each consisting of a true orthophoto (TOP) extracted from a larger TOP mosaic, and a DSM. Resolution 5 cm
SpaceNet is a series of competitions with datasets and utilities provided. The challenges covered are: (1 & 2) building segmentation, (3) road segmentation, (4) off-nadir buildings, (5) road network extraction, (6) multi-sensor mapping, (7) multi-temporal urban change, (8) Flood Detection Challenge Using Multiclass Segmentation
- spacenet.ai is an online hub for data, challenges, algorithms, and tools
- The SpaceNet 7 Multi-Temporal Urban Development Challenge: Dataset Release
- spacenet-three-topcoder solution
- official utilities -> Packages intended to assist in the preprocessing of SpaceNet satellite imagery dataset to a format that is consumable by machine learning algorithms
- andraugust spacenet-utils -> Display geotiff image with building-polygon overlay & label buildings using kNN on the pixel spectra
- Spacenet-Building-Detection -> uses keras and Spacenet 1 dataset
- Spacenet 8 winners blog post
Nearly 10,000 km² of free high-resolution satellite imagery of unique locations which ensure stratified representation of all types of land-use across the world: from agriculture to ice caps, from forests to multiple urbanization densities.
- https://github.com/worldstrat/worldstrat
- Quick tour of the WorldStrat Dataset
- Each high-resolution image (1.5 m/pixel) comes with multiple temporally-matched low-resolution images from the freely accessible lower-resolution Sentinel-2 satellites (10 m/pixel)
- Several super-resolution benchmark models trained on it
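
The resolutions quoted above (1.5 m/pixel high-resolution, 10 m/pixel Sentinel-2) imply the upscaling factor a super-resolution model has to bridge. A small sketch of that arithmetic; the 150-pixel patch size is an arbitrary example, not a property of the dataset:

```python
HIGH_RES_M = 1.5   # high-resolution imagery, metres per pixel
LOW_RES_M = 10.0   # Sentinel-2 imagery, metres per pixel

# Linear upscaling factor and the number of high-res pixels per low-res pixel
scale_factor = LOW_RES_M / HIGH_RES_M
hr_pixels_per_lr_pixel = scale_factor ** 2

def high_res_side(low_res_side_px: int) -> int:
    """High-res pixels per side covering the same ground as a low-res patch."""
    return round(low_res_side_px * scale_factor)

print(f"scale factor: {scale_factor:.2f}")
print(f"high-res pixels per low-res pixel: {hr_pixels_per_lr_pixel:.1f}")
print(f"150 px low-res patch -> {high_res_side(150)} px high-res patch")
```

The factor is not an integer, so pipelines built around fixed x2 or x4 upsampling layers would need an extra resampling step.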
A Large-Scale, Multi-Task Dataset for Remote Sensing Image Understanding. Annotates all modalities (classification, segmentation, object detection etc)
- Website
- Dataset release in January 2023
- https://ignf.github.io/FLAIR/
- The FLAIR-one semantic segmentation dataset consists of 77,412 high resolution patches (512x512 at 0.2 m spatial resolution) with 19 semantic classes
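
The patch count, patch size and resolution quoted above give a quick estimate of the ground area covered by FLAIR-one. This sketch assumes the patches do not overlap, which the text above does not state:

```python
N_PATCHES = 77_412
PATCH_SIDE_PX = 512
RESOLUTION_M = 0.2  # metres per pixel

patch_side_m = PATCH_SIDE_PX * RESOLUTION_M
patch_area_m2 = patch_side_m ** 2
# Total area under the no-overlap assumption, converted to square kilometres
total_area_km2 = N_PATCHES * patch_area_m2 / 1e6

print(f"patch side: {patch_side_m:.1f} m")
print(f"estimated coverage: {total_area_km2:.0f} km^2")
```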
RF100 is compiled from 100 real world datasets that straddle a range of domains. The aim is that performance evaluation on this dataset will enable a more nuanced guide of how a model will perform in different domains. Contains 10k aerial images
- resisc45 -> RESISC45 dataset is a publicly available benchmark for Remote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This dataset contains 31,500 images, covering 45 scene classes with 700 images in each class.
- eurosat -> EuroSAT dataset is based on Sentinel-2 satellite images covering 13 spectral bands and consisting of 10 classes with 27000 labeled and geo-referenced samples.
- BigEarthNet -> a large-scale Sentinel-2 land use classification dataset, consisting of 590,326 Sentinel-2 image patches. The image patch size on the ground is 1.2 x 1.2 km with variable image size depending on the channel resolution. This is a multi-label dataset with 43 imbalanced labels. Official website includes version of the dataset with Sentinel 1 & 2 chips
- so2sat -> a dataset consisting of co-registered synthetic aperture radar and multispectral optical image patches acquired by Sentinel 1 & 2
- US Building Footprints -> building footprints in all 50 US states, GeoJSON format, generated using semantic segmentation. Also Australia, Canadian, Uganda-Tanzania, Kenya-Nigeria and GlobalMLBuildingFootprints are available. Check out RasterizingBuildingFootprints to convert vector shapefiles to raster layers
- Microsoft Planetary Computer is a Dask-Gateway enabled JupyterHub deployment focused on supporting scalable geospatial analysis, source repo
- landcover-orinoquia -> Land cover mapping of the Orinoquía region in Colombia, in collaboration with Wildlife Conservation Society Colombia. An #AIforEarth project
- RoadDetections dataset by Microsoft
- open-buildings -> A dataset of building footprints to support social good applications covering 64% of the African continent. Read Mapping Africa’s Buildings with Satellite Imagery
Since there is a whole community around GEE I will not reproduce it here but list very select references. Get started at https://developers.google.com/earth-engine/
- Various imagery and climate datasets, including Landsat & Sentinel imagery
- Supports large scale processing with classical algorithms, e.g. clustering for land use. For deep learning, you export datasets from GEE as tfrecords, train on your preferred GPU platform, then upload inference results back to GEE
- awesome-google-earth-engine
- Awesome-GEE
- awesome-earth-engine-apps
- How to Use Google Earth Engine and Python API to Export Images to Roboflow -> to acquire training data
- ee-fastapi is a simple FastAPI web application for performing flood detection using Google Earth Engine in the backend.
- How to Download High-Resolution Satellite Data for Anywhere on Earth
- wxee -> Export data from GEE to xarray using wxee then train with pytorch or tensorflow models. Useful since GEE only supports tfrecord export natively
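
The deep learning workflow described above starts with exporting imagery from GEE as tfrecords. A hedged sketch of that first step using the `earthengine-api` package: the helper names, the export description, the 10 m scale and the 256-pixel patch size are all assumptions made for illustration, and `start_export` needs an authenticated Earth Engine session to do anything.

```python
def build_tfrecord_export_kwargs(description, region, scale_m=10, patch_px=256):
    """Keyword arguments for ee.batch.Export.image.toDrive with TFRecord output."""
    return {
        "description": description,
        "region": region,
        "scale": scale_m,
        "fileFormat": "TFRecord",
        "formatOptions": {
            "patchDimensions": [patch_px, patch_px],
            "compressed": True,
        },
    }

def start_export(image, **kwargs):
    """Start the export task (requires earthengine-api and prior authentication)."""
    import ee  # imported lazily so the helper above runs without Earth Engine installed
    task = ee.batch.Export.image.toDrive(image=image, **kwargs)
    task.start()
    return task

# region=None is a placeholder; pass an ee.Geometry in real use
kwargs = build_tfrecord_export_kwargs("s2_training_patches", region=None)
print(kwargs["fileFormat"], kwargs["formatOptions"]["patchDimensions"])
```

Keeping the export settings in a plain dictionary makes them easy to inspect or log before any task is submitted.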
- RSICD -> 10921 images with five sentences descriptions per image. Used in Fine tuning CLIP with Remote Sensing (Satellite) images and captions, models at this repo
- RSICC -> the Remote Sensing Image Change Captioning dataset contains 10077 pairs of bi-temporal remote sensing images and 50385 sentences describing the differences between images. Uses LEVIR-CD imagery
- NASA (make request and emailed when ready) -> https://search.earthdata.nasa.gov
- NOAA (requires BigQuery) -> https://www.kaggle.com/noaa/goes16/home
- Time series weather data for several US cities -> https://www.kaggle.com/selfishgene/historical-hourly-weather-data
- DeepWeather -> improve weather forecasting accuracy by analyzing satellite images
- Planet-CR -> A Multi-Modal and Multi-Resolution Dataset for Cloud Removal in High Resolution Optical Remote Sensing Imagery, 3m resolution, with paper
- The Azavea Cloud Dataset which is used to train this cloud-model
- Sentinel-2 Cloud Cover Segmentation Dataset on Radiant mlhub
- cloudsen12 -> see video
- HRC_WHU -> High-Resolution Cloud Detection Dataset comprising 150 RGB images and a resolution varying from 0.5 to 15 m in different global regions
- AIR-CD -> a challenging cloud detection data set called AIR-CD, with higher spatial resolution and more representative landcover types
- Landsat 8 Cloud Cover Assessment Validation Data
- awesome-forests -> A curated list of ground-truth forest datasets for the machine learning and forestry community
- ReforesTree -> A dataset for estimating tropical forest biomass based on drone and field data
- yosemite-tree-dataset -> a benchmark dataset for tree counting from aerial images
- Amazon Rainforest dataset for semantic segmentation -> Sentinel 2 images. Used in An attention-based U-Net for detecting deforestation within satellite sensor imagery
- Amazon and Atlantic Forest image datasets for semantic segmentation -> Sentinel 2 images. Used in An attention-based U-Net for detecting deforestation within satellite sensor imagery
- Resource Watch provides a wide range of geospatial datasets and a UI to visualise them
- BreizhCrops -> A Time Series Dataset for Crop Type Mapping
- The SeCo dataset contains image patches from Sentinel-2 tiles captured at different timestamps at each geographical location. Download SeCo here
- SYSU-CD -> The dataset contains 20000 pairs of 0.5-m aerial images of size 256×256 taken between the years 2007 and 2014 in Hong Kong
- Shuttle Radar Topography Mission, search online at usgs.gov
- Copernicus Digital Elevation Model (DEM) on S3, represents the surface of the Earth including buildings, infrastructure and vegetation. Data is provided as Cloud Optimized GeoTIFFs. link
- Awesome-DEM
- Many on https://www.visualdata.io
- AU-AIR dataset -> a multi-modal UAV dataset for object detection.
- ERA -> A Dataset and Deep Learning Benchmark for Event Recognition in Aerial Videos.
- Aerial Maritime Drone Dataset -> bounding boxes
- RetinaNet for pedestrian detection -> bounding boxes
- Dataset of thermal and visible aerial images for multi-modal and multi-spectral image registration and fusion -> The dataset consists of 30 visible images and their metadata, 80 thermal images and their metadata, and a visible georeferenced orthoimage.
- BIRDSAI: A Dataset for Detection and Tracking in Aerial Thermal Infrared Videos -> Thermal IR videos of humans and animals. With Github repo
- DroneVehicle -> Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning. Annotations are rotated bounding boxes. With Github repo
- UAVOD10 -> 10 classes of objects at 15 cm resolution. Classes are: building, ship, vehicle, prefabricated house, well, cable tower, pool, landslide, cultivation mesh cage, and quarry. Bounding boxes
- Busy-parking-lot-dataset---vehicle-detection-in-UAV-video -> Vehicle instance segmentation. Annotation format is unclear, possibly Matlab specific
- dd-ml-segmentation-benchmark -> DroneDeploy Machine Learning Segmentation Benchmark
- SeaDronesSee -> Vision Benchmark for Maritime Search and Rescue. Bounding box object detection, single-object tracking and multi-object tracking annotations
- aeroscapes -> semantic segmentation benchmark comprising images captured using a commercial drone from an altitude range of 5 to 50 metres.
- ALTO -> Aerial-view Large-scale Terrain-Oriented. For deep learning based UAV visual place recognition and localization tasks.
- HIT-UAV-Infrared-Thermal-Dataset -> A High-altitude Infrared Thermal Object Detection Dataset for Unmanned Aerial Vehicles
- land-use-land-cover-datasets
- EORSSD-dataset -> Extended Optical Remote Sensing Saliency Detection (EORSSD) Dataset
- RSD46-WHU -> 46 scene classes for image classification, free for education, research and commercial use
- RSOD-Dataset -> dataset for object detection in PASCAL VOC format. Aircraft, playgrounds, overpasses & oiltanks
- VHR-10_dataset_coco -> Object detection and instance segmentation dataset based on NWPU VHR-10 dataset. RGB & SAR
- HRSID -> high resolution sar images dataset for ship detection, semantic segmentation, and instance segmentation tasks
- MAR20 -> Military Aircraft Recognition dataset
- RSSCN7 -> Dataset of the article “Deep Learning Based Feature Selection for Remote Sensing Scene Classification”
- Sewage-Treatment-Plant-Dataset -> object detection
- TGRS-HRRSD-Dataset -> High Resolution Remote Sensing Detection (HRRSD)
- MUSIC4HA -> MUltiband Satellite Imagery for object Classification (MUSIC) to detect Hot Area
- MUSIC4GC -> MUltiband Satellite Imagery for object Classification (MUSIC) to detect Golf Course
- MUSIC4P3 -> MUltiband Satellite Imagery for object Classification (MUSIC) to detect Photovoltaic Power Plants (solar panels)
- ABCDdataset -> damage detection dataset to identify whether buildings have been washed-away by tsunami
- OGST -> Oil and Gas Tank Dataset
- LS-SSDD-v1.0-OPEN -> Large-Scale SAR Ship Detection Dataset
- S2Looking -> A Satellite Side-Looking Dataset for Building Change Detection, paper
- Zurich Summer Dataset -> Semantic segmentation of urban scenes
- AISD -> Aerial Imagery dataset for Shadow Detection
- Awesome-Remote-Sensing-Relative-Radiometric-Normalization-Datasets
- SearchAndRescueNet -> Satellite Imagery for Search And Rescue Dataset, with example Faster R-CNN model
- geonrw -> orthorectified aerial photographs, LiDAR derived digital elevation models and segmentation maps with 10 classes. With repo
- Thermal power plants dataset
- University1652-Baseline -> A Multi-view Multi-source Benchmark for Drone-based Geo-localization
- benchmark_ISPRS2021 -> A new stereo dense matching benchmark dataset for deep learning
- WHU-SEN-City -> A paired SAR-to-optical image translation dataset which covers 34 big cities of China
- SAR_vehicle_detection_dataset -> 104 SAR images for vehicle detection, collected from Sandia MiniSAR/FARAD SAR images and MSTAR images
- ERA-DATASET -> A Dataset and Deep Learning Benchmark for Event Recognition in Aerial Videos
- SSL4EO-S12 -> a large-scale dataset for self-supervised learning in Earth observation
- UBC-dataset -> a dataset for building detection and classification from very high-resolution satellite imagery with the focus on object-level interpretation of individual buildings
- AIR-CD -> a challenging cloud detection data set called AIR-CD, with higher spatial resolution and more representative landcover types
- AIR-PolSAR-Seg -> a challenging PolSAR terrain segmentation dataset
- HRC_WHU -> High-Resolution Cloud Detection Dataset comprising 150 RGB images and a resolution varying from 0.5 to 15 m in different global regions
- AeroRIT -> A New Scene for Hyperspectral Image Analysis
- Building_Dataset -> High-speed Rail Line Building Dataset Display
- Haiming-Z/MtS-WH-reference-map -> a reference map for change detection based on MtS-WH
- MtS-WH-Dataset -> Multi-temporal Scene WuHan (MtS-WH) Dataset
- Multi-modality-image-matching -> image matching dataset including several remote sensing modalities
- RID -> Roof Information Dataset for CV-Based Photovoltaic Potential Assessment. With paper
- APKLOT -> A dataset for aerial parking block segmentation
- QXS-SAROPT -> Optical and SAR pairing dataset from the paper: The QXS-SAROPT Dataset for Deep Learning in SAR-Optical Data Fusion
- SAR-ACD -> SAR-ACD consists of 4322 aircraft clips with 6 civil aircraft categories and 14 other aircraft categories
- SODA -> A large-scale Small Object Detection dataset. SODA-A comprises 2510 high-resolution images of aerial scenes, which has 800203 instances annotated with oriented rectangle box annotations over 9 classes.
- Data-CSHSI -> Open source datasets for Cross-Scene Hyperspectral Image Classification, includes Houston, Pavia & HyRank datasets
- SynthWakeSAR -> A Synthetic SAR Dataset for Deep Learning Classification of Ships at Sea, with paper
- SAR2Opt-Heterogeneous-Dataset -> SAR-optical images to be used as a benchmark in change detection and image translation on remote sensing images
- urban-tree-detection-data -> Dataset for training and evaluating tree detectors in urban environments with aerial imagery
- Landsat 8 Cloud Cover Assessment Validation Data
- Attribute-Cooperated-Classification-Datasets -> Three datasets based on AID, UCM, and Sydney. For each image, there is a label of scene classification and a label vector of attribute items.
- dynnet -> DynamicEarthNet: Daily Multi-Spectral Satellite Dataset for Semantic Change Segmentation
- open_earth_map -> a benchmark dataset for global high-resolution land cover mapping
- Satellite imagery datasets containing ships -> A list of radar and optical satellite datasets for ship detection, classification, semantic segmentation and instance segmentation tasks
- SolarDK -> A high-resolution urban solar panel image classification and localization dataset
- Roofline-Extraction -> dataset for paper 'Knowledge-Based 3D Building Reconstruction (3DBR) Using Single Aerial Images and Convolutional Neural Networks (CNNs)'
- Building-detection-and-roof-type-recognition -> datasets for the paper 'A CNN-Based Approach for Automatic Building Detection and Recognition of Roof Types Using a Single Aerial Image'
- PanCollection -> Pansharpening Datasets from WorldView 2, WorldView 3, QuickBird, Gaofen 2 sensors
- OnlyPlanes -> Synthetic dataset and pretrained models for Detectron2
- Remote Sensing Satellite Video Dataset for Super-resolution
- WHU-Stereo -> A Challenging Benchmark for Stereo Matching of High-Resolution Satellite Images
- BH-POOLS & BH-WATERTANKS -> segmentation dataset of swimming pools and water tanks in Brazil
- BrazilDAM Dataset -> a multi sensor (Landsat 8 and Sentinel 2) and multitemporal dataset that consists of multispectral images of ore tailings dams throughout Brazil
- Bridge Dataset -> 500 images each containing at least one bridge
- Brazilian Cerrado-Savanna Scenes Dataset -> 1,311 multi-spectral scenes extracted from images acquired by RapidEye, partitioned into 4 classes: Agriculture, Arboreal, Herbaceous and Shrubby Vegetation
- Brazilian Coffee Scenes Dataset
- FireRisk -> A Remote Sensing Dataset for Fire Risk Assessment with Benchmarks Using Supervised and Self-supervised Learning
- Road-Change-Detection-Dataset
- 3DCD -> infer 3D CD maps using only remote sensing optical bitemporal images as input without the need of Digital Elevation Models (DEMs)
- Hyperspectral Change Detection Dataset Irrigated Agricultural Area
- CNN-RNN-Yield-Prediction -> soybean dataset
Kaggle hosts over 200 satellite image datasets, search results here. The kaggle blog is an interesting read.
- https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data
- 3-5 meter resolution GeoTIFF images from planet Dove satellite constellation
- 12 classes including - cloudy, primary + waterway etc
- 1st place winner interview - used 11 custom CNNs
- FastAI Multi-label image classification
- Multi-Label Classification of Satellite Photos of the Amazon Rainforest
- Understanding the Amazon Rainforest with Multi-Label Classification + VGG-19, Inceptionv3, AlexNet & Transfer Learning
- amazon-classifier -> compares random forest with CNN
- multilabel-classification -> compares various CNN architectures
- Planet-Amazon-Kaggle -> uses fast.ai
- deforestation_deep_learning
- Track-Human-Footprint-in-Amazon-using-Deep-Learning
- Amazon-Rainforest-CNN -> uses a 3-layer CNN in Tensorflow
- rainforest-tagging -> Convolutional Neural Net and Recurrent Neural Net in Tensorflow for satellite images multi-label classification
- satellite-deforestation -> Using Satellite Imagery to Identify the Leading Indicators of Deforestation, applied to the Kaggle Challenge Understanding the Amazon from Space
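Because each image in this competition can carry several labels at once (e.g. primary + waterway), most of the solutions above start by turning the labels into a multi-hot vector. A minimal sketch, assuming the labels for an image arrive as one space-separated string; the `VOCAB` list is a hypothetical subset used only for illustration, not the full label set:

```python
def encode_tags(tag_string, vocabulary):
    """Return a multi-hot list: 1 where the image carries that label, else 0."""
    present = set(tag_string.split())
    unknown = present - set(vocabulary)
    if unknown:
        raise ValueError(f"labels not in vocabulary: {sorted(unknown)}")
    return [1 if label in present else 0 for label in vocabulary]


# Hypothetical subset of labels, for illustration only.
VOCAB = ["cloudy", "primary", "waterway", "road"]

print(encode_tags("primary waterway", VOCAB))  # [0, 1, 1, 0]
```

A vector like this pairs naturally with a sigmoid output layer and a per-label binary loss, rather than a softmax over classes.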
- https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection
- Rating - medium, many good examples (see the Discussion as well as kernels), but as this competition was run a couple of years ago many examples use python 2
- WorldView 3 - 45 satellite images, each covering 1km x 1km, provided as both 3-band (i.e. RGB) and 16-band (400nm - SWIR) images
- 10 Labelled classes include - Buildings, Road, Trees, Crops, Waterway, Vehicles
- Interview with 1st place winner who used segmentation networks - 40+ models, each tweaked for particular target (e.g. roads, trees)
- ZF_UNET_224_Pretrained_Model -> 2nd place solution
- 3rd place solution -> which explored pansharpening & calculating reflectance indices, with arxiv paper
- Deepsense 4th place solution
- Entry by lopuhin using UNet with batch-normalization
- Multi-class semantic segmentation of satellite images using U-Net on the DSTL dataset, with tensorflow 1 & python 2.7. Accompanying article
- Deep-Satellite-Image-Segmentation
- Dstl-Satellite-Imagery-Feature-Detection-Improved
- Satellite-imagery-feature-detection
- Satellite_Image_Classification -> using XGBoost and ensemble classification methods
- Unet-for-Satellite
- building-segmentation -> TensorFlow U-Net implementation trained to segment buildings in satellite imagery
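Several of the DSTL entries above compute reflectance indices from the multispectral bands. The best known is NDVI, defined as (NIR - Red) / (NIR + Red). A minimal sketch in plain Python, assuming the two bands have already been read into equally sized 2D lists of reflectance values; how the bands are loaded and which band index is NIR are left out because they depend on the product:

```python
def ndvi_pixel(nir, red):
    """NDVI for one pixel; returns 0.0 when both bands are zero."""
    denom = nir + red
    if denom == 0:
        return 0.0
    return (nir - red) / denom


def ndvi_image(nir_band, red_band):
    """Apply NDVI element-wise to two equally sized 2D lists."""
    return [
        [ndvi_pixel(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Toy 1x2 "image": a vegetated pixel and an empty pixel.
result = ndvi_image([[0.5, 0.0]], [[0.1, 0.0]])
```

Values close to 1 indicate dense vegetation, values near 0 or below indicate bare ground, water or built-up surfaces.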
- https://www.kaggle.com/datasets/crawford/deepsat-sat4 & https://www.kaggle.com/datasets/crawford/deepsat-sat6
- DeepSat-Kaggle -> uses Julia
- deepsat-aws-emr-pyspark -> Using PySpark for Image Classification on Satellite Imagery of Agricultural Terrains
- https://www.kaggle.com/c/airbus-ship-detection/overview
- Rating - medium, most solutions using deep-learning, many kernels, good example kernel
- I believe there was a problem with this dataset, which led to many complaints that the competition was ruined
- Deep Learning for Ship Detection and Segmentation -> treated as instance segmentation problem, with notebook
- Lessons Learned from Kaggle’s Airbus Challenge
- Airbus-Ship-Detection -> This solution ranked 139th out of 884 in the competition, and combines a ResNeXt50 based classifier with a U-net segmentation model
- Ship-Detection-Project -> uses Mask R-CNN and UNet model
- Airbus_SDC
- Airbus_SDC_dup -> Project focused on detecting duplicate regions of overlapping satellite imagery. Applied to Airbus ship detection dataset
- airbus-ship-detection -> CNN with REST API
- Ship-Detection-from-Satellite-Images-using-YOLOV4 -> uses Kaggle Airbus Ship Detection dataset
- Image Segmentation: Kaggle experience -> Medium article by gold medal winner Vlad Shmyhlo
- https://www.kaggle.com/rhammell/ships-in-satellite-imagery -> Classify ships in San Francisco Bay using Planet satellite imagery
- 4000 80x80 RGB images labeled with either a "ship" or "no-ship" classification, 3 meter pixel size
- shipsnet-detector -> Detect container ships in Planet imagery using machine learning
- https://www.kaggle.com/tomluther/ships-in-google-earth
- 794 jpegs showing various sized ships in satellite imagery, annotations in Pascal VOC format for object detection models
- kaggle-ships-in-Google-Earth-yolov5
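The ships-in-google-earth annotations are in Pascal VOC format, which stores one XML file per image with an `object` element per bounding box. A minimal sketch of reading such a file with the standard library; the inline `SAMPLE_XML` is a made-up example and not taken from the dataset:

```python
import xml.etree.ElementTree as ET

# Made-up annotation, for illustration only.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <object>
    <name>boat</name>
    <bndbox>
      <xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>80</ymax>
    </bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), coords))
    return boxes


print(parse_voc(SAMPLE_XML))  # [('boat', (10, 20, 110, 80))]
```

Detectors in the YOLO family expect normalised centre/width/height boxes instead, so a conversion step usually follows this parsing.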
- https://www.kaggle.com/datasets/rhammell/ships-in-satellite-imagery
- 4000 80x80 RGB images labeled with either a "ship" or "no-ship" classification, provided by Planet
- DeepLearningShipDetection
- Ship-Detection-Using-Satellite-Imagery
- https://www.kaggle.com/kbhartiya83/swimming-pool-and-car-detection
- 3750 satellite images of residential areas with annotation data for swimming pools and cars
- Object detection on Satellite Imagery using RetinaNet
- https://www.kaggle.com/rhammell/planesnet -> Detect aircraft in Planet satellite image chips
- 20x20 RGB images, the "plane" class includes 8000 images and the "no-plane" class includes 24000 images
- Dataset repo and planesnet-detector demonstrates a small CNN classifier on this dataset
- ergo-planes-detector -> An ergo based project that relies on a convolutional neural network to detect airplanes from satellite imagery, uses the PlanesNet dataset
- Using AWS SageMaker/PlanesNet to process Satellite Imagery
- Airplane-in-Planet-Image -> pytorch model
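With 8000 "plane" and 24000 "no-plane" images, PlanesNet is imbalanced 1:3, so a classifier can reach 75% accuracy by always predicting "no-plane". One common mitigation is to weight the loss inversely to class frequency. A minimal sketch of that arithmetic, using the same `total / (n_classes * count)` heuristic as scikit-learn's "balanced" class weights; whether the repos above use it is not stated:

```python
def balanced_class_weights(counts):
    """Weight each class by total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Class counts as given in the dataset description above.
weights = balanced_class_weights({"plane": 8000, "no-plane": 24000})
print(weights["plane"])  # 2.0
```

The "no-plane" weight comes out at 2/3, so each rare "plane" example counts three times as much as a "no-plane" example in the loss.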
- https://www.kaggle.com/datasets/aceofspades914/cgi-planes-in-satellite-imagery-w-bboxes
- 500 computer generated satellite images of planes
- Faster RCNN to detect airplanes
- aircraft-detection-from-satellite-images-yolov3
- https://www.kaggle.com/c/draper-satellite-image-chronology/data
- Rating - hard. Not many useful kernels.
- Images are grouped into sets of five, each of which have the same setId. Each image in a set was taken on a different day (but not necessarily at the same time each day). The images for each set cover approximately the same area but are not exactly aligned.
- Kaggle interviews for entrants who used XGBoost and a hybrid human/ML approach
- deep-cnn-sat-image-time-series -> uses LSTM
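Since the Draper images come in sets of five sharing a setId, a first step is usually to group the files by set before attempting any ordering. A minimal sketch, assuming filenames of the form `set<setId>_<index>.<ext>`; that naming pattern is an assumption for illustration and should be checked against the downloaded data:

```python
import re
from collections import defaultdict

# Assumed filename pattern, e.g. "set12_3.jpeg" -> setId 12, image 3.
PATTERN = re.compile(r"set(\d+)_(\d+)\.\w+$")


def group_by_set(filenames):
    """Map each setId to the sorted list of filenames belonging to it."""
    groups = defaultdict(list)
    for name in filenames:
        match = PATTERN.search(name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        groups[int(match.group(1))].append(name)
    return {set_id: sorted(names) for set_id, names in groups.items()}


groups = group_by_set(["set2_1.jpeg", "set1_2.jpeg", "set1_1.jpeg", "notes.txt"])
```

Because the images in a set are not exactly aligned, any per-set comparison would normally follow an image registration step.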
- https://www.kaggle.com/humansintheloop/semantic-segmentation-of-aerial-imagery
- 72 satellite images of Dubai, UAE, segmented into 6 classes
- dubai-satellite-imagery-segmentation -> due to the small dataset, image augmentation was used
- U-Net for Semantic Segmentation on Unbalanced Aerial Imagery -> using the Dubai dataset
- Multiclass-semantic-segmentation-in-satallite-images -> uses keras
- Semantic-Segmentation-using-U-Net -> uses keras
- unet_satelite_image_segmentation
- https://www.kaggle.com/datasets/balraj98/massachusetts-roads-dataset
- https://www.kaggle.com/datasets/balraj98/massachusetts-buildings-dataset
- Official published dataset
- Road_seg_dataset -> subset of the roads dataset containing only 200 images and masks
- Road and Building Semantic Segmentation in Satellite Imagery -> uses U-Net on the Massachusetts Roads Dataset & keras
- Semantic-segmentation repo by fuweifu-vtoo -> uses pytorch and the Massachusetts Buildings & Roads Datasets
- ssai-cnn -> This is an implementation of Volodymyr Mnih's dissertation methods on his Massachusetts road & building dataset
- building-footprint-segmentation -> pip installable library to train building footprint segmentation on satellite and aerial imagery, applied to Massachusetts Buildings Dataset and Inria Aerial Image Labeling Dataset
- Road detection using semantic segmentation and albumentations for data augmentation using the Massachusetts Roads Dataset, U-net & Keras
- Image-Segmentation -> using Massachusetts Road dataset and fast.ai
Not satellite but airborne imagery. Each sample is an image patch size-normalized to 28x28 pixels and consists of 4 bands - red, green, blue and near infrared. The training and test labels are one-hot encoded 1x6 vectors. Data is in .mat Matlab format. JPEG?
- Sat4 500,000 image patches covering four broad land cover classes - barren land, trees, grassland and a class that consists of all land cover classes other than the above three
- Sat6 405,000 image patches each of size 28x28 and covering 6 landcover classes - barren land, trees, grassland, roads, buildings and water bodies.
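The one-hot labels described above need to be mapped back to class names for reporting. A minimal sketch for the six Sat6 classes; the class order in `SAT6_CLASSES` simply follows the list above and is an assumption, so confirm it against the annotation file shipped with the dataset:

```python
# Class order is assumed for illustration; verify against the dataset's
# own annotation file before relying on it.
SAT6_CLASSES = [
    "barren land", "trees", "grassland", "roads", "buildings", "water bodies",
]


def decode_one_hot(vector, class_names):
    """Return the class name selected by a one-hot label vector."""
    values = list(vector)
    if len(values) != len(class_names):
        raise ValueError("label length does not match number of classes")
    if sorted(values) != [0] * (len(values) - 1) + [1]:
        raise ValueError("not a valid one-hot vector")
    return class_names[values.index(1)]


print(decode_one_hot([0, 0, 1, 0, 0, 0], SAT6_CLASSES))  # grassland
```

The validity check is deliberately strict so that a transposed or mis-sliced label array fails early rather than silently decoding to the wrong class.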
- https://www.kaggle.com/guofeng/hrsc2016
- Ship images harvested from Google Earth
- HRSC2016_SOTA -> Fair comparison of different algorithms on the HRSC2016 dataset
- https://www.kaggle.com/datasets/lilitopia/swimship-wake-imagery-mass
- An optical ship wake detection benchmark dataset built for deep learning
- WakeNet -> A CNN-based optical image ship wake detector, code for 2021 paper: Rethinking Automatic Ship Wake Detection: State-of-the-Art CNN-based Wake Detection via Optical Images
In this challenge, you will build a model to classify cloud organization patterns from satellite images.
- https://www.kaggle.com/c/understanding_cloud_organization/
- 3rd place solution on Github by naivelamb
- 15th place solution on Github by Soongja
- 69th place solution on Github by yukkyo
- 161st place solution on Github by michal-nahlik
- Solution by yurayli
- Solution by HazelMartindale uses 3 versions of U-net architecture
- Solution by khornlund
- Solution by Diyago
- Solution by tanishqgautam
- https://www.kaggle.com/datasets/sorour/38cloud-cloud-segmentation-in-satellite-images
- Contains 38 Landsat 8 images and manually extracted pixel-level ground truths
- 38-Cloud Github repository and follow up 95-Cloud dataset
- How to create a custom Dataset / Loader in PyTorch, from Scratch, for multi-band Satellite Images Dataset from Kaggle
- Cloud-Net: A semantic segmentation CNN for cloud detection -> an end-to-end cloud detection algorithm for Landsat 8 imagery, trained on 38-Cloud Training Set
- Segmentation of Clouds in Satellite Images Using Deep Learning -> semantic segmentation using a Unet on the Kaggle 38-Cloud dataset
- https://www.kaggle.com/airbusgeo/airbus-aircrafts-sample-dataset
- One hundred civilian airports and over 3000 annotated commercial aircraft
- detecting-aircrafts-on-airbus-pleiades-imagery-with-yolov5
- pytorch-remote-sensing -> Aircraft detection using the 'Airbus Aircraft Detection' dataset and Faster-RCNN with ResNet-50 backbone in pytorch
- https://www.kaggle.com/airbusgeo/airbus-oil-storage-detection-dataset
- Oil-Storage Tank Instance Segmentation with Mask R-CNN with accompanying article
- Oil Storage Detection on Airbus Imagery with YOLOX -> uses the Kaggle Airbus Oil Storage Detection dataset
- Oil-Storage-Tanks-Data-Preparation-YOLO-Format
- https://www.kaggle.com/kmader/satellite-images-of-hurricane-damage
- https://github.com/dbuscombe-usgs/HurricaneHarvey_buildingdamage
- https://www.kaggle.com/franchenstein/austin-zoning-satellite-images
- classify images of Austin into one of its zones, such as residential, industrial, etc. 3667 satellite images
Classify the target in a SAR image chip as either a ship or an iceberg. The dataset for the competition included 5000 images extracted from multichannel SAR data collected by the Sentinel-1 satellite. Top entries used ensembles to boost prediction accuracy from about 92% to 97%.
- https://www.kaggle.com/c/statoil-iceberg-classifier-challenge/data
- An interview with David Austin: 1st place winner
- Deep Learning for Iceberg detection in Satellite Images
- radar-image-recognition
- Iceberg-Classification-Using-Deep-Learning -> uses keras
- Deep-Learning-Project -> uses keras
- iceberg-classifier-challenge solution by ShehabSunny -> uses keras
- Analyzing Satellite Radar Imagery with Deep Learning -> by Matlab, uses ensemble with greedy search
- 16th place solution
- fastai solution
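The note above that top entries used ensembles can be illustrated with the simplest form of ensembling: averaging the predicted probabilities of several models. A minimal sketch with made-up numbers; the actual winning ensembles were more elaborate (e.g. the greedy search mentioned above), and the function names here are hypothetical:

```python
def average_ensemble(per_model_probs):
    """Average per-sample probabilities across models.

    per_model_probs: one list of iceberg probabilities per model,
    all of the same length (one entry per image chip).
    """
    if not per_model_probs:
        raise ValueError("need at least one model")
    n_models = len(per_model_probs)
    return [sum(column) / n_models for column in zip(*per_model_probs)]


def to_labels(probs, threshold=0.5):
    """Threshold averaged probabilities into ship / iceberg labels."""
    return ["iceberg" if p >= threshold else "ship" for p in probs]


# Made-up predictions from three models on two chips.
probs = average_ensemble([[0.9, 0.2], [0.4, 0.4], [0.8, 0.3]])
labels = to_labels(probs)
```

In this toy case the second model alone would call the first chip a ship (0.4), but the averaged probability of about 0.7 labels it an iceberg.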
- https://www.kaggle.com/balraj98/deepglobe-land-cover-classification-dataset
- Satellite Imagery Semantic Segmentation with CNN -> 7 different segmentation classes, DeepGlobe Land Cover Classification Challenge dataset, with repo
- Land Cover Classification with U-Net -> Satellite Image Multi-Class Semantic Segmentation Task with PyTorch Implementation of U-Net, uses DeepGlobe Land Cover Segmentation dataset, with code
- DeepGlobe Land Cover Classification Challenge solution
A Data Set to Predict Wildfire Spreading from Remote-Sensing Data
Inspired by the above dataset, using different data sources
- https://www.kaggle.com/satellitevu/satellite-next-day-wildfire-spread
- https://github.com/SatelliteVu/SatelliteVu-AWS-Disaster-Response-Hackathon
- https://www.kaggle.com/datasets/amerii/spacenet-7-multitemporal-urban-development
- SatFootprint -> building segmentation on the Spacenet 7 dataset
- https://www.kaggle.com/datasets/sandeshbhat/satellite-images-to-predict-povertyafrica
- Uses satellite imagery and nightlights data to predict poverty levels at a local level
- Predicting-Poverty -> Combining satellite imagery and machine learning to predict poverty, in PyTorch
- https://www.kaggle.com/competitions/noaa-fisheries-steller-sea-lion-population-count -> count sea lions from aerial images
- Sealion-counting
- Sealion_Detection_Classification
- https://www.kaggle.com/datasets/alexandersylvester/arctic-sea-ice-image-masking
- sea_ice_remote_sensing
- A Benchmark Satellite Dataset as Drop-In Replacement for MNIST
- https://www.kaggle.com/datamunge/overheadmnist -> kaggle
- https://arxiv.org/abs/2102.04266 -> paper
- https://github.com/reveondivad/ov-mnist -> github
- https://www.kaggle.com/datasets/mahmoudreda55/satellite-image-classification
- satellite-image-classification-pytorch
- https://www.kaggle.com/datasets/raoofnaushad/eurosat-sentinel2-dataset
- RGB Land Cover and Land Use Classification using Sentinel-2 Satellite
- Used in paper Image Augmentation for Satellite Images
- https://www.kaggle.com/datasets/franciscoescobar/satellite-images-of-water-bodies
- pytorch-waterbody-segmentation -> UNET model trained on the Satellite Images of Water Bodies dataset from Kaggle. The model is deployed on Hugging Face Spaces
- https://www.kaggle.com/c/noaa-fisheries-steller-sea-lion-population-count
- noaa -> UNET, object detection and image level regression approaches
- https://www.kaggle.com/reubencpereira/spatial-data-repo -> Satellite + loan data
- https://www.kaggle.com/towardsentropy/oil-storage-tanks -> Image data of industrial oil tanks with bounding box annotations, estimate tank fill % from shadows
- https://www.kaggle.com/airbusgeo/airbus-wind-turbines-patches -> Airbus SPOT satellites images over wind turbines for classification
- https://www.kaggle.com/aceofspades914/cgi-planes-in-satellite-imagery-w-bboxes -> CGI planes object detection dataset
- https://www.kaggle.com/atilol/aerialimageryforroofsegmentation -> Aerial Imagery for Roof Segmentation
- https://www.kaggle.com/andrewmvd/ship-detection -> 621 images of boats and ships
- https://www.kaggle.com/alpereniek/vehicle-detection-from-satellite-images-data-set
- https://www.kaggle.com/sergiishchus/maxar-satellite-data -> Example Maxar data at 15 cm resolution
- https://www.kaggle.com/cici118/swimming-pool-detection-algarves-landscape
- https://www.kaggle.com/datasets/donkroco/solar-panel-module -> object detection for solar panels
- https://www.kaggle.com/datasets/balraj98/deepglobe-road-extraction-dataset -> segment roads
Competitions are an excellent source for accessing clean, ready-to-use satellite datasets and model benchmarks.
- https://codalab.lisn.upsaclay.fr/competitions/9603 -> object detection from diversified satellite imagery
- https://www.drivendata.org/competitions/143/tick-tick-bloom/ -> detect and classify algal bloom
- https://www.drivendata.org/competitions/81/detect-flood-water/ -> map floodwater from radar imagery
- https://platform.ai4eo.eu/enhanced-sentinel2-agriculture -> map cultivated land using Sentinel imagery
- https://www.diu.mil/ai-xview-challenge -> multiple challenges ranging from detecting fishing vessels to estimating building damage
- https://competitions.codalab.org/competitions/30440 -> flood detection
- https://www.drivendata.org/competitions/83/cloud-cover/ -> cloud cover detection
- https://www.drivendata.org/competitions/78/overhead-geopose-challenge/page/372/ -> predict geocentric pose from single-view oblique satellite images
- https://www.drivendata.org/competitions/60/building-segmentation-disaster-resilience/ -> building segmentation
- https://captain-whu.github.io/DOTA/ -> large dataset for object detection in aerial imagery
- https://spacenet.ai/ -> set of 8 challenges such as road network detection
- https://huggingface.co/spaces/competitions/ChaBuD-ECML-PKDD2023 -> binary image segmentation task on forest fires monitored over California
- https://spaceml.org/repo/project/6269285b14d764000d798fde -> ML for floods
- https://spaceml.org/repo/project/60002402f5647f00129f7287 -> lightning and extreme weather
- https://spaceml.org/repo/project/6025107d79c197001219c481/true -> ~1TB dataset for precipitation forecasting
- https://spaceml.org/repo/project/61c0a1b9ff8868000dfb79e1/true -> Sentinel-2 image super-resolution