The SSL4EO-S12 dataset is a large-scale multimodal, multitemporal dataset for unsupervised/self-supervised pre-training in Earth observation. The dataset consists of unlabeled patch triplets (Sentinel-1 dual-pol SAR, Sentinel-2 top-of-atmosphere multispectral, Sentinel-2 surface reflectance multispectral) from 251079 locations across the globe, each patch covering 2640 m × 2640 m and including four seasonal time stamps.
- Full dataset: The full SSL4EO-S12 dataset (1.5TB, 500GB for each modality) is accessible at mediaTUM. There are some void IDs (gaps in folder names); see `data/void_ids.csv`.
- Example subset: An example 100-patch subset (600MB) is available at Google Drive.
- RGB version: An RGB version of the full dataset is available here (link broken, we are working on it). The raw S2-L1C int16 values are normalized by mean and std and converted to uint8.
- A 50k (random) RGB subset (18GB) is available here (link broken, we are working on it). For the sample IDs, see `data/50k_ids_random.csv`.
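The RGB conversion mentioned above (normalize by mean and std, then convert to uint8) could look roughly like the following. This is a minimal sketch, not necessarily the exact procedure used to build the RGB version: the function name `normalize_to_uint8`, the ±2 standard deviation window (`n_std`), and the example statistics are all assumptions made for illustration.

```python
import numpy as np


def normalize_to_uint8(band, mean, std, n_std=2.0):
    """Stretch [mean - n_std*std, mean + n_std*std] onto [0, 255] and cast to uint8.

    The n_std=2 window is an assumption; the README only says values are
    normalized by mean and std.
    """
    band = np.asarray(band, dtype=np.float32)
    lo = mean - n_std * std
    hi = mean + n_std * std
    scaled = (band - lo) / (hi - lo) * 255.0
    # Values outside the window saturate at 0 or 255.
    return np.clip(scaled, 0.0, 255.0).astype(np.uint8)


# Tiny fake int16 patch and hypothetical per-band statistics.
raw = np.array([[0, 1000], [2000, 12000]], dtype=np.int16)
rgb_u8 = normalize_to_uint8(raw, mean=1500.0, std=500.0)
```

In practice the mean and std would be per-band dataset statistics, applied separately to each of the three RGB bands.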
Pre-trained models for different SSL methods are provided below (13 bands of S2-L1C, 100 epochs, input clipped to [0,1]).
SSL method | Arch | BigEarthNet | EuroSAT | So2Sat-LCZ42 | Download | | | Usage |
---|---|---|---|---|---|---|---|---|
MoCo | ResNet50 | 91.8% | 99.1% | 60.9% | full ckpt | backbone | logs | define model, load weights |
MoCo | ViT-S/16 | 89.9% | 98.6% | 61.6% | full ckpt | backbone | logs | define model, load weights |
DINO | ResNet50 | 90.7% | 99.1% | 63.6% | full ckpt | backbone | logs | define model, load weights |
DINO | ViT-S/16 | 90.5% | 99.0% | 62.2% | full ckpt | backbone | logs | define model, load weights |
MAE | ViT-S/16 | 88.9% | 98.7% | 63.9% | full ckpt | backbone | logs | define model, load weights |
Data2vec | ViT-S/16 | 90.3% | 99.1% | 64.8% | full ckpt | backbone | logs | define model, load weights |
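The note that inputs are clipped to [0,1] can be illustrated with a short preprocessing sketch. It assumes the raw S2-L1C digital numbers are first divided by 10000 (a common Sentinel-2 reflectance scale, but an assumption here) before clipping; `prepare_s2_input` and `S2_SCALE` are hypothetical names. The linked usage notes for each model remain the authoritative reference.

```python
import numpy as np

S2_SCALE = 10000.0  # assumed scale factor, not stated in this README


def prepare_s2_input(patch, scale=S2_SCALE):
    """Scale a raw (bands, H, W) patch to float32 and clip it to [0, 1]."""
    patch = np.asarray(patch, dtype=np.float32) / scale
    return np.clip(patch, 0.0, 1.0)


# Fake 13-band, 2x2 patch standing in for a real S2-L1C sample.
patch = np.array([[[0, 5000], [10000, 15000]]] * 13, dtype=np.int16)
x = prepare_s2_input(patch)
```

The resulting array can then be converted to a tensor and passed to a 13-channel backbone.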
Other pre-trained models:
SSL method | Arch | Input | Download | ||
---|---|---|---|---|---|
MoCo | ResNet18 | S2-L1C 13 bands | full ckpt | backbone | logs |
MoCo | ResNet18 | S2-L1C RGB | full ckpt, full ckpt ep200 | backbone | logs |
MoCo | ResNet50 | S2-L1C RGB | full ckpt | backbone | logs |
MoCo | ResNet50 | S1 SAR 2 bands | full ckpt | backbone | logs |
This repository is released under the Apache 2.0 license. The dataset and pretrained model weights are released under the CC-BY-4.0 license.
@article{wang2022ssl4eo,
title={SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation},
author={Wang, Yi and Braham, Nassim Ait Ali and Xiong, Zhitong and Liu, Chenying and Albrecht, Conrad M and Zhu, Xiao Xiang},
journal={arXiv preprint arXiv:2211.07044},
year={2022}
}