Awesome Remote Sensing Foundation Models

🌟 A collection of papers, datasets, code, and pre-trained weights for Remote Sensing Foundation Models (RSFMs).

🔥🔥🔥 Last Updated on 2024.01.02 🔥🔥🔥

Remote Sensing Vision Foundation Models

| Abbreviation | Title | Publication | Paper | Code & Weights |
| --- | --- | --- | --- | --- |
| GeoKR | Geographical Knowledge-Driven Representation Learning for Remote Sensing Images | TGRS2021 | GeoKR | link |
| - | Self-Supervised Learning of Remote Sensing Scene Representations Using Contrastive Multiview Coding | CVPRW2021 | Paper | link |
| GASSL | Geography-Aware Self-Supervised Learning | ICCV2021 | GASSL | link |
| SeCo | Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data | ICCV2021 | SeCo | link |
| SatMAE | SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery | NeurIPS2022 | SatMAE | link |
| RS-BYOL | Self-Supervised Learning for Invariant Representations From Multi-Spectral and SAR Images | JSTARS2022 | RS-BYOL | null |
| GeCo | Geographical Supervision Correction for Remote Sensing Representation Learning | TGRS2022 | GeCo | null |
| RingMo | RingMo: A remote sensing foundation model with masked image modeling | TGRS2022 | RingMo | Code |
| RVSA | Advancing plain vision transformer toward remote sensing foundation model | TGRS2022 | RVSA | link |
| RSP | An Empirical Study of Remote Sensing Pretraining | TGRS2022 | RSP | link |
| MATTER | Self-Supervised Material and Texture Representation Learning for Remote Sensing Tasks | CVPR2022 | MATTER | null |
| CSPT | Consecutive Pre-Training: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain | RS2022 | CSPT | link |
| - | Self-supervised Vision Transformers for Land-cover Segmentation and Classification | CVPRW2022 | Paper | link |
| BFM | A billion-scale foundation model for remote sensing images | Arxiv2023 | BFM | null |
| TOV | TOV: The original vision model for optical remote sensing image understanding via self-supervised learning | JSTARS2023 | TOV | link |
| CMID | CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding | TGRS2023 | CMID | link |
| RingMo-Sense | RingMo-Sense: Remote Sensing Foundation Model for Spatiotemporal Prediction via Spatiotemporal Evolution Disentangling | TGRS2023 | RingMo-Sense | null |
| IaI-SimCLR | Multi-Modal Multi-Objective Contrastive Learning for Sentinel-1/2 Imagery | CVPRW2023 | IaI-SimCLR | null |
| CACo | Change-Aware Sampling and Contrastive Learning for Satellite Images | CVPR2023 | CACo | link |
| SatLas | SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding | ICCV2023 | SatLas | link |
| GFM | Towards Geospatial Foundation Models via Continual Pretraining | ICCV2023 | GFM | link |
| Scale-MAE | Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning | ICCV2023 | Scale-MAE | link |
| SpectralGPT | SpectralGPT: Spectral Foundation Model | Arxiv2023 | SpectralGPT | null |
| DINO-MC | DINO-MC: Self-supervised Contrastive Learning for Remote Sensing Imagery with Multi-sized Local Crops | Arxiv2023 | DINO-MC | link |
| CROMA | CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders | NeurIPS2023 | CROMA | link |
| Cross-Scale MAE | Cross-Scale MAE: A Tale of Multiscale Exploitation in Remote Sensing | NeurIPS2023 | Cross-Scale MAE | null |
| DeCUR | DeCUR: decoupling common & unique representations for multimodal self-supervision | Arxiv2023 | DeCUR | link |
| Presto | Lightweight, Pre-trained Transformers for Remote Sensing Timeseries | Arxiv2023 | Presto | link |
| CtxMIM | CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding | Arxiv2023 | CtxMIM | null |
| XGeo | Multisensory Geospatial Models via Cross-Sensor Pretraining | - | XGeo | null |
| FG-MAE | Feature Guided Masked Autoencoder for Self-supervised Learning in Remote Sensing | Arxiv2023 | FG-MAE | link |
| Prithvi | Foundation Models for Generalist Geospatial Artificial Intelligence | Arxiv2023 | Prithvi | link |
| RingMo-lite | RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework | Arxiv2023 | RingMo-lite | null |
| - | A Self-Supervised Cross-Modal Remote Sensing Foundation Model with Multi-Domain Representation and Cross-Domain Fusion | IGARSS2023 | Paper | null |
| EarthPT | EarthPT: a foundation model for Earth Observation | Arxiv2023 | EarthPT | null |
| USat | USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery | Arxiv2023 | USat | link |
| FoMo-Bench | FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models | Arxiv2023 | FoMo-Bench | Coming soon |
| AIEarth | Analytical Insight of Earth: A Cloud-Platform of Intelligent Computing for Geospatial Big Data | Arxiv2023 | AIEarth | link |
| SkySense | SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery | Arxiv2023 | SkySense | Coming soon |

Remote Sensing Vision-Language Foundation Models

| Abbreviation | Title | Publication | Paper | Code & Weights |
| --- | --- | --- | --- | --- |
| RSGPT | RSGPT: A Remote Sensing Vision Language Model and Benchmark | Arxiv2023 | RSGPT | link |
| RemoteCLIP | RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Arxiv2023 | RemoteCLIP | link |
| GeoChat | GeoChat: Grounded Large Vision-Language Model for Remote Sensing | Arxiv2023 | GeoChat | link |
| GRAFT | Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment | ICLR2024 | GRAFT | null |
| - | Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs | Arxiv2023 | Paper | link |

Remote Sensing Generative Foundation Models

| Abbreviation | Title | Publication | Paper | Code & Weights |
| --- | --- | --- | --- | --- |
| DiffusionSat | DiffusionSat: A Generative Foundation Model for Satellite Imagery | Arxiv2023 | DiffusionSat | null |
| Seg2Sat | Seg2Sat - Segmentation to aerial view using pretrained diffuser models | Github | null | link |
| - | Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps | NeurIPSW2023 | Paper | link |

Remote Sensing Vision-Location Foundation Models

| Abbreviation | Title | Publication | Paper | Code & Weights |
| --- | --- | --- | --- | --- |
| CSP | CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations | ICML2023 | CSP | link |
| GeoCLIP | GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization | NeurIPS2023 | GeoCLIP | link |
| SatCLIP | SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery | Arxiv2023 | SatCLIP | Coming soon |

Remote Sensing Vision-Audio Foundation Models

| Abbreviation | Title | Publication | Paper | Code & Weights |
| --- | --- | --- | --- | --- |
| - | Self-supervised audiovisual representation learning for remote sensing data | JAG2022 | Paper | link |

(Large-scale) Pre-training Datasets

| Abbreviation | Title | Publication | Paper | Attribute | Link |
| --- | --- | --- | --- | --- | --- |
| fMoW | Functional Map of the World | CVPR2018 | fMoW | Vision | link |
| SEN12MS | SEN12MS -- A Curated Dataset of Georeferenced Multi-Spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion | - | SEN12MS | Vision | link |
| BEN-MM | BigEarthNet-MM: A Large Scale Multi-Modal Multi-Label Benchmark Archive for Remote Sensing Image Classification and Retrieval | GRSM2021 | BEN-MM | Vision | link |
| MillionAID | On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances, and Million-AID | JSTARS2021 | MillionAID | Vision | link |
| SeCo | Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data | ICCV2021 | SeCo | Vision | link |
| fMoW-S2 | SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery | NeurIPS2022 | fMoW-S2 | Vision | link |
| TOV-RS-Balanced | TOV: The original vision model for optical remote sensing image understanding via self-supervised learning | JSTARS2023 | TOV | Vision | link |
| SSL4EO-S12 | SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation | GRSM2023 | SSL4EO-S12 | Vision | link |
| SSL4EO-L | SSL4EO-L: Datasets and Foundation Models for Landsat Imagery | Arxiv2023 | SSL4EO-L | Vision | link |
| SatlasPretrain | SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding | ICCV2023 | SatlasPretrain | Vision (Supervised) | link |
| CACo | Change-Aware Sampling and Contrastive Learning for Satellite Images | CVPR2023 | CACo | Vision | Coming soon |
| RSVG | RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data | TGRS2023 | RSVG | Vision-Language | link |
| RS5M | RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model | Arxiv2023 | RS5M | Vision-Language | link |
| GEO-Bench | GEO-Bench: Toward Foundation Models for Earth Monitoring | Arxiv2023 | GEO-Bench | Vision (Evaluation) | link |
| RSICap & RSIEval | RSGPT: A Remote Sensing Vision Language Model and Benchmark | Arxiv2023 | RSGPT | Vision-Language | Coming soon |
| SkyScript | SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing | AAAI2024 | SkyScript | Vision-Language | Coming soon |

Survey Papers

| Title | Publication | Paper | Attribute |
| --- | --- | --- | --- |
| Self-Supervised Remote Sensing Feature Learning: Learning Paradigms, Challenges, and Future Works | TGRS2023 | Paper | Vision & Vision-Language |
| Vision-Language Models in Remote Sensing: Current Progress and Future Trends | Arxiv2023 | Paper | Vision-Language |
| The Potential of Visual ChatGPT For Remote Sensing | Arxiv2023 | Paper | Vision-Language |
| Remote Sensing Large Models: Progress and Prospects | Geomatics and Information Science of Wuhan University 2023 | Paper | Vision & Vision-Language |
| Geospatial Artificial Intelligence Samples: Models, Quality, and Services | Geomatics and Information Science of Wuhan University 2023 | Paper | - |
| Brain-Inspired Remote Sensing Foundation Models and Open Problems: A Comprehensive Survey | JSTARS2023 | Paper | Vision & Vision-Language |
| Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters | Arxiv2023 | Paper | Vision |
| An Agenda for Multimodal Foundation Models for Earth Observation | IGARSS2023 | Paper | Vision |
| Transfer learning in environmental remote sensing | RSE2024 | Paper | Transfer learning |
| A Review of the Development of Remote Sensing Foundation Models and Future Prospects | National Remote Sensing Bulletin 2023 | Paper | - |
| On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications | Arxiv2023 | Paper | Vision-Language |

Cite

If you find this repository useful, please consider giving it a star ⭐ and citing:

@misc{guo2023skysense,
      title={SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery}, 
      author={Xin Guo and Jiangwei Lao and Bo Dang and Yingying Zhang and Lei Yu and Lixiang Ru and Liheng Zhong and Ziyuan Huang and Kang Wu and Dingxiang Hu and Huimei He and Jian Wang and Jingdong Chen and Ming Yang and Yongjun Zhang and Yansheng Li},
      year={2023},
      eprint={2312.10115},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}