SCAMET_RSIC

This repository contains the official TensorFlow 2.2 implementation of the Spatial-Channel Attention based Memory-guided Transformer (SCAMET) framework for remote sensing image captioning. SCAMET is an encoder-decoder CNN-Transformer approach for describing multi-spectral, multi-resolution, multi-directional remote sensing images.

Requirements

Qualitative Results

  • Qualitative analysis shows that the proposed SCAMET produces more reliable captions than the baseline across different kinds of remote sensing images.

Attention Heatmap

  • The attention heatmaps illustrate the individual ability of the spatial and channel-wise attention, incorporated with the CNN, to select pertinent objects in remote sensing images.
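
To make the idea of spatial and channel-wise attention over CNN features more concrete, here is a minimal, framework-agnostic sketch in NumPy. It is not the SCAMET implementation from this repository: the function names, the squeeze-and-excitation style channel gate, the softmax-based spatial heatmap, and the random weights `w1`, `w2`, `v` are all assumptions chosen only for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(features, w1, w2):
    """Hypothetical channel gate: one weight in (0, 1) per channel.

    features: (H, W, C) CNN feature map
    w1: (C, C_reduced), w2: (C_reduced, C) illustrative projection weights
    """
    squeezed = features.mean(axis=(0, 1))      # global average pool -> (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)    # ReLU bottleneck -> (C_reduced,)
    return sigmoid(hidden @ w2)                # per-channel weights -> (C,)


def spatial_attention(features, v):
    """Hypothetical spatial heatmap: softmax over all H*W positions.

    features: (H, W, C), v: (C,) illustrative scoring vector
    """
    h, w, c = features.shape
    scores = features.reshape(h * w, c) @ v    # one score per position
    scores = scores - scores.max()             # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return probs.reshape(h, w)                 # heatmap that sums to 1


# Toy example with random data standing in for real CNN features.
rng = np.random.default_rng(0)
features = rng.standard_normal((7, 7, 64))
w1 = rng.standard_normal((64, 8)) * 0.1
w2 = rng.standard_normal((8, 64)) * 0.1
v = rng.standard_normal(64) * 0.1

weights = channel_attention(features, w1, w2)  # which channels matter
heatmap = spatial_attention(features, v)       # which locations matter

# Re-weight the feature map by both attention signals.
attended = features * weights[None, None, :] * heatmap[..., None]
```

In this sketch, `heatmap` is the kind of 2-D map that could be upsampled and overlaid on the input image to produce a visualization like the one described above, while `weights` indicates how strongly each feature channel contributes.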

Citation

Our research work is published in "Engineering Applications of Artificial Intelligence", an international scientific journal published by Elsevier.

Cite it as:

@article{gajbhiye2022generating,
title={Generating the captions for remote sensing images: A spatial-channel attention based memory-guided transformer approach},
author={Gajbhiye, Gaurav O and Nandedkar, Abhijeet V},
journal={Engineering Applications of Artificial Intelligence},
volume={114},
pages={105076},
year={2022},
publisher={Elsevier}
}