
Attention-based Multimodal Image Feature Fusion Module for Transmission Line Detection


Multimodal-FFM-TLD

This repository provides a PyTorch implementation of "Attention-based Multimodal Image Feature Fusion Module for Transmission Line Detection", accepted by IEEE Transactions on Industrial Informatics.

If you use this code, please cite the paper.

@article{choi2022attention,
  title={Attention-based Multimodal Image Feature Fusion Module for Transmission Line Detection},
  author={Choi, Hyeyeon and Yun, Jong Pil and Kim, Bum Jun and Jang, Hyeonah and Kim, Sang Woo},
  journal={IEEE Transactions on Industrial Informatics},
  year={2022},
  publisher={IEEE}
}

Overall process of data collection and model inference:

Data set

We constructed the Visible Light and Infrared Transmission Line Dataset (VITLD). The dataset is available at https://bit.ly/3FBYjBY.

Pre-trained Models

UNet [1] trained with the Early Fusion (EF) method [2] described in our paper:
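Early fusion, as described in [2], combines the two modalities at the input level before any convolution. The sketch below illustrates the idea under the assumption of a 3-channel visible-light image and a 1-channel infrared image; the tensor sizes and the 64-filter first convolution are illustrative, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

# Early fusion (EF): concatenate the visible-light (3-channel) and
# infrared (1-channel) images along the channel axis, then feed the
# 4-channel result to a UNet whose first conv accepts 4 input channels.
# Illustrative sketch only; image size and filter count are assumptions.
rgb = torch.randn(1, 3, 256, 256)  # visible-light image
ir = torch.randn(1, 1, 256, 256)   # infrared image

fused_input = torch.cat([rgb, ir], dim=1)  # shape: (1, 4, 256, 256)

# First UNet encoder convolution adapted to accept 4 input channels
first_conv = nn.Conv2d(4, 64, kernel_size=3, padding=1)
features = first_conv(fused_input)
print(features.shape)  # torch.Size([1, 64, 256, 256])
```

Because fusion happens before the encoder, EF requires only one network, but every later layer sees a single mixed representation of the two modalities.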

UNet [1] with the proposed feature fusion module (FFM):
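In contrast to early fusion, an attention-based fusion module weights each modality's feature maps before combining them. The following is a minimal sketch of that general idea, assuming a channel-attention design; the class name, gating scheme, and shapes are hypothetical and not the paper's exact FFM.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Illustrative channel-attention fusion of two modality feature maps.

    Hypothetical sketch, not the paper's exact FFM: gates computed from the
    concatenated features scale each modality before the maps are summed.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                           # global context per channel
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
            nn.Sigmoid(),                                      # per-channel gates in [0, 1]
        )

    def forward(self, feat_rgb: torch.Tensor, feat_ir: torch.Tensor) -> torch.Tensor:
        both = torch.cat([feat_rgb, feat_ir], dim=1)
        gates = self.attn(both)
        g_rgb, g_ir = gates.chunk(2, dim=1)        # one gate set per modality
        return feat_rgb * g_rgb + feat_ir * g_ir   # attention-weighted fusion

fusion = AttentionFusion(channels=64)
out = fusion(torch.randn(1, 64, 64, 64), torch.randn(1, 64, 64, 64))
print(out.shape)  # torch.Size([1, 64, 64, 64])
```

A module like this sits between two modality-specific encoder branches and the shared decoder, so each modality keeps its own features until the learned gates decide how much each contributes.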

Reference

[1] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional networks for biomedical image segmentation." International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, Cham, 2015.

[2] Choi, Hyeyeon, et al. "Real-time power line detection network using visible light and infrared images." International Conference on Image and Vision Computing New Zealand (IVCNZ). IEEE, 2019.