This repository is an official implementation of the ICASSP 2024 paper "Harnessing the Power of Large Vision Language Models for Synthetic Image Detection".
☀️ If you find this work useful for your research, please kindly star our repo and cite our paper! ☀️
pip install -r requirements.txt
For the baseline detection methods, we use the code released with each corresponding paper:
- AntifakePrompt
- DE-FAKE
- UniversalFakeDetect
- CNNDetection
- DIRE
- FusingGlobalandLocal
- ClipBased-SyntheticImageDetection
- DMimageDetection
- Diffusers
This step is optional: you can skip it and test a pre-trained model directly, as described in the following section.
To train your own model:
python blip2_detect.py --dataset ./data/train.csv --epochs 20 --lr 5e-5
To run the evaluation, use the following command:
python blip2_test.py --model_path ./SaveFineTune --dataset ./data/test.csv
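For readers who want to check the reported metrics by hand, here is a minimal sketch of how accuracy and F1 can be computed for a binary real-vs-synthetic classifier. It is not taken from `blip2_test.py`; the function name `accuracy_and_f1`, the label encoding (1 = synthetic, 0 = real), and the choice of the synthetic class as the positive class are assumptions made for illustration.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) as percentages for binary labels.

    Assumption: labels are 0 (real) and 1 (synthetic), and F1 is
    computed with respect to the `positive` class.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    accuracy = 100.0 * (tp + tn) / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 is defined as 0 when there are no positives at all.
    f1 = 100.0 * 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy example with five images (hypothetical labels, not real results).
acc, f1 = accuracy_and_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(f"{acc:.2f}/{f1:.2f}")  # prints "60.00/66.67"
```

The output uses the same `accuracy/F1` percentage layout as the results reported in this README.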
After training for 20 epochs, you should obtain accuracy and F1 scores (in %, listed as accuracy/F1) close to the values below:
{'LDM' : 99.12/99.13, 'ADM' : 85.24/82.97, 'DDPM' : 98.47/98.47, 'IDDPM' : 97.02/96.97, 'PNDM' : 99.22/99.23, 'SD v1.4' : 77.68/71.79, 'GLIDE' : 97.09/97.05}
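To compare your own run against the numbers above, it can help to load them into a small script. The sketch below copies the values from this README into a dictionary, assumes each entry is ordered as `accuracy/F1`, and computes the averages and the hardest generator. The names `RESULTS` and `parse_results` are illustrative and do not exist in the repository.

```python
# Reference results copied from this README; each value is assumed
# to be "accuracy/F1" in percent.
RESULTS = {
    "LDM": "99.12/99.13",
    "ADM": "85.24/82.97",
    "DDPM": "98.47/98.47",
    "IDDPM": "97.02/96.97",
    "PNDM": "99.22/99.23",
    "SD v1.4": "77.68/71.79",
    "GLIDE": "97.09/97.05",
}


def parse_results(results):
    """Turn {"name": "acc/f1"} into {"name": (acc, f1)} with float values."""
    parsed = {}
    for name, value in results.items():
        acc, f1 = (float(part) for part in value.split("/"))
        parsed[name] = (acc, f1)
    return parsed


scores = parse_results(RESULTS)
mean_acc = sum(acc for acc, _ in scores.values()) / len(scores)
mean_f1 = sum(f1 for _, f1 in scores.values()) / len(scores)
# Generator with the lowest accuracy, i.e. the hardest one to detect.
hardest = min(scores, key=lambda name: scores[name][0])

print(f"mean accuracy: {mean_acc:.2f}")  # prints "mean accuracy: 93.41"
print(f"mean F1:       {mean_f1:.2f}")  # prints "mean F1:       92.23"
print(f"hardest:       {hardest}")      # prints "hardest:       SD v1.4"
```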
The dataset used in this project comes from the paper "Towards the Detection of Diffusion Model Deepfakes" and is available from its original dataset repository.
If you make use of our work, please cite our papers:
@article{keita2024harnessing,
title={Harnessing the Power of Large Vision Language Models for Synthetic Image Detection},
author={Keita, Mamadou and Hamidouche, Wassim and Bougueffa, Hassen and Hadid, Abdenour and Taleb-Ahmed, Abdelmalik},
journal={arXiv preprint arXiv:2404.02726},
year={2024}
}
@article{keita2024bi,
title={Bi-LORA: A Vision-Language Approach for Synthetic Image Detection},
author={Keita, Mamadou and Hamidouche, Wassim and Eutamene, Hessen Bougueffa and Hadid, Abdenour and Taleb-Ahmed, Abdelmalik},
journal={arXiv preprint arXiv:2404.01959},
year={2024}
}