[ICAPR 2017] Image Hash Minimization for Tamper Detection 🔥

This is the official implementation of the paper "Image Hash Minimization for Tamper Detection" by S. Maity and R. K. Karsh, published at ICAPR 2017.

📌 Requirements

📌 Guidelines to Use

📌 FAQ

📌 Citation

Methodological Flow

Sample Qualitative Depiction

Quantitative Performance Measures

Metric	Pun et al.	Ours
Hash Length	634 digits	64 bits
Robustness against Noise & Compression	Yes	Yes
Detection Accuracy	Approximately 60%	77%

🚀 Requirements

MathWorks MATLAB R2016b or later

📝 Guidelines to Use

✅ Dataset

✅ Running the Scripts

Dataset

To test the accuracy of our model, we used the CASIA 2.0 dataset, which is no longer available from its official source. However, the official dataset as well as a correctly annotated version can be downloaded from here.
The dataset we curated, containing 200 tampered images with a tampered area of less than 5%, is private and not available for use.
The dataset should be extracted to have the following structure.

├── dataset                      # Dataset root directory
   ├── CASIAv2                   # CASIAv2.0 dataset root directory
      ├── original               # Directory containing original images
      |  ├── 1.jpg
      |  ├── 2.jpg
      |  ├── ...
      |
      └── tampered               # Directory containing tampered images
         ├── (1).jpg
         ├── (2).jpg
         ├── ...

Images in the 'original' directory are named <image_number>.jpg, and images in the 'tampered' directory are named (<image_number>).jpg, so that 1.jpg and (1).jpg form a corresponding original and tampered pair. The image numbers must be consecutive, without any gaps.
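Before running the scripts, it can help to confirm that the extracted dataset follows this naming convention. The following is a minimal sketch, not part of the repository: the function name find_missing_pairs is hypothetical, and it assumes image numbers start at 1 and that all files use the .jpg extension shown in the directory tree above.

```matlab
% Hypothetical helper (not part of the repository): returns the image
% numbers in 1..count for which the original or the tampered file is missing.
% ASSUMPTION: numbering starts at 1 and every file ends in .jpg.
function missing = find_missing_pairs(dataset_path, count)
    missing = [];
    for n = 1:count
        orig = fullfile(dataset_path, 'original', sprintf('%d.jpg', n));
        tamp = fullfile(dataset_path, 'tampered', sprintf('(%d).jpg', n));
        % exist(..., 'file') returns 2 when the path is an existing file
        if exist(orig, 'file') ~= 2 || exist(tamp, 'file') ~= 2
            missing(end+1) = n; %#ok<AGROW>
        end
    end
end
```

Calling find_missing_pairs('path/to/dataset/root/CASIAv2/', 30) returns an empty array when all 30 pairs are present; any number it does return points to a gap in the numbering.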

Running the Scripts

Open the codes directory in MATLAB.
Set the path and hyper-parameters in data_from_original.m and data_from_tampered.m. To reproduce our process, set K=1, as we used a single cluster to determine the deviation of the centroid. The threshold thres for the strength of the SURF features detected in the images needs to be tuned for the dataset. Set dataset_path to the CASIAv2.0 root path and count to the total number of original and tampered image pairs.

count = 30;                                               % number of samples <n> in dataset
K = 1;                                                    % setting the number of clusters to be formed
thres = 1000;                                             % setting the threshold for SURF feature strength
dataset_path = 'path/to/dataset/root/CASIAv2/';           % setting the dataset path
maxiter_k = 1000000;                                      % setting up the maximum iterations for clustering
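To illustrate how these hyper-parameters might interact, here is a rough sketch of thresholded SURF detection followed by clustering. It is not the repository's code. It assumes the Computer Vision Toolbox function detectSURFFeatures and the Statistics and Machine Learning Toolbox function kmeans are available, that thres maps to the 'MetricThreshold' option, and that the clustered quantity is the (x, y) keypoint location. The function name surf_centroid is hypothetical.

```matlab
% Hypothetical helper, not taken from the repository.
% ASSUMPTION: the clustered quantity is the (x, y) keypoint location and
% thres corresponds to the 'MetricThreshold' option of detectSURFFeatures.
function [center, points] = surf_centroid(I, thres, K, maxiter_k)
    if size(I, 3) == 3
        I = rgb2gray(I);                     % SURF detection expects grayscale
    end
    % Keep only SURF keypoints whose strength metric exceeds thres.
    points = detectSURFFeatures(I, 'MetricThreshold', thres);
    if points.Count < K
        center = [];                         % too few keypoints to cluster
        return;
    end
    % Cluster keypoint locations; with K = 1 the centroid is their mean.
    [~, center] = kmeans(double(points.Location), K, ...
        'MaxIter', maxiter_k, 'Start', 'plus');
end
```

With the values above, a call could look like surf_centroid(imread(fullfile(dataset_path, 'original', '1.jpg')), thres, K, maxiter_k). A higher thres keeps fewer, stronger keypoints, which is why it has to be tuned per dataset.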

Run the data_from_original.m script and make sure that the centroids are saved as centers_original.mat in the codes directory. The script will provide a visualization of the SURF features extracted from each of the original images.
Run the data_from_tampered.m script and make sure that the centroids are saved as centers_tampered.mat in the codes directory. The script will provide a visualization of the SURF features extracted from each of the tampered images.
Set the relevant parameters in tampered.m. Set count to the total number of original and tampered image pairs and K=1 to follow the method described in the paper, matching the values used in data_from_original.m and data_from_tampered.m.

count = 30;                                               % number of samples <n> in dataset
K = 1;                                                    % setting the number of clusters to be formed

Run the tampered.m script. It prints the tampered or not-tampered status for each sample in the dataset and saves the Euclidean distance matrix in a file named distance.mat, where NaN marks the images that are not tampered.
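The NaN convention makes the saved distances easy to post-process. The sketch below uses made-up distance values for five image pairs, since the variable name stored inside distance.mat is not documented here. Running whos('-file', 'distance.mat') lists the actual contents of the file.

```matlab
% Hypothetical distances for five image pairs (illustration only);
% NaN marks a pair that is not tampered, as in distance.mat.
d = [12.4; NaN; 3.1; NaN; 40.0];

is_tampered = ~isnan(d);                 % logical mask of tampered pairs
for n = 1:numel(d)
    if is_tampered(n)
        fprintf('Pair %d: tampered (distance %.2f)\n', n, d(n));
    else
        fprintf('Pair %d: not tampered\n', n);
    end
end
fprintf('%d of %d pairs flagged as tampered\n', nnz(is_tampered), numel(d));
```

Using isnan rather than comparing against NaN matters because NaN == NaN evaluates to false in MATLAB.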

🔍 FAQ

The initial seed for k-means clustering is chosen by the k-means++ algorithm. It can also be chosen at random.
Different seeds, whether from k-means++ or from random selection, may cause minor deviations from the reported accuracy.
We recommend k-means++, as it generates more stable seeds than the random strategy.
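As a small illustration of the two seeding strategies, the sketch below assumes the clustering is done with the kmeans function from the Statistics and Machine Learning Toolbox, which the text above does not confirm. In that function, 'Start', 'plus' selects k-means++ seeding and 'Start', 'sample' picks the initial centroids at random from the data. The data points are synthetic and serve only as an example.

```matlab
rng(0);                                   % fix the random stream for repeatability
X = [randn(50, 2); randn(50, 2) + 5];     % synthetic 2-D points, illustration only

% k-means++ seeding
[~, C_plus] = kmeans(X, 2, 'Start', 'plus');

% random seeding: initial centroids drawn from the data points
[~, C_sample] = kmeans(X, 2, 'Start', 'sample');

% sort rows so the two results can be compared cluster by cluster
disp(sortrows(C_plus));
disp(sortrows(C_sample));
```

Calling rng with a fixed value before clustering is one way to make a run repeatable when comparing results against the reported accuracy.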

BibTeX

If you use our code for your research, please cite our paper. Many thanks!

@inproceedings{maity2017image,
title={Image Hash Minimization for Tamper Detection},
author={Maity, Subhajit and Karsh, Ram Kumar},
booktitle={Ninth International Conference on Advances in Pattern Recognition (ICAPR)},
year={2017}}
