Knightzjz/NCL-IML

Generalization problems

Closed this issue · 2 comments

Hi, thanks very much for your awesome paper!

I tried some images with your playground. Not sure what went wrong, but the model failed to detect the manipulation clues.
By contrast, the IML-ViT pre-trained model can locate some of the manipulation clues.

Below are the web images I tested with:

urls = [
  'https://upload.wikimedia.org/wikipedia/commons/7/79/Californian_sample_driver%27s_license%2C_c._2019.jpg',
  'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQHWaXHj5FkbLY2eetNuioY0sKqy927wjWClA&usqp=CAU',
  'https://www.1cutepooch.com/wp-content/uploads/2021/05/california-kid-driver-license-black-boy.jpg',
  'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQjzTV4YeuMvrumwJahkynEV77C5erLlCC8pQ&usqp=CAU',
  'https://platform.keesingtechnologies.com/wp-content/uploads/2022/03/shutterstock_758701297.jpg'
]
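For reference, here is how I downloaded them before feeding them to the playground. This is a minimal sketch under my own assumptions (it only derives local filenames and fetches the files; the playground's actual inference API is not shown), using a subset of the URLs above:

```python
import os
from urllib.parse import urlparse, unquote

# Subset of the URLs listed above.
urls = [
    'https://upload.wikimedia.org/wikipedia/commons/7/79/Californian_sample_driver%27s_license%2C_c._2019.jpg',
    'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQHWaXHj5FkbLY2eetNuioY0sKqy927wjWClA&usqp=CAU',
]

def local_name(url, index):
    # Derive a usable filename from the URL path; fall back to an
    # index-based name for URLs (like the gstatic thumbnails) whose
    # path has no meaningful basename.
    name = os.path.basename(unquote(urlparse(url).path))
    if not name or '.' not in name:
        name = f'image_{index}.jpg'
    return name

for i, url in enumerate(urls):
    print(local_name(url, i))
    # To actually download each image:
    # urllib.request.urlretrieve(url, local_name(url, i))
```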

Thanks for your question. :D

Actually, the answer could be quite simple: IML-ViT is our latest benchmark, so it should perform better than NCL.

More detailed reasons you may be looking for are:

  1. The playground currently loads the NCL model trained on the NIST-16 dataset. Although NCL focuses on the training-data insufficiency problem, Table 2 in our paper shows that NCL still benefits greatly from a larger training set. It is therefore natural that the NIST-16 pre-trained model shows a performance gap on real-life images.
  2. This is why we proposed the MAE-based IML-ViT model. This latest model performs better and could potentially become the new backbone for IML. Thanks for your attention to our IML-ViT; we are about to publish a new variant based on it, which will offer even better generalization on real-life images.

Thanks very much for your explanation!