
The official PyTorch implementation of the ICCV'21 oral paper 'Artificial GAN Fingerprints: Rooting Deepfake Attribution in Training Data'


Artificial GAN Fingerprints

Ning Yu*, Vladislav Skripniuk*, Sahar Abdelnabi, Mario Fritz
*Equal contribution
ICCV 2021 Oral

Abstract

Photorealistic image generation has reached a new level of quality due to the breakthroughs of generative adversarial networks (GANs). Yet, the dark side of such deepfakes, the malicious use of generated media, raises concerns about visual misinformation. While existing research work on deepfake detection demonstrates high accuracy, it is subject to advances in generation techniques and adversarial iterations on detection countermeasure techniques. Thus, we seek a proactive and sustainable solution on deepfake detection, that is agnostic to the evolution of generative models, by introducing artificial fingerprints into the models.

Our approach is simple and effective. We first embed artificial fingerprints into training data, then validate a surprising discovery on the transferability of such fingerprints from training data to generative models, which in turn appears in the generated deepfakes. Experiments show that our fingerprinting solution (1) holds for a variety of cutting-edge generative models, (2) leads to a negligible side effect on generation quality, (3) stays robust against image-level and model-level perturbations, (4) stays hard to be detected by adversaries, and (5) converts deepfake detection and attribution into trivial tasks and outperforms the recent state-of-the-art baselines. Our solution closes the responsibility loop between publishing pre-trained generative model inventions and their possible misuses, which makes it independent of the current arms race.

Prerequisites

  • Linux
  • NVIDIA GPU + CUDA 10.0 + CuDNN 7.5
  • Python 3.6
  • To install the other Python dependencies, run pip3 install -r requirements.txt

Datasets

Fingerprint autoencoder training

  • Run, e.g.,
    python3 train.py \
    --data_dir /path/to/images/ \
    --use_celeba_preprocessing \
    --image_resolution 128 \
    --output_dir /path/to/output/ \
    --fingerprint_length 100 \
    --batch_size 64
    
    where
    • use_celeba_preprocessing must be set if and only if the input is the aligned and cropped CelebA images.
    • image_resolution is the image resolution used for training. All images in data_dir are center-cropped to a square along the shorter side and then resized to this resolution. When use_celeba_preprocessing is set, image_resolution must be 128.
    • output_dir contains model snapshots, image snapshots, and log files. For model snapshots, *_encoder.pth and *_decoder.pth correspond to the fingerprint encoder and decoder respectively.
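
The center-crop step described for image_resolution can be sketched as follows. This is a minimal illustration of "crop along the shorter side", not the code in train.py, which may implement it differently (for example through library transforms). The function name center_crop_box is hypothetical, and the sketch does not cover the separate CelebA preprocessing path.

```python
def center_crop_box(width, height):
    """Return (left, upper, right, lower) for the largest centered square.

    Hypothetical helper: the square's side equals the shorter image side,
    so only the longer side is trimmed, equally from both ends.
    """
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


# A 640x480 landscape image loses 80 pixels on the left and on the right.
print(center_crop_box(640, 480))
```

The returned tuple follows the (left, upper, right, lower) box convention used by Pillow's Image.crop, so the cropped square could then be resized to image_resolution in a second step.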

Pre-trained fingerprint autoencoder models

Fingerprint embedding and detection

  • For fingerprint embedding, run, e.g.,

    python3 embed_fingerprints.py \
    --encoder_path /path/to/encoder/ \
    --data_dir /path/to/images/ \
    --use_celeba_preprocessing \
    --image_resolution 128 \
    --output_dir /path/to/output/ \
    --identical_fingerprints \
    --batch_size 64
    

    where

    • use_celeba_preprocessing must be set if and only if the input is the aligned and cropped CelebA images.
    • image_resolution is the image resolution used for fingerprint embedding. All images in data_dir are center-cropped to a square along the shorter side and then resized to this resolution. It should match the input resolution of the trained encoder loaded from encoder_path. When use_celeba_preprocessing is set, image_resolution must be 128.
    • output_dir receives the embedded fingerprint sequence for each image, in embedded_fingerprints.txt, and the fingerprinted images, in fingerprinted_images/.
    • identical_fingerprints must be set if and only if all images should be fingerprinted with the same fingerprint sequence.
  • For fingerprint detection, run, e.g.,

    python3 detect_fingerprints.py \
    --decoder_path /path/to/decoder/ \
    --data_dir /path/to/fingerprinted/images/ \
    --image_resolution 128 \
    --output_dir /path/to/output/ \
    --batch_size 64
    

    where

    • output_dir receives the detected fingerprint sequence for each image, in detected_fingerprints.txt.
    • image_resolution is the image resolution used for fingerprint detection. All images in data_dir are center-cropped to a square along the shorter side and then resized to this resolution. It should match the input resolution of the trained decoder loaded from decoder_path.
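
Once both embedding and detection have been run, one natural check is how many bits of each detected fingerprint match the embedded one. The sketch below assumes that each line of embedded_fingerprints.txt and detected_fingerprints.txt has the form "filename bits", with bits a string of 0s and 1s. That format is an assumption, not something stated in this README, so adjust the parsing to the actual files. All function names here are hypothetical.

```python
def parse_fingerprints(lines):
    """Map filename -> bit string, ASSUMING one 'filename bits' pair per line."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, bits = parts
        table[name] = bits
    return table


def bitwise_accuracy(embedded, detected):
    """Fraction of positions where two equal-length bit strings agree."""
    if not embedded or len(embedded) != len(detected):
        raise ValueError("fingerprints must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(embedded, detected))
    return matches / len(embedded)


def mean_accuracy(embedded_lines, detected_lines):
    """Average bitwise accuracy over filenames present in both listings."""
    embedded = parse_fingerprints(embedded_lines)
    detected = parse_fingerprints(detected_lines)
    shared = sorted(set(embedded) & set(detected))
    if not shared:
        raise ValueError("no filenames in common")
    total = sum(bitwise_accuracy(embedded[n], detected[n]) for n in shared)
    return total / len(shared)


# Toy data in the assumed format: one bit of a.png is flipped.
embedded_demo = ["a.png 1010", "b.png 1111"]
detected_demo = ["a.png 1000", "b.png 1111"]
print(mean_accuracy(embedded_demo, detected_demo))  # 0.875
```

With real outputs, the two line lists could come from open(path).readlines() on the two text files. An accuracy near 0.5 would indicate chance-level agreement for random fingerprints, while values near 1.0 would indicate that the fingerprint survived.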

Generative models trained on fingerprinted datasets

Citation

@inproceedings{yu2021artificial,
  author={Yu, Ning and Skripniuk, Vladislav and Abdelnabi, Sahar and Fritz, Mario},
  title={Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data},
  booktitle={IEEE International Conference on Computer Vision (ICCV)},
  year={2021}
}

Acknowledgement