- Python: 3.9
- PyTorch: 2.0
- MMPretrain: 1.0.0rc8
conda create -n uvif python=3.9
conda activate uvif
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.7 -c pytorch -c nvidia -y
pip install -U openmim
cd UVIF
mim install -e .
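Before preparing the data, it can help to confirm that the environment above imports cleanly. The sketch below only probes whether the packages are importable, using the standard library; `mmengine` is assumed here as a dependency pulled in by MMPretrain.

```python
import importlib.util

def check_env(packages):
    """Report whether each package can be found by the import machinery."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}

# Package names matching the requirements listed above.
for name, found in check_env(["torch", "torchvision", "mmpretrain", "mmengine"]).items():
    print(f"{name}: {'OK' if found else 'MISSING'}")
```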
We perform experiments on ForgeryNet and DFDC (preview).
We use RetinaFace to detect face bounding boxes in each video clip or static image. The processed JSON annotations with face regions are available at this link.
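As an illustration of how such annotations can be consumed, here is a minimal sketch that writes and reads a sample entry. The field names (`video`, `faces`, `label`) and the `[x1, y1, x2, y2]` box convention are assumptions for illustration; the released JSON files may use a different schema.

```python
import json

# Hypothetical annotation entry; actual field names in the released files may differ.
sample = [{
    "video": "Training/video/clip_0001.mp4",
    "faces": [[64, 32, 192, 160]],  # assumed [x1, y1, x2, y2] box from RetinaFace
    "label": 1                       # assumed: 1 = forged, 0 = real
}]

with open("video_train_sample.json", "w") as f:
    json.dump(sample, f)

with open("video_train_sample.json") as f:
    annotations = json.load(f)

for ann in annotations:
    x1, y1, x2, y2 = ann["faces"][0]
    print(ann["video"], "face size:", (x2 - x1, y2 - y1), "label:", ann["label"])
```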
The dataset directory structure is like this:
data/
├── ForgeryNet/
│   ├── annotations/
│   │   ├── video_train.json
│   │   └── ...
│   ├── Training/
│   │   ├── video
│   │   ├── image
│   │   └── ...
│   └── Validation/
│       ├── video
│       └── ...
└── DFDCP/
    ├── annotations/
    │   ├── train.json
    │   └── ...
    ├── method_A
    ├── method_B
    └── original_videos
Taking ForgeryNet as an example:
# baseline
bash tools/dist_train.sh configs_uvif/forgerynet/video_r50_forgerynet.py 2
# uvif
bash tools/dist_train.sh configs_uvif/forgerynet/uvif_r50_forgerynet.py 2
# single gpu
python test.py configs_uvif/forgerynet/uvif_r50_forgerynet.py pretrained/uvif_r50_forgerynet.pth
# multiple gpu
bash tools/dist_test.sh configs_uvif/forgerynet/uvif_r50_forgerynet.py pretrained/uvif_r50_forgerynet.pth 2
The pretrained weights of the following models are available at this link.
Results on ForgeryNet:

| Method | Config | mAcc | AUC |
|---|---|---|---|
| Baseline - Res50 | video_r50_forgerynet.py | 80.89 | 88.66 |
| Baseline - Res101 | video_r101_forgerynet.py | 81.48 | 88.08 |
| Baseline - ConvNeXt-T | video_convnext-t_forgerynet.py | 81.56 | 88.43 |
| UVIF - Res50 | uvif_r50_forgerynet.py | 85.32 | 93.45 |
| UVIF - Res101 | uvif_r101_forgerynet.py | 86.57 | 94.42 |
| UVIF - ConvNeXt-T | uvif_convnext-t_forgerynet.py | 84.94 | 93.35 |
Results on DFDC (preview):

| Method | Config | Acc | AUC |
|---|---|---|---|
| UVIF - Res50 | uvif_r50_dfdcp.py | 83.40 | 93.54 |
| UVIF - Res101 | uvif_r101_dfdcp.py | 87.00 | 94.95 |
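The AUC columns above report area under the ROC curve. For reference, it can be computed from binary labels and real-valued scores with the rank-based (Mann-Whitney) formulation; the sketch below is a standard stdlib-only implementation, not the repo's evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: iterable of 0/1 (1 = positive, e.g. forged)
    scores: model scores, higher = more likely positive
    """
    pairs = sorted(zip(scores, labels))
    # Assign 1-based average ranks, handling tied scores.
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    pos = [r for r, (_, y) in zip(ranks, pairs) if y == 1]
    n_pos, n_neg = len(pos), len(pairs) - len(pos)
    return (sum(pos) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

Perfectly separated scores give 1.0, reversed scores give 0.0, and uninformative (all-tied) scores give 0.5.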
This codebase is built on MMPretrain and MMAction2; we thank the authors for their contributions.