- official website: https://ai.facebook.com/datasets/dfdc/
- kaggle: https://www.kaggle.com/c/deepfake-detection-challenge/overview
- an overview of the competition and the top 5 solutions: https://www.facebook.com/mediaforensics2020/videos/289928888867311/

Rank | Team | Score | code |
---|---|---|---|
1 | Selim Seferbekov | 0.42798 | code; link |
2 | \WM/ | 0.42842 | code |
3 | NtechLab | 0.43452 | code; link |
4 | Eighteen years old | 0.43476 | code |
5 | The Medics | 0.43711 | code; link |
6 | Konstantin Simonchik | 0.44289 | |
7 | All Faces Are Real | 0.445315 | |
8 | ID R&D | 0.44837 | |
9 | 名侦探柯西 | 0.44911 | |
10 | deeeepface | 0.45149 | |
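
The scores in the table are log loss values (the competition metric), so lower is better. A minimal sketch of how such a score is computed, assuming labels of 1 for fake and 0 for real; the function name and the clipping value `eps` are illustrative choices, not taken from the competition code:

```python
import math

def binary_log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy; labels are 0 (real) or 1 (fake)."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip probabilities so log(0) can never occur (eps is an assumption)
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

# A constant prediction of 0.5 always scores ln(2), about 0.693
baseline = binary_log_loss([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
# Confident wrong answers are punished far more than confident right ones are rewarded
confident_right = binary_log_loss([1, 0], [0.99, 0.01])
confident_wrong = binary_log_loss([1, 0], [0.01, 0.99])
```

Because a single confident mistake is so costly under this metric, several of the solutions below put effort into how per-frame predictions are aggregated.
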
code: https://github.com/selimsef/dfdc_deepfake_challenge
kaggle discussion: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/145721
- detector: MTCNN (S3FD was not used because its open-source implementations lack a license)
- augmentation: 1. Albumentations, **2. Cutout**
- model: EfficientNet B7
- Averaging predictions:

      import numpy as np

      def confident_strategy(pred, t=0.8):
          pred = np.array(pred)
          sz = len(pred)
          fakes = np.count_nonzero(pred > t)
          # 11 frames are detected as fakes with high probability
          if fakes > sz // 2.5 and fakes > 11:
              return np.mean(pred[pred > t])
          elif np.count_nonzero(pred < 0.2) > 0.9 * sz:
              return np.mean(pred[pred < 0.2])
          else:
              return np.mean(pred)

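
The Cutout augmentation listed above can be sketched as follows. This is a generic version that zeroes one random square per image, written with NumPy only; it is not necessarily the exact variant used in the winning solution, and the name `cutout` and its parameters are illustrative:

```python
import numpy as np

def cutout(image, size, rng=None):
    """Return a copy of `image` (H x W x C) with one random square zeroed."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # Pick the square's center anywhere in the image
    cy = int(rng.integers(0, h))
    cx = int(rng.integers(0, w))
    # Clamp the square to the image bounds, so it may be cut off at the edges
    y0, y1 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x0, x1 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = image.copy()
    out[y0:y1, x0:x1] = 0
    return out

face = np.ones((32, 32, 3), dtype=np.float32)
augmented = cutout(face, size=8, rng=np.random.default_rng(0))
```

Hiding part of the face pushes the classifier to rely on more than one local artifact.
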
code: https://github.com/cuihaoleo/kaggle-dfdc
- face extractor: RetinaFace
- model: WS-DAN (Weakly Supervised Data Augmentation Network) with Xception, and WS-DAN with EfficientNet
- flow
code: https://github.com/NTech-Lab/deepfake-detection-challenge
kaggle discussion: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/158158
- face extractor: WIDERFace_DSFD
- model: three EfficientNet-B7 models, two of which are frame-by-frame while the third is sequence-based.
- augmentation:
- external data:

File Name | Source | Direct Link | Forum Post |
---|---|---|---|
WIDERFace_DSFD_RES152.pth | github | google drive | link |
noisy_student_efficientnet-b7.tar.gz | github | link | link |
code: https://github.com/Siyu-C/RobustForensics
- face extractor: RetinaFace
- model: 7 image-based models and 4 video-based models.
  - image based: ResNet34, Xception, EfficientNet
  - video based: SlowFast
code: https://github.com/jphdotam/DFDC/
- face extractor: MTCNN
- model:
  - 3D CNN:
    - arch: 7 different 3D CNNs across 4 different architectures (I3D, 3D ResNet34, MC3 & R2+1D) and 2 different resolutions (224 x 224 & 112 x 112).
    - aug: 3D CutMix approach
  - One 2D CNN:
    - arch: SE-ResNeXT50
    - aug: MPEG compression
- pipeline:
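
A rough sketch of what a 3D CutMix step could look like: a random cuboid spanning time, height, and width is copied from one clip into another, and the labels are mixed by the replaced volume. This is a generic extension of CutMix to video under that assumption, not the team's actual code; `cutmix_3d` and `alpha` are illustrative names.

```python
import numpy as np

def cutmix_3d(clip_a, label_a, clip_b, label_b, alpha=1.0, rng=None):
    """Paste a random cuboid of clip_b into clip_a; clips are (T, H, W, C)."""
    rng = np.random.default_rng() if rng is None else rng
    t, h, w = clip_a.shape[:3]
    lam = rng.beta(alpha, alpha)          # target share kept from clip_a
    ratio = (1.0 - lam) ** (1.0 / 3.0)    # side ratio of the cuboid per axis
    dt, dh, dw = int(t * ratio), int(h * ratio), int(w * ratio)
    # Random position of the cuboid inside the clip
    t0 = int(rng.integers(0, t - dt + 1))
    y0 = int(rng.integers(0, h - dh + 1))
    x0 = int(rng.integers(0, w - dw + 1))
    mixed = clip_a.copy()
    mixed[t0:t0 + dt, y0:y0 + dh, x0:x0 + dw] = \
        clip_b[t0:t0 + dt, y0:y0 + dh, x0:x0 + dw]
    # Recompute the weight from the volume that was actually replaced
    lam = 1.0 - (dt * dh * dw) / (t * h * w)
    return mixed, lam * label_a + (1.0 - lam) * label_b

real_clip = np.zeros((8, 16, 16, 3))
fake_clip = np.ones((8, 16, 16, 3))
mixed_clip, mixed_label = cutmix_3d(real_clip, 0.0, fake_clip, 1.0,
                                    rng=np.random.default_rng(0))
```

In this toy setup the mixed label equals the fraction of the clip taken from the fake video.
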
Kaggle discussion: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/157983
- face extractor: CenterNet
- The progression of a single model:
model | score |
---|---|
IR-SE-50 | 0.51639 |
+ deeper arch (IR-SE-152) | 0.436 |
+ remove noisy cases with mask filtering | 0.40 |
+ BoundingBox x 1.3 -> resize to 291x291 -> random crop to 224x224 | 0.376 |
+ change to RA-92 | 0.362 |
+ use parts 0-9 as the validation set | 0.342 |
+ face tracking / More faces per video | 0.331 |
+ additional training data from part 5~9 and FF++ / Random erasing | 0.323 |
+ sphere and water augmentation from DALI | 0.313 |
- using model ensembling
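
The crop step from the table (bounding box x 1.3, resize to 291x291, random crop to 224x224) might look roughly like the sketch below. It assumes boxes in `(x0, y0, x1, y1)` pixel coordinates and uses a NumPy nearest-neighbor resize only to stay dependency-free; all function names are illustrative and the team's real pipeline is not shown in the source.

```python
import numpy as np

def expand_box(x0, y0, x1, y1, scale, img_w, img_h):
    """Scale a box about its center and clamp it to the image bounds."""
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    half_w, half_h = (x1 - x0) * scale / 2.0, (y1 - y0) * scale / 2.0
    return (max(int(round(cx - half_w)), 0), max(int(round(cy - half_h)), 0),
            min(int(round(cx + half_w)), img_w), min(int(round(cy + half_h)), img_h))

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize using index arithmetic only."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows][:, cols]

def random_crop(img, size, rng):
    """Take a random size x size window from img."""
    y = int(rng.integers(0, img.shape[0] - size + 1))
    x = int(rng.integers(0, img.shape[1] - size + 1))
    return img[y:y + size, x:x + size]

def face_crop(frame, box, rng=None):
    """Enlarge the face box by 1.3, resize to 291x291, crop 224x224."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = frame.shape[:2]
    x0, y0, x1, y1 = expand_box(*box, 1.3, w, h)
    face = resize_nearest(frame[y0:y1, x0:x1], 291, 291)
    return random_crop(face, 224, rng)

frame = np.zeros((480, 640, 3), dtype=np.uint8)
crop = face_crop(frame, (100, 100, 200, 200), rng=np.random.default_rng(0))
```

Enlarging the box keeps some context around the face, and the random crop adds positional jitter during training.
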