Combining EfficientNet and Vision Transformers

Question

Combining EfficientNet and Vision Transformers

TienLort opened this issue 2 years ago · 1 comments

Hello, I'm a university student currently researching deepfake detection. I would like to follow the same direction as Combining-EfficientNet-and-Vision-Transformers, but it seems quite difficult for me. I would like to develop a simpler project than this one. Can you give me a reference? Thank you.

Answer 1 · 2023-04-25T11:43:23.000Z

Hi. I am not sure I understood your question. If you are looking for a simpler deepfake detection approach, I can point you to this one: https://github.com/selimsef/dfdc_deepfake_challenge
This is the approach on which mine is based. In any case, almost all deepfake detection methods have a face detection/extraction step followed by more sophisticated, customized parts, as in this project and in our other work (https://github.com/davide-coccomini/MINTIME-Multi-Identity-size-iNvariant-TIMEsformer-for-Video-Deepfake-Detection). So you will find a similar preprocessing part in almost all works.
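
To make the shared face extraction step more concrete, here is a minimal sketch of what it usually boils down to: run a detector on each frame, enlarge each detected box by a margin, and crop. This is not the code from either repository. The names `expand_box`, `crop`, and `extract_faces`, the `margin` value, and the `(x1, y1, x2, y2)` box format are all assumptions made for illustration, and the detector is passed in as a plain function so any face detector could be plugged in. Frames are toy nested lists here to keep the example dependency-free.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x1, y1, x2, y2) in pixels
Frame = List[List[int]]          # toy grayscale frame: rows of pixel values


def expand_box(box: Box, width: int, height: int, margin: float = 0.3) -> Box:
    """Grow a box by `margin` of its size on each side, clamped to the frame."""
    x1, y1, x2, y2 = box
    pad_x = int((x2 - x1) * margin)
    pad_y = int((y2 - y1) * margin)
    return (
        max(0, x1 - pad_x),
        max(0, y1 - pad_y),
        min(width, x2 + pad_x),
        min(height, y2 + pad_y),
    )


def crop(frame: Frame, box: Box) -> Frame:
    """Return the sub-image covered by the box (x2 and y2 are exclusive)."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in frame[y1:y2]]


def extract_faces(
    frames: Sequence[Frame],
    detect: Callable[[Frame], List[Box]],  # hypothetical detector interface
    margin: float = 0.3,
) -> List[Frame]:
    """Crop every detected face, with some surrounding context, from every frame."""
    crops: List[Frame] = []
    for frame in frames:
        height, width = len(frame), len(frame[0])
        for box in detect(frame):
            crops.append(crop(frame, expand_box(box, width, height, margin)))
    return crops


if __name__ == "__main__":
    # Stand-in detector that always reports one fixed box; a real pipeline
    # would call an actual face detector here.
    def dummy_detect(frame: Frame) -> List[Box]:
        return [(4, 4, 8, 8)]

    video = [[[0] * 10 for _ in range(10)] for _ in range(3)]  # three 10x10 frames
    faces = extract_faces(video, dummy_detect, margin=0.5)
    print(len(faces), len(faces[0]), len(faces[0][0]))
```

The resulting crops are what a classifier would then consume; the margin is kept because some context around the face is often retained, and the exact value is a tunable assumption.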

If you are facing some specific problems with our approach feel free to ask anything and I'll try to help!