Unable to train the model

Hey,
This is nice work and really appreciated.

I am getting an error when I run python model.py. May I ask what the problem could be?
Thanks!

Please pull the newest version of the code! We have updated the code!

Thank you for the response,
Where can I find that?
Thanks!

I have fixed that issue,
Now I am getting another issue.
Can you please have a look at it?
Thanks!

Do you use your own videos?
This should be the expected output if you use my data.

==============================left_eye==============================
===============DefakeHop Prediction===============
===============MultiChannelWiseSaab Transformation===============
Hop1
Input shape: (2708, 32, 32, 3)
Output shape: (2708, 15, 15, 13)

Hi,
Thanks for your kind response.
Your output looks like:

The final result was like:

I used CelebDF-V1 dataset.
I preprocessed the dataset and created .npz files as required by the model.
But I got this error.

Sorry for the late reply!
Here is the data for Celeb-DF-v1! I followed the same code to get this data.
Please also update model.py. I think there was a small error caused by the change in the structure of this repo.
https://drive.google.com/drive/folders/1nEBe5wGPmm2G1NsR46NK8msHiCUE9f8K?usp=sharing

Please let me know whether you can get the results or not! Thank you!

Thank you so much for your kind response!

Well, I already modified model.py, thanks.
But I got the same error even on your data.
Please have a look at it.

Thanks!

Could you help me check this part of the code in model.py?
test_images should be a 4D numpy array!

    for region in model.regions:
        path = 'data/' + region + '.test.npz'
        data = np.load(path)
        test_labels = data['labels']
        test_images = data['images']
        test_names = data['names']
        model.predict_region(region, test_images, test_names)
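
A quick way to confirm the 4D requirement before calling predict_region is sketched below. This is only an illustration: check_images is a hypothetical helper that is not part of the repo, and the zero array stands in for data['images'] loaded from a .npz file.

```python
import numpy as np

def check_images(images, region):
    """Hypothetical helper: fail early if images is not a 4D (N, H, W, C) array."""
    images = np.asarray(images)
    if images.ndim != 4:
        raise ValueError(
            f"{region}: expected a 4D array (N, H, W, C), got shape {images.shape}"
        )
    return images

# Stand-in for data['images'] loaded with np.load (assumption for this sketch).
fake_images = np.zeros((8, 32, 32, 3), dtype=np.uint8)
checked = check_images(fake_images, "left_eye")
print(checked.shape)
```

If the check raises, the .npz file was most likely written with a different layout than the loader expects.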

Thank you for the response..!

Sure, you may have a look at it.
Thanks!

Lines 169 and 170 are wrong.
Please load the testing data with np.load, since we changed the structure of this repo.

Thank You so much..!!
Finally solved the issue.

But I still have to run some other datasets, and I am unable to do that because my system does not have enough memory.
Anyway, thank you for your time and help.
I will contact you if I face any other issue.

Thanks!

Hey, I extracted feature data (celeb-v2) with shape (761334,32,32,3), and it runs out of memory when I run model.py.
So how do you run the large dataset?
Thanks!

Hey,
I am having the same issue. I asked one of the authors of the paper, and she said I can divide the data into chunks and train the model. But I am trying to arrange more resources instead, as I believe chunking will affect the performance of the model, and our results would not be comparable to this work or others because of the different conditions.
Thanks!

Hi! Thank you for your questions! It is a really great question, and I will update the code for this problem. The idea is to subsample the dataset and use the subset to train the Saab transform. For prediction, the whole dataset is used: it is divided into chunks, and each chunk is predicted one by one. A subset is enough for the Saab transform because it has been shown that the kernels come out similar once the number of samples is large. For XGBoost, we still use all the samples to train. I will update the code as soon as possible!
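
The idea described above can be sketched roughly as follows. This is not the repo's implementation: subsample and predict_in_chunks are hypothetical helpers, predict_fn is a stand-in for whatever per-batch prediction the model exposes, and the defaults max_images=10000 and seed=777 are only borrowed from the fit signature quoted in this thread.

```python
import numpy as np

def subsample(images, max_images=10000, seed=777):
    """Return a random subset of at most max_images samples (for fitting only)."""
    n = len(images)
    if n <= max_images:
        return images
    rng = np.random.default_rng(seed)
    idx = np.sort(rng.choice(n, size=max_images, replace=False))
    return images[idx]

def predict_in_chunks(predict_fn, images, chunk_size=10000):
    """Apply predict_fn to consecutive chunks and concatenate the results."""
    outputs = [
        predict_fn(images[start:start + chunk_size])
        for start in range(0, len(images), chunk_size)
    ]
    return np.concatenate(outputs, axis=0)

# Toy data standing in for one region's patches (assumption for this sketch).
images = np.random.default_rng(0).random((250, 32, 32, 3), dtype=np.float32)

subset = subsample(images, max_images=100)        # used to fit the transform
mean_score = lambda batch: batch.mean(axis=(1, 2, 3))  # placeholder predictor
scores = predict_in_chunks(mean_score, images, chunk_size=64)
print(subset.shape, scores.shape)
```

Only the fitting step sees the subset. Every sample still goes through prediction, one chunk at a time, so peak memory is bounded by the chunk size rather than the dataset size.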

Thank you for the response!
Looking forward to your update.

Solved! Please try the new saab.py!

Hi,
I tried to run it but this time the issue seems different.
Thanks!

Did you update saab.py? Could you show me the fit function in your saab.py?

Thank you! The new saab.py works.
But this time the issue is XGBoost: it still needs a lot of memory.

DefakeHop/defakeHop.py

Line 155 in 941efb6

    clf = XGBClassifier(max_depth=1, tree_method='gpu_hist', objective='binary:logistic', eval_metric='auc',

Change gpu_hist to hist.

Thank you for your response! But now I have the same issue as @wasim004.
I updated saab.py and only modified the file path.
It gets killed at the right eye region with shape (134069,32,32,3).

DefakeHop/saab.py

Line 38 in 941efb6

def fit(self, images, max_images=10000, max_patches=1000000, seed=777):

Could you check at which line in fit your program gets killed?

Hi,
I updated saab.py but I am still getting the same error.

Hey, I guess the issue is still out of memory. I tried three input shapes:

(100000,32,32,3): It works!
(134069,32,32,3): Killed.
(761334,32,32,3): MemoryError: Unable to allocate array with shape (761334,32,32,3).
The error occurs in saab.py at output = np.zeros((N, H, W, n_channels), dtype="float64")

Please change all "float64" to "float32" in saab.py! And change the batch size from 50000 to 10000!

DefakeHop/saab.py

Line 111 in 941efb6

def transform(self, images, n_channels=-1, batch_size=50000):
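
To see why the dtype change matters, here is a small back-of-the-envelope sketch. array_gib is a hypothetical helper, and the shape used is the input shape reported above, not the actual (N, H, W, n_channels) output array allocated in saab.py, so the numbers are only indicative.

```python
import math
import numpy as np

def array_gib(shape, dtype):
    """Memory needed for a dense array of the given shape and dtype, in GiB."""
    n_elements = math.prod(shape)
    return n_elements * np.dtype(dtype).itemsize / 2**30

shape = (761334, 32, 32, 3)  # input shape reported in this thread
gib64 = array_gib(shape, "float64")
gib32 = array_gib(shape, "float32")
print(f"float64: {gib64:.2f} GiB, float32: {gib32:.2f} GiB")
```

Under these assumptions a single float64 array of that shape needs about 17.4 GiB, and float32 halves it to about 8.7 GiB. A smaller batch size reduces the temporary arrays created per batch on top of that.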

Hi @hongshuochen,

I've finally managed to run and train the model and get results.
I trained the model with both the original and the modified saab.py on the CelebDF-v2 dataset. I got the results below:

The results in your original paper are different from what I got for the CelebDF-v2 dataset.
Do you have any idea what the reason could be?

I also trained the model on CelebDF-v1.
I got the following results:

My Data: Frame(0.7971), Video(0.8453)
Your Data: Frame(0.9138), Video(0.9363)

The reason for this is that you changed from "float64" to "float32".
If you run with "float32" you can get the results that I get!

Hi,
Thanks for the response.
I did not change anything.
I ran your code as it is. I found a server to run it on, so I did not have to modify anything in the code.
Thanks!

Hi @wasim004, I think you only used 2 regions, right? Please use 3 regions!
I recloned the repo and downloaded the data from https://drive.google.com/drive/u/1/folders/1nEBe5wGPmm2G1NsR46NK8msHiCUE9f8K
I just ran Celeb-DF-v1, and this is the result I get!

Hi,
Still got the same results.

Can you check this line?
Your features shape should be (6065,540) instead of 360

DefakeHop/model.py

Line 153 in 941efb6

    
    model = Ensemble(regions=['left_eye', 'right_eye', 'mouth'], num_frames=6, verbose=True)
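
The 540 versus 360 difference is consistent with one region being left out. The sketch below illustrates that arithmetic only: it assumes each region contributes the same number of feature columns (180, inferred from 540 / 3 and 360 / 2, not confirmed by the repo) and that region features are concatenated column-wise. stack_region_features is a hypothetical helper.

```python
import numpy as np

N_SAMPLES = 6065            # number of rows quoted in this thread
FEATURES_PER_REGION = 180   # assumption: 540 columns / 3 regions

def stack_region_features(regions):
    """Concatenate placeholder per-region feature blocks column-wise."""
    per_region = [
        np.zeros((N_SAMPLES, FEATURES_PER_REGION), dtype=np.float32)
        for _ in regions
    ]
    return np.concatenate(per_region, axis=1)

three = stack_region_features(['left_eye', 'right_eye', 'mouth'])
two = stack_region_features(['left_eye', 'right_eye'])
print(three.shape, two.shape)
```

If the feature matrix has 360 columns, it is worth checking that the mouth region is present in both the data folder and the regions list.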

Hi,
The actual problem is that my input shape is 5190 and yours is 75242.
Also, the data I get after running data.py is 500 MB in size and yours is 783 MB.

So, I tried to re-extract the landmarks and patches and followed the same steps but again I got the same data size and shape.
My CelebDF-V1 dataset contains Test(32(Real)+159(Fake))+Train(126(Real)+639(Fake)) videos.
My CelebDF-V2 dataset contains Test(118(Real)+1128(Fake))+Train(472(Real)+4511(Fake)) videos.

Can you please confirm your datasets sizes?
Thanks!

Hi,
After including the mouth region the results on your data are now fine.

Thanks!