piotrkawa/deepfake-whisper-features

About Learning Rate and Training Data


Hello,

Thanks for this nice work. I have some questions. Firstly when I used tensorboard to monitor training curves I realized that the learning rate didn't change. Why do you use constant learning rate instead of learning rate decay ? Is there any advantage to using constant learning rate ?
When I take a look your paper , I can't see any explanation about this. I am trainig the specrnet model.

My second question is about the spoof and bonafide data. How much data, or how many hours of spoof and bonafide data, do you actually use?

Thanks for your time.

Hi,
Yes - we did not use any LR scheduling technique. In the experiments, we focused on the front-ends and the differences between them. This way, we showed that a simple change of the front-end from algorithmic ones (like MFCC or LFCC) to Whisper features can improve generalization.

The results can be enhanced further by using scheduling techniques, data augmentation (e.g., RawBoost), or a larger dataset (we wanted the training procedure to be completed in less than 24 hours, so we used only ~100k samples).
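
For reference, here is a minimal sketch of how an LR scheduler could be plugged into a standard PyTorch training loop. The model, optimizer settings, and the cosine schedule below are illustrative assumptions, not our exact training configuration:

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingLR

# Illustrative stand-ins; the real model and data loader come from the training pipeline.
model = nn.Linear(128, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

num_epochs = 10
# Decay the learning rate from 1e-4 towards ~0 over the course of training.
scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs)

for epoch in range(num_epochs):
    # ... run one training epoch here ...
    scheduler.step()  # update the learning rate once per epoch
    print(f"epoch {epoch}: lr = {scheduler.get_last_lr()[0]:.2e}")
```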

To improve the model's results, I would use larger Whisper models and larger (more diverse) datasets.
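
As a rough sketch of the "larger Whisper model" direction: the openai-whisper package exposes encoder features that could serve as a front-end. The "medium" size, the file name, and the shapes below are assumptions for illustration, not the exact setup from the paper:

```python
import torch
import whisper  # pip install openai-whisper

# "medium" is an illustrative choice of a larger model; the paper's setup may differ.
model = whisper.load_model("medium")
model.eval()

# Load and pad/trim a waveform to Whisper's 30-second input window.
audio = whisper.load_audio("sample.wav")
audio = whisper.pad_or_trim(audio)

# Compute the log-Mel spectrogram and run only the encoder to get front-end features.
mel = whisper.log_mel_spectrogram(audio)
mel = mel.to(model.device, dtype=next(model.parameters()).dtype)
with torch.no_grad():
    features = model.encoder(mel.unsqueeze(0))  # (1, n_frames, d_model)

print(features.shape)
```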

Best,
Piotr

Thank you so much :) For each class, how many hours of data do you actually use? @piotrkawa