About Learning Rate and Training Data
Opened this issue · 2 comments
Hello,
Thanks for this nice work. I have some questions. Firstly, when I used TensorBoard to monitor the training curves, I noticed that the learning rate didn't change. Why do you use a constant learning rate instead of learning rate decay? Is there any advantage to using a constant learning rate?
When I took a look at your paper, I couldn't find any explanation about this. I am training the SpecRNet model.
My second question is about the spoof and bonafide data. How much data, or how many hours of spoof and bonafide audio, do you actually use?
Thanks for your time.
Hi,
Yes - we did not use any LR scheduling technique. In the experiments, we focused on the front-ends and the differences between them. This way, we showed that a simple change of the front-end from algorithmic ones (like MFCC or LFCC) to Whisper features can improve generalization.
The results can be enhanced further by using scheduling techniques, data augmentation (e.g. RawBoost), or a larger dataset (we wanted the training procedure to complete in less than 24 hours, so we used only ~100k samples).
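For reference, here is a minimal sketch of the difference between a constant learning rate and a decayed one. This is not the repository's training code; the base rate, decay factor, and step size below are purely illustrative:

```python
def constant_lr(base_lr: float, epoch: int) -> float:
    # Constant schedule: the LR never changes, which is why the
    # TensorBoard curve appears flat throughout training.
    return base_lr


def step_decay_lr(base_lr: float, epoch: int,
                  gamma: float = 0.5, step_size: int = 10) -> float:
    # Hypothetical step-decay schedule: multiply the LR by `gamma`
    # every `step_size` epochs (values chosen for illustration only).
    return base_lr * (gamma ** (epoch // step_size))


# At epoch 25 the constant schedule still yields the base rate,
# while the step-decay schedule has halved it twice.
print(constant_lr(1e-4, 25))    # 0.0001
print(step_decay_lr(1e-4, 25))  # 2.5e-05
```

In PyTorch, the step-decay variant corresponds to wrapping the optimizer in `torch.optim.lr_scheduler.StepLR` and calling `scheduler.step()` once per epoch.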
To improve the model's results, I would use larger Whisper models and larger (more diverse) datasets.
Best,
Piotr
Thank you so much :) For each class, how many hours of data do you actually use? @piotrkawa