Positional encoding
Hi Jan,
I think the positional encoding function is missing brackets around 10.0**4. To be sure, I checked locally, and the result differs without the brackets. Here is the current code snippet:
speedyspeech/code/functional.py
Lines 31 to 45 in 516eb1c
Could you please verify this and let me know? I can fix it and open a PR. Thank you.
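To illustrate why the brackets matter, here is a minimal sketch. It assumes the denominator has the form `10.0**4 ** (col / channels)`; the actual snippet is not reproduced in this thread, and the helper names below are hypothetical. Python's `**` operator is right-associative, so the unbracketed form is parsed as `10.0 ** (4 ** (col / channels))`.

```python
def denom_unbracketed(col, channels):
    # Assumed form of the reported expression (not copied from the repository).
    # ** binds right to left, so this is 10.0 ** (4 ** (col / channels)).
    return 10.0**4 ** (col / channels)


def denom_bracketed(col, channels):
    # Intended form: 10000 raised to the power col / channels.
    return (10.0**4) ** (col / channels)


# At col = 0 the exponent is 0, so the two forms give 10.0 and 1.0.
first_unbracketed = denom_unbracketed(0, 8)
first_bracketed = denom_bracketed(0, 8)
print(first_unbracketed, first_bracketed)
```

The two forms happen to agree when `col / channels` is exactly 0.5 (both give 100.0), so a comparison at a single column can hide the difference; comparing all columns is safer.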
Hi, good catch! Please use the develop branch, where this issue is fixed, as discussed in #23. Thanks! :)
Thanks Jan. I will use the develop branch. Let me close this issue here.
Hi Jan,
I have two questions about this:
- The positional encoding is weighted: the keys are weighted by a hyperparameter w equal to 6.42, whereas the queries have weight 1. Could you please explain how you chose this value?
speedyspeech/code/duration_extractor.py
Lines 261 to 263 in c4e5547
Lines 51 to 53 in c4e5547
- Given that I am using the new positional encoding function from the develop branch, do you have any suggestion for what weight I should choose?
My best guess, with my current understanding, is w = num_spectrogram_frames / num_phonemes, i.e. the average number of frames generated per phoneme, but I am not sure.
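To make the guess concrete, here is a small sketch of what that choice of w would do. It is not the repository's code: `positional_encoding` below is a hypothetical pure-Python stand-in for a standard sinusoidal encoding with positions scaled by w, and the sequence lengths are made up. With keys (phonemes) weighted by w and queries (frames) weighted by 1, the dot product between frame t and phoneme i equals a sum of cos(a_k (t - w i)) terms, so it peaks where t is close to w * i.

```python
import math


def positional_encoding(channels, length, w=1.0):
    # Hypothetical helper: sinusoidal encoding with positions scaled by w.
    enc = []
    for pos in range(length):
        row = []
        for k in range(channels // 2):
            angle = w * pos / (10.0**4) ** (2 * k / channels)
            row.extend([math.sin(angle), math.cos(angle)])
        enc.append(row)
    return enc


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Made-up lengths for illustration only.
num_phonemes = 10
num_spectrogram_frames = 60
w = num_spectrogram_frames / num_phonemes  # average frames per phoneme

keys = positional_encoding(16, num_phonemes, w=w)               # phoneme side
queries = positional_encoding(16, num_spectrogram_frames, w=1)  # frame side

# For a given frame, find the phoneme whose encoding matches best.
t = 30
best = max(range(num_phonemes), key=lambda i: dot(queries[t], keys[i]))
print(best, t / w)
```

Under these assumptions, the encodings alone favour a roughly diagonal alignment between frames and phonemes. Whether this is how the value 6.42 was chosen is not something the sketch can answer.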
Thank you very much