auspicious3000/SpeechSplit

About downsampling implementation.

yangdongchao opened this issue · 1 comment

In your experiments, you downsample the number of frames from 192 to 24 and then recover the time resolution with a repeat_interleave operation. So my question is: if we do not use downsampling, will the performance decrease?
In your experiments, the upsampling operation follows the downsampling operation. So is the aim to discard some information?

Downsampling is an information bottleneck.
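
To make the bottleneck concrete, here is a minimal sketch of the downsample-then-repeat pattern discussed above. It is not the repository's code. It uses NumPy, where `np.repeat(..., axis=0)` plays the role of `torch.repeat_interleave(..., dim=0)`. The feature size `D`, the random input, and the choice to keep the last frame of every group of 8 are assumptions made for illustration only.

```python
import numpy as np

T, FACTOR = 192, 8   # 192 frames -> 24 codes, as in the question
D = 4                # hypothetical feature size, chosen only for this sketch

rng = np.random.default_rng(0)
frames = rng.standard_normal((T, D))   # stand-in for encoder outputs

# Downsample: keep one frame out of every FACTOR frames (assumed strategy).
codes = frames[FACTOR - 1::FACTOR]     # shape (24, D)

# Upsample: repeat each code FACTOR times along the time axis,
# analogous to torch.repeat_interleave(codes, FACTOR, dim=0).
recovered = np.repeat(codes, FACTOR, axis=0)   # shape (192, D)

# The time resolution is restored, but only 24 distinct frames remain.
distinct = len(np.unique(recovered, axis=0))
print(codes.shape, recovered.shape, distinct)
```

The upsampled sequence has the original length, yet every block of 8 consecutive frames is identical, so whatever varied inside a block is gone. In that sense the repeat step only restores the shape, and the discarding happens at the downsampling step.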