xcmyz/FastVocoder

Shape mismatch error on new dataset


Hi, thanks for your work!

The sample rate of my dataset is 22050 Hz, and the hop size of my text2mel model is 256. I have changed hparams.py accordingly, but training raises an exception (preprocessing ran fine):

  File "/home/user/speechlab/FastVocoder-main/model/loss/loss.py", line 23, in forward
    assert est_source_sub_band.size(1) == wav_sub_band.size(1)

I figured out that model inference still uses a hop size of 240. How can I make the code fully compatible with other datasets? It seems to be somewhat hardcoded for the Biaobei dataset.
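To illustrate the kind of mismatch I suspect, here is a minimal sketch. It is not code from this repository: the function name `sub_band_lengths` and the band count of 4 are assumptions for illustration only. It compares the per-band length of a target waveform cut with the dataset hop size against the length a model would produce if its total upsampling factor were still 240.

```python
def sub_band_lengths(n_frames, data_hop, model_hop, n_bands=4):
    """Return (target_len, estimated_len) per sub-band.

    n_frames  -- number of mel frames in the training segment
    data_hop  -- hop size used when the mel spectrograms were extracted
    model_hop -- total upsampling factor built into the vocoder
    n_bands   -- number of sub-bands (4 is an assumed value)
    """
    target_len = n_frames * data_hop // n_bands      # from the ground-truth wav
    estimated_len = n_frames * model_hop // n_bands  # from the model output
    return target_len, estimated_len


# Hop size 256 in the data, 240 in the model: lengths differ.
print(sub_band_lengths(100, 256, 240))  # (6400, 6000)

# Both set to 256: lengths agree.
print(sub_band_lengths(100, 256, 256))  # (6400, 6400)
```

If the real cause is similar, the assertion in loss.py would fail whenever the model's upsampling factor and the hop size in hparams.py disagree, regardless of the dataset.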

@tekinek Have you solved this? Does the preprocessing step affect training when a different sample rate is used:

[image]

And do the weights generated by TasNet still work in this case?

Can you share your hparams config for 22050 Hz?