How would you train for bandwidth (BW) extension?

I'm interested in training a model to convert 24 kHz mel spectrograms into 48 kHz waveforms (like HiFi-GAN-2). It might not work without changing the architecture, but that's OK. How would you modify the config files to do this? I've already run the recipe through stage 1 to extract features from downsampled VCTK. Now I'm unsure how to modify the generator parameters in the HiFi-GAN config so that it produces a waveform twice as long.

You can simply increase the upsampling scales here.

ParallelWaveGAN/egs/ljspeech/voc1/conf/hifigan.v1.yaml, lines 38 to 39 in ffaa99f:

 upsample_scales: [8, 8, 2, 2]         # Upsampling scales.
 upsample_kernel_sizes: [16, 16, 4, 4] # Kernel size for upsampling layers.

E.g.,

 upsample_scales: [8, 8, 4, 2]         # Upsampling scales. 
 upsample_kernel_sizes: [16, 16, 8, 4] # Kernel size for upsampling layers.
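
To see why this doubles the output length, here is a rough Python sketch of the arithmetic. It assumes the features were extracted from 24 kHz audio with a hop size of 256 samples (an assumption, so check `hop_size` in your own config), and the helper names are hypothetical, not part of ParallelWaveGAN. Each mel frame is expanded by the product of `upsample_scales`, so changing that product from 256 to 512 yields twice as many output samples per frame. The kernel sizes in the quoted config are twice the corresponding scale, and the sketch simply follows that pattern.

```python
import math


def total_upsampling(upsample_scales):
    """Output samples produced per input mel frame (product of all scales)."""
    return math.prod(upsample_scales)


def kernel_sizes_for(upsample_scales):
    """Kernel size = 2 x scale, the pattern seen in the quoted config (assumption)."""
    return [2 * s for s in upsample_scales]


# Assumed feature-extraction settings (not confirmed by the thread).
input_sample_rate = 24000  # Hz, downsampled VCTK
hop_size = 256             # samples between mel frames at 24 kHz

original_scales = [8, 8, 2, 2]
extended_scales = [8, 8, 4, 2]

# Frames per second of audio, and the sample rate each generator would emit.
frames_per_second = input_sample_rate / hop_size
original_rate = frames_per_second * total_upsampling(original_scales)
extended_rate = frames_per_second * total_upsampling(extended_scales)

print(total_upsampling(original_scales), original_rate)  # 256 24000.0
print(total_upsampling(extended_scales), extended_rate)  # 512 48000.0
print(kernel_sizes_for(extended_scales))                 # [16, 16, 8, 4]
```

Note that the training targets then need to be 48 kHz waveforms aligned with the 24 kHz features, which may require further changes to the recipe.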

I see, thank you very much!

