Waveglow as Inverse STFT function
ajinkyakulkarni14 opened this issue · 2 comments
Hello
I am currently training a WaveGlow model from scratch to act as an inverse STFT function, using 20K noise+speech samples. The configuration I am training with is attached below. After 420K iterations, I synthesized audio waveforms from STFT input, and the results contain a whistling sound. Can anyone suggest roughly how many iterations I should train for, whether 20K samples are sufficient, and any other guidelines for improving the model?
Thanks
{
  "train_config": {
    "fp16_run": false,
    "output_directory": "checkpoints",
    "epochs": 100000,
    "learning_rate": 1e-4,
    "sigma": 1.0,
    "iters_per_checkpoint": 20000,
    "batch_size": 1,
    "seed": 1234,
    "checkpoint_path": "",
    "with_tensorboard": false
  },
  "data_config": {
    "training_files": "train_list.txt",
    "segment_length": 16000,
    "sampling_rate": 16000,
    "filter_length": 511,
    "hop_length": 256,
    "win_length": 511,
    "mel_fmin": 0.0,
    "mel_fmax": 8000.0
  },
  "dist_config": {
    "dist_backend": "nccl",
    "dist_url": "tcp://localhost:54321"
  },
  "waveglow_config": {
    "n_mel_channels": 256,
    "n_flows": 12,
    "n_group": 8,
    "n_early_every": 4,
    "n_early_size": 2,
    "WN_config": {
      "n_layers": 8,
      "n_channels": 256,
      "kernel_size": 3
    }
  }
}
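For anyone reading the configuration above, here is a minimal Python sketch of the arithmetic that ties its fields together. It assumes the modified data loader feeds a one-sided STFT magnitude (n_fft // 2 + 1 bins) into the network in place of the mel spectrogram, which the post does not confirm. The helper name check_config and the keys of its report are made up for illustration; only the numbers come from the posted config.

```python
import json

# Subset of the configuration posted above (values copied from the issue).
CONFIG_TEXT = """
{
  "data_config": {"segment_length": 16000, "sampling_rate": 16000,
                  "filter_length": 511, "hop_length": 256, "win_length": 511},
  "waveglow_config": {"n_mel_channels": 256, "n_group": 8}
}
"""


def check_config(cfg):
    """Return a few derived quantities for a quick consistency check.

    Assumption: the conditioning input is a one-sided STFT, so it has
    filter_length // 2 + 1 frequency bins.
    """
    data = cfg["data_config"]
    wg = cfg["waveglow_config"]
    stft_bins = data["filter_length"] // 2 + 1
    return {
        "stft_bins": stft_bins,
        "bins_match_conditioning": stft_bins == wg["n_mel_channels"],
        "segment_seconds": data["segment_length"] / data["sampling_rate"],
        "hops_per_segment": data["segment_length"] / data["hop_length"],
        "segment_divisible_by_n_group": data["segment_length"] % wg["n_group"] == 0,
        "window_fits_fft": data["win_length"] <= data["filter_length"],
    }


report = check_config(json.loads(CONFIG_TEXT))
for name, value in report.items():
    print(f"{name}: {value}")
```

Under that assumption, a filter_length of 511 gives 256 bins, which matches n_mel_channels, and each training segment covers one second of audio, or 62.5 hops.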
Can you share a couple audio samples and your loss curves for training and validation?
Please see the materials attached below.
At iteration 345484 the loss suddenly jumped to 2059746816.000000000, which explains the spike. I also plotted the loss over shorter ranges of iterations and observed further spikes in between.
I am also attaching the training and data loader code for the STFT-based model for your reference.
The main purpose of building an inverse-STFT WaveGlow is to use it as a pretrained model and then train it further for speech enhancement.
Can you suggest what I should do to optimize the model better?
samples_orignal_and_waveglow.zip
loss_log_waveglow_istft.txt
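To locate spikes like the one at iteration 345484 without plotting, a short script can scan the loss log. This is a rough sketch only: it assumes each log line has the form "iteration:<tab>loss", which may not match the attached file. The function names, the window size, and the threshold are arbitrary illustrative choices, and every sample value except the reported spike is invented.

```python
import math
import statistics


def parse_loss_log(lines):
    """Yield (iteration, loss) pairs from lines shaped like '345484:\t2.5'.

    Lines that do not fit the assumed shape are skipped silently.
    """
    for line in lines:
        head, sep, tail = line.partition(":")
        if not sep:
            continue
        try:
            yield int(head.strip()), float(tail.strip())
        except ValueError:
            continue


def find_spikes(records, window=100, threshold=10.0, warmup=5):
    """Flag losses that are non-finite or far from the recent median.

    The baseline is the median of up to `window` previously accepted losses;
    flagged values are kept out of the baseline so one spike does not hide
    the next.
    """
    history = []
    spikes = []
    for iteration, loss in records:
        if not math.isfinite(loss):
            spikes.append((iteration, loss))
            continue
        if len(history) >= warmup:
            baseline = statistics.median(history[-window:])
            if abs(loss - baseline) > threshold:
                spikes.append((iteration, loss))
                continue
        history.append(loss)
    return spikes


# Illustrative lines; only the 345484 entry is taken from the thread.
sample_lines = [
    "345478:\t-4.90", "345479:\t-4.88", "345480:\t-4.91",
    "345481:\t-4.87", "345482:\t-4.93", "345483:\t-4.89",
    "345484:\t2059746816.000000000", "345485:\t-4.85",
]
spikes = find_spikes(parse_loss_log(sample_lines))
print(spikes)
```

On a real log, replace sample_lines with the lines of the file, for example open("loss_log_waveglow_istft.txt").read().splitlines(), and tune the threshold to the typical loss range.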