mostafaelaraby/wavegan-pytorch

THANK YOU!

Tylersuard opened this issue · 12 comments

I was trying to get the original Wavegan to run and tearing my hair out. Thank you for making it into a usable colab notebook, you saved me days of frustrated effort.

Ok I might have to take back that "thank you". I've been running the notebook for 8 hours now, and it's at 80%, but it's still generating empty .wav files.

Can confirm, I ran it all the way to 100%, only got empty wav files with a "click" noise somewhere in there. How can I fix this so it runs as expected?

Audio generation is an extremely slow process. If you hear noise, the generator is starting to produce output. I trained a speech enhancement GAN for two weeks on Google Cloud and it is still a bit noisy.

The training stopped at 100%. It is no longer training. It generated hundreds of .wav files, all of them have no sound but a little click.

@Tylersuard sorry for the late reply, but can you tell me which dataset you used, and whether you used the same parameters as in the notebook?

Hi, I may also have the same issue here. I am 38% of the way through; it's completed 37,000 steps, and every output directory has 10 files of silence. I've used your params, except that I set the input directory to "drums" and `window_length = 16384`, as the files are shorter. I realise this isn't at all a detailed description of the problem, but it does look like, out of the box, your code doesn't run correctly on colab. However I would like to thank you for your effort - just like Tyler I have been confused about how to implement WaveGAN on colab with Keras, and from browsing your code this appears to be a very good port of it to PyTorch.
EDIT: This is the suspect line in utils.py: `train_loader = WavDataLoader(os.path.join("piano", "train"), "wav")`. That should be `target_signals_dir` instead.
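For what it's worth, the suggested change is a one-line path fix; a hedged sketch of what the test block in utils.py might look like after it (everything except `target_signals_dir` and `WavDataLoader` is an assumption for illustration):

```python
import os

# Sketch of the suggested fix in utils.py's test block.
# Before (hard-coded dataset):
#   train_loader = WavDataLoader(os.path.join("piano", "train"), "wav")
# After (use the configured dataset directory):
target_signals_dir = "drums"  # assumed to come from the training params
train_dir = os.path.join(target_signals_dir, "train")
# train_loader = WavDataLoader(train_dir, "wav")  # WavDataLoader is defined in the repo
print(train_dir)
```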

@thenapking I will debug this problem within this week, and thank you for pointing out that suspected line

Thanks for looking into it. I don't think that's the culprit though. I wonder whether it's to do with the convolution itself and the sample length. I noticed when running on the drums folder, every 1,000 iterations the generated sound gets shorter until it is silent, somewhere around 7,000 iterations. I wonder if it is something to do either with upsampling using zeros or padding shorter files with silence?
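One quick way to quantify the "shrinking to silence" behaviour described above is to measure the peak amplitude of each checkpoint's output. A stdlib-only sketch (the `output/*.wav` path and the silence threshold are arbitrary assumptions, not the repo's layout):

```python
import glob
import wave
from array import array

def peak_amplitude(path):
    """Peak absolute sample of a 16-bit PCM WAV, normalized to [0.0, 1.0]."""
    with wave.open(path, "rb") as w:
        raw = w.readframes(w.getnframes())
    samples = array("h", raw)  # assumes 16-bit samples
    if not samples:
        return 0.0
    return max(abs(s) for s in samples) / 32768.0

# Flag near-silent generated files (threshold chosen arbitrarily)
for path in sorted(glob.glob("output/*.wav")):
    if peak_amplitude(path) < 1e-3:
        print("near-silent:", path)
```

Plotting this value per checkpoint would show whether the output amplitude really decays toward zero over iterations.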

I have updated the params, which were the cause of the problem, and cleaned up some parts of the code. Here is the latent space interpolation after 15,000 iterations:
@thenapking the line in utils.py is just for testing when you run utils.py directly; train.py uses `target_signals_dir`.
Some samples generated at iteration 15,000:
samples.zip

Thanks, yes, I realised after I posted that edit that the line in utils wasn't relevant... I was trying to figure out why the code might be generating silence without erroring, and I wondered whether it could be something as simple as a directory mix-up, which it wasn't. However, I have tried running your updated notebook on colab and it now gives the following error:

```
File "train.py", line 234, in <module>
    wave_gan.train()
File "train.py", line 162, in train
    val_real.data, generated.data
File "train.py", line 70, in calculate_discriminator_loss
    only_inputs=True,
File "/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py", line 192, in grad
    inputs, allow_unused)
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```
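That RuntimeError typically means `torch.autograd.grad` was handed a tensor built outside the autograd graph. In a WGAN-GP style gradient penalty (which the `calculate_discriminator_loss` frame in the traceback suggests), the interpolated batch needs `requires_grad` set before the critic's forward pass. A generic sketch of the pattern, not the repo's actual code (the stand-in critic and tensor shapes are assumptions):

```python
import torch

# Minimal reproduction of the error class: calling autograd.grad on a
# tensor computed with no graph attached raises "element 0 of tensors
# does not require grad and does not have a grad_fn".
real = torch.randn(4, 1, 64)
fake = torch.randn(4, 1, 64)

alpha = torch.rand(4, 1, 1)
interpolated = alpha * real + (1 - alpha) * fake
interpolated.requires_grad_(True)  # without this line, grad() below fails

critic = torch.nn.Conv1d(1, 1, kernel_size=3, padding=1)  # stand-in discriminator
score = critic(interpolated).sum()

grads = torch.autograd.grad(score, interpolated, only_inputs=True)[0]
penalty = ((grads.view(4, -1).norm(2, dim=1) - 1) ** 2).mean()
```

The same symptom appears if the critic's forward pass runs under `torch.no_grad()`, or if a validation flag disables gradient tracking, which matches the fix described below about the validate step.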

@thenapking it was a bug in the validation part, for which the flag was false in my test notebook; I have pushed a fix.

@mostafaelaraby Thanks for your work debugging this. I can confirm that running it on the drums directory for about 20,000 steps no longer produces silence across the board - although there are still some files which are completely silent. I'll run it for 100,000 steps and share some results. It's also so much faster than the original package: it ran 20,000 steps in under 1 hour, which had taken over 16 hours on colab with the original. Thanks again.