slp-rl/aero

Model Comparison

psp0001060 opened this issue · 2 comments

I am very interested in your paper, thank you for sharing.

May I ask how the different models were compared? For example, how many epochs are appropriate for training nuwave2? And was the code of the comparison models (such as nuwave2) merged into the AERO code to produce the comparison results?

Check out this repo:
https://github.com/ZFTurbo/Audio-separation-models-checker

It computes the SDR of the outputs of audio separation models, such as aero and demucs, against the ground-truth audio.
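For reference, here is a minimal sketch of a plain SDR computation, assuming the basic definition 10 * log10(||reference||^2 / ||reference - estimate||^2). The function name `sdr_db` and the `eps` guard are illustrative choices, and the linked repo may compute a different variant of the metric.

```python
import math


def sdr_db(reference, estimate, eps=1e-10):
    """Signal-to-distortion ratio in dB for two equal-length sample sequences.

    Illustrative helper (not taken from the linked repo): it uses the plain
    energy-ratio definition, with eps guarding against division by zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the ground-truth signal.
    signal_energy = sum(r * r for r in reference)
    # Energy of the residual between ground truth and model output.
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy samples, made up for illustration only.
reference = [1.0, 0.0, -1.0, 0.0]
estimate = [0.9, 0.0, -0.9, 0.0]
score = sdr_db(reference, estimate)
```

A higher value means the model output is closer to the ground truth.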

Hi,
I reached out to the authors of nuwave2, and they told me that they trained nuwave2 for "1.2M ~ 1.5M steps."
So the number of epochs follows from the size of the dataset and the chosen batch size: you train until you reach that number of steps. If I recall correctly, since I used the same VCTK dataset as they did, it was sufficient for me to use the pre-trained checkpoints, and I didn't need to train from scratch.

For the SeaNET model, I used a similar approach. In the paper, they mention: "In all our experiments, we train for 1 million steps with a batch size of 16 using the same optimizer parameters and ..". So, given the batch size and the dataset, I chose the number of epochs so that n_epochs * n_steps_per_epoch = 1M steps in total.
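A minimal sketch of that epoch calculation, assuming one optimizer step per batch. The helper name `epochs_for_target_steps` and the training-set size of 40,000 examples are hypothetical and chosen only for illustration; they are not the values used in the experiments.

```python
import math


def epochs_for_target_steps(target_steps, n_train_examples, batch_size, drop_last=False):
    """Return (n_epochs, n_steps_per_epoch) needed to reach at least target_steps.

    Hypothetical helper: assumes one optimizer step per batch and no
    gradient accumulation.
    """
    if drop_last:
        # The final incomplete batch of each epoch is discarded.
        n_steps_per_epoch = n_train_examples // batch_size
    else:
        # The final incomplete batch still counts as one step.
        n_steps_per_epoch = math.ceil(n_train_examples / batch_size)
    if n_steps_per_epoch <= 0:
        raise ValueError("dataset is too small for the given batch size")
    # Round up so that n_epochs * n_steps_per_epoch >= target_steps.
    n_epochs = math.ceil(target_steps / n_steps_per_epoch)
    return n_epochs, n_steps_per_epoch


# 1M steps at batch size 16, as quoted from the paper;
# the 40,000-example dataset size is an assumed placeholder.
n_epochs, n_steps_per_epoch = epochs_for_target_steps(1_000_000, 40_000, 16)
print(n_epochs, n_steps_per_epoch)
```

If the training code counts steps differently (for example, with gradient accumulation), the steps-per-epoch term needs to be adjusted accordingly.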

I did not incorporate the nuwave2 code into my own code; I ran it directly from their repository.

Hope this helps,
The author