Cross validation fails with error during training
vineet-joshi opened this issue · 0 comments
Hello, this implementation does (should do) exactly what I need for a project I am working on.
However, I could not get the older versions of the torch+cuda and numpy modules to work on the NVIDIA L4 GPU I am using for the project. I upgraded torch to version 1.13.1, and the GPU has CUDA 12.4 installed. I also had to upgrade numpy to version 1.21.6; without that upgrade I get the following error:
File "train.py", line 120, in main
_main(args)
File "train.py", line 114, in _main
run(args)
File "train.py", line 32, in run
from svoice.solver import Solver
File "/home/vineet/svoice/svoice/solver.py", line 23, in <module>
from .evaluate import evaluate
File "/home/vineet/svoice/svoice/evaluate.py", line 16, in <module>
from pesq import pesq
File "/home/vineet/svoice/.testing/lib/python3.7/site-packages/pesq/__init__.py", line 6, in <module>
from .cypesq import cypesq
File "pesq/cypesq.pyx", line 1, in init cypesq
ImportError: numpy.core.multiarray failed to import (auto-generated because you didn't call 'numpy.import_array()' after cimporting numpy; use '<void>numpy._import_array' to disable if you are certain you don't need it).
After updating these, I was able to get the training script, train.py, to start without interpreter errors, but the script fails during the cross validation step with the following error:
[2024-06-01 16:03:45,776][__main__][INFO] - For logs, checkpoints and samples check /home/vineet/svoice/outputs/exp_
[2024-06-01 16:03:56,183][__main__][INFO] - Running on host training-l4-2-vcpus-24-ram-96-ubuntu
[2024-06-01 16:03:58,471][svoice.solver][DEBUG] - Checkpoint will be saved to /home/vineet/svoice/outputs/debug/model.th
[2024-06-01 16:03:58,472][svoice.solver][INFO] - ----------------------------------------------------------------------
[2024-06-01 16:03:58,472][svoice.solver][INFO] - Training...
[2024-06-01 16:03:59,818][svoice.solver][INFO] - Train | Epoch 1 | 3/15 | 3.5 it/sec | Loss 21.13142
[2024-06-01 16:04:00,384][svoice.solver][INFO] - Train | Epoch 1 | 6/15 | 4.1 it/sec | Loss 21.46726
[2024-06-01 16:04:00,954][svoice.solver][INFO] - Train | Epoch 1 | 9/15 | 4.4 it/sec | Loss 21.30898
[2024-06-01 16:04:01,521][svoice.solver][INFO] - Train | Epoch 1 | 12/15 | 4.6 it/sec | Loss 21.40352
[2024-06-01 16:04:02,067][svoice.solver][INFO] - Train | Epoch 1 | 15/15 | 4.7 it/sec | Loss 21.39990
[2024-06-01 16:04:02,070][svoice.solver][INFO] - Train Summary | End of Epoch 1 | Time 3.60s | Train Loss 21.39990
[2024-06-01 16:04:02,070][svoice.solver][INFO] - ----------------------------------------------------------------------
[2024-06-01 16:04:02,070][svoice.solver][INFO] - Cross validation...
[2024-06-01 16:04:02,330][__main__][ERROR] - Some error happened
Traceback (most recent call last):
File "train.py", line 120, in main
_main(args)
File "train.py", line 114, in _main
run(args)
File "train.py", line 95, in run
solver.train()
File "/home/vineet/svoice/svoice/solver.py", line 133, in train
valid_loss = self._run_one_epoch(epoch, cross_valid=True)
File "/home/vineet/svoice/svoice/solver.py", line 213, in _run_one_epoch
estimate_source = self.dmodel(mixture)
File "/home/vineet/svoice/.testing/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/vineet/svoice/svoice/models/swave.py", line 256, in forward
mixture_w = self.encoder(mixture)
File "/home/vineet/svoice/.testing/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/vineet/svoice/svoice/models/swave.py", line 284, in forward
mixture_w = F.relu(self.conv(mixture))
File "/home/vineet/svoice/.testing/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/vineet/svoice/.testing/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 313, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/vineet/svoice/.testing/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 310, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (0). Kernel size: (8). Kernel size can't be greater than actual input size
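For reference, the arithmetic behind this message can be sketched in plain Python. The function below follows the output-length formula given in the torch.nn.Conv1d documentation; the kernel size of 8 comes from the error above, while the stride of 4 and the input length of 16000 samples are assumptions made only for illustration, and conv1d_output_length is a hypothetical helper, not part of svoice or torch.

```python
def conv1d_output_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution over `length` input samples."""
    padded = length + 2 * padding
    effective_kernel = dilation * (kernel_size - 1) + 1
    if padded < effective_kernel:
        # This is the condition the RuntimeError above reports.
        raise ValueError(
            f"padded input size ({padded}) is smaller than the kernel ({effective_kernel})"
        )
    return (padded - effective_kernel) // stride + 1


# Kernel size 8 is from the error message; stride 4 and 16000 samples are assumed.
print(conv1d_output_length(16000, kernel_size=8, stride=4))

try:
    conv1d_output_length(0, kernel_size=8, stride=4)
except ValueError as err:
    print("empty input:", err)
```

A padded input size of 0 means the tensor reaching the encoder had no samples along its time axis, so the check fails before any convolution is computed.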
After doing some searching, it appears that this could be caused by the training input .wav files. However, I am using the training dataset provided with the repo, so I expected it to work out of the box.
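One way to test the .wav hypothesis is to scan the dataset directory for files that contain fewer samples than the convolution kernel. The sketch below uses only the standard library wave module, which reads PCM-encoded WAV files; find_short_wavs is a hypothetical helper, and the threshold of 8 is taken from the kernel size in the error message. It is a diagnostic aid under those assumptions, not part of the svoice code.

```python
import sys
import wave
from pathlib import Path

KERNEL_SIZE = 8  # kernel size reported in the RuntimeError


def find_short_wavs(root, min_frames=KERNEL_SIZE):
    """Return (path, frame_count) for every .wav under `root` shorter than `min_frames`."""
    short = []
    for path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            n_frames = wav.getnframes()
        if n_frames < min_frames:
            short.append((str(path), n_frames))
    return short


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the dataset directory as the first command-line argument.
    for wav_path, n_frames in find_short_wavs(sys.argv[1]):
        print(f"{wav_path}: {n_frames} frames")
```

If the scan reports nothing, the files themselves are long enough, and the empty tensor is more likely produced later, for example when the validation loader segments or pads the audio.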
If I skip the cross validation step by setting the cross_valid parameter to False in the solver.py script, the training progresses, but I then encounter errors in the forward() method of the SWave model's Encoder, where the Conv1d() call fails. I also tried upgrading to Python 3.12, with corresponding updates to the dependencies, but ran into the same issues.
When I skip steps such as cross validation, or work around the Conv1d() issues by providing default or empty tensors, I am able to get the training and evaluation to run, but the output speaker files have a monotone, continuous beeping sound overlaid on the speaker's voice, which I assume is a result of not performing cross validation or the convolutions.
Any help in this regard is much appreciated. If I can get this implementation working, it is an ideal fit for a social project I am working on. Please let me know if you need additional information. Thanks.