How to train your own model and apply it? I have come this far but am having a problem at solver.py
FurkanGozukara opened this issue · 8 comments
OK, I have downloaded Visual Studio Code to debug and understand the code.
I see that make_spect_f0.py is used to generate the raptf0 and spmel folders with values.
So make_spect_f0.py reads a folder and decides whether it is a male voice or a female voice from the spk2gen.pkl file.
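To double-check that mapping, I inspected spk2gen.pkl with a quick script (this assumes it is a plain pickled dict such as {'p285': 'M', ...} and that the path matches my checkout):

import pickle

# Quick sketch: inspect the speaker-to-gender mapping used by make_spect_f0.py.
# Assumption: spk2gen.pkl is a plain pickled dict like {'p285': 'M', ...}.
with open('assets/spk2gen.pkl', 'rb') as f:
    spk2gen = pickle.load(f)

print(len(spk2gen), 'speakers in the mapping')
print('p285 ->', spk2gen.get('p285'))   # expected to show the gender tag for p285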
As a first step, I deleted the raptf0, spmel, and wavs folders, then created a new wavs folder and, inside it, another folder named p285, which is a folder assigned to a male speaker.
Then inside p285 I put my wav file myfile.wav, which is more than 2 hours long.
Question 1: Does it have to be 16 kHz and mono, or can we use maximum quality?
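In case the answer is 16 kHz mono, this is the conversion I would run first (librosa and soundfile are my choice here, and the file names are placeholders):

import librosa
import soundfile as sf

# Hedged sketch: downmix to mono and resample to 16 kHz before feature extraction.
# 'myfile_raw.wav' and the output path are placeholder names; the output folder
# (wavs/p285) must already exist.
y, sr = librosa.load('myfile_raw.wav', sr=16000, mono=True)   # resample + downmix
sf.write('wavs/p285/myfile.wav', y, 16000, subtype='PCM_16')  # write 16-bit PCM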
After I run make_spect_f0.py, it creates a myfile.npy in both the raptf0 and spmel folders.
Then I ran make_metadata.py and it created train.pkl inside spmel.
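To sanity-check what make_metadata.py wrote, I printed train.pkl like this (assuming it is a pickled list with one entry per speaker, holding the speaker name, an embedding, and then the utterance file names):

import pickle

# Quick sketch: count how many utterances ended up in the training metadata.
# Assumption: train.pkl is a pickled list with one entry per speaker, laid out
# as [speaker_name, speaker_embedding, utterance_1, utterance_2, ...].
with open('assets/spmel/train.pkl', 'rb') as f:
    metadata = pickle.load(f)

for entry in metadata:
    print(entry[0], '->', len(entry) - 2, 'utterance(s)')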
Then when I run main.py, I get the error below at solver.py.
I want to train a model; I don't want to test.
Then I want to use this model to convert the style of a speech recording to the trained speaker.
So I need help, thank you.
Here is the console output from running main.py:
PS C:\SpeechSplit> c:; cd 'c:\SpeechSplit'; & 'C:/Python37/python.exe' 'c:\Users\King\.vscode\extensions\ms-python.python-2020.12.424452561\pythonFiles\lib\python\debugpy\launcher' '55577' '--' 'c:\SpeechSplit\main.py'
Namespace(beta1=0.9, beta2=0.999, device_id=0, g_lr=0.0001, log_dir='run/logs', log_step=10, model_save_dir='run/models', model_save_step=1000, num_iters=1000000, resume_iters=None, sample_dir='run/samples', sample_step=1000, use_tensorboard=False)
Hyperparameters:
freq: 8
dim_neck: 8
freq_2: 8
dim_neck_2: 1
freq_3: 8
dim_neck_3: 32
dim_enc: 512
dim_enc_2: 128
dim_enc_3: 256
dim_freq: 80
dim_spk_emb: 82
dim_f0: 257
dim_dec: 512
len_raw: 128
chs_grp: 16
min_len_seg: 19
max_len_seg: 32
min_len_seq: 64
max_len_seq: 128
max_len_pad: 192
root_dir: assets/spmel
feat_dir: assets/raptf0
batch_size: 16
mode: train
shuffle: True
num_workers: 0
samplier: 8
Finished loading train dataset...
Generator_3(
(encoder_1): Encoder_7(
(convolutions_1): ModuleList(
(0): Sequential(
(0): ConvNorm(
(conv): Conv1d(80, 512, kernel_size=(5,), stride=(1,), padding=(2,))
)
(1): GroupNorm(32, 512, eps=1e-05, affine=True)
)
(1): Sequential(
(0): ConvNorm(
(conv): Conv1d(512, 512, kernel_size=(5,), stride=(1,), padding=(2,))
)
(1): GroupNorm(32, 512, eps=1e-05, affine=True)
)
(2): Sequential(
(0): ConvNorm(
(conv): Conv1d(512, 512, kernel_size=(5,), stride=(1,), padding=(2,))
)
(1): GroupNorm(32, 512, eps=1e-05, affine=True)
)
)
(lstm_1): LSTM(512, 8, num_layers=2, batch_first=True, bidirectional=True)
(convolutions_2): ModuleList(
(0): Sequential(
(0): ConvNorm(
(conv): Conv1d(257, 256, kernel_size=(5,), stride=(1,), padding=(2,))
)
(1): GroupNorm(16, 256, eps=1e-05, affine=True)
)
(1): Sequential(
(0): ConvNorm(
(conv): Conv1d(256, 256, kernel_size=(5,), stride=(1,), padding=(2,))
)
(1): GroupNorm(16, 256, eps=1e-05, affine=True)
)
(2): Sequential(
(0): ConvNorm(
(conv): Conv1d(256, 256, kernel_size=(5,), stride=(1,), padding=(2,))
)
(1): GroupNorm(16, 256, eps=1e-05, affine=True)
)
)
(lstm_2): LSTM(256, 32, batch_first=True, bidirectional=True)
(interp): InterpLnr()
)
(encoder_2): Encoder_t(
(convolutions): ModuleList(
(0): Sequential(
(0): ConvNorm(
(conv): Conv1d(80, 128, kernel_size=(5,), stride=(1,), padding=(2,))
)
(1): GroupNorm(8, 128, eps=1e-05, affine=True)
)
)
(lstm): LSTM(128, 1, batch_first=True, bidirectional=True)
)
(decoder): Decoder_3(
(lstm): LSTM(164, 512, num_layers=3, batch_first=True, bidirectional=True)
(linear_projection): LinearNorm(
(linear_layer): Linear(in_features=1024, out_features=80, bias=True)
)
)
)
G
The number of parameters: 19437800
Current learning rates, g_lr: 0.0001.
Start training...
We've got an error while stopping in unhandled exception: <class 'StopIteration'>.
Traceback (most recent call last):
File "c:\Users\King\.vscode\extensions\ms-python.python-2020.12.424452561\pythonFiles\lib\python\debugpy\_vendored\pydevd\pydevd.py", line 1994, in do_stop_on_unhandled_exception
self.do_wait_suspend(thread, frame, 'exception', arg, EXCEPTION_TYPE_UNHANDLED)
File "c:\Users\King\.vscode\extensions\ms-python.python-2020.12.424452561\pythonFiles\lib\python\debugpy\_vendored\pydevd\pydevd.py", line 1855, in do_wait_suspend
keep_suspended = self._do_wait_suspend(thread, frame, event, arg, suspend_type, from_this_thread, frames_tracker)
File "c:\Users\King\.vscode\extensions\ms-python.python-2020.12.424452561\pythonFiles\lib\python\debugpy\_vendored\pydevd\pydevd.py", line 1890, in _do_wait_suspend
time.sleep(0.01)
Man, I would be amazed if just once an open source project like this worked when the instructions are followed.
There are some that work; some even provide environment files and require very little effort. It's free cutting-edge technology, so I can't complain.
It would be really cool to be able to make this one work.
Hi,
I don't know if this will help, but I thought to mention that you should also check the make_metadata.py file as well (if you haven't already) because, as it is, it's hardcoded; maybe that'll help debug the error.
I'm able to train with the test folder provided and I haven't tried training with custom data yet. I'll be sure to come back and update you if I run into the same error when I do.
@tejuafonja yes, I have seen it. It defines whether a sound file is male or female. I have given my folder the same name as a male one. I am still getting the error though.
I have uploaded my test here so you can check: https://github.com/FurkanGozukara/SpeechSplitTest
I will delete the repository once I can run it
Thank you very much
Hi @FurkanGozukara,
Probably a bit late, but maybe for anyone out there stumbling on this problem, here's a fix.
The main problem you have is the 2-hour-long wav file.
make_spect_f0 reads the file and computes the spectrogram and the f0.
This will, however, only generate one training file.
The error you're getting (StopIteration) is exactly because of that.
When you try to run the code, it will fetch your data (only one file in your case). At line 113 in solver.py:
data_iter = iter(data_loader)
This creates an iterator positioned at the start of the training data.
Further down, at lines 141-145, you see this:
try:
    x_real_org, emb_org, f0_org, len_org = next(data_iter)
except:
    data_iter = iter(data_loader)
    x_real_org, emb_org, f0_org, len_org = next(data_iter)
Here we try to get the next batch from the iterator, which isn't possible (since there's only one file for training).
We catch the exception, reload the data, and do exactly the same thing.
Normally (with more than one training file) this would solve the problem, since we start again at the beginning of our training data. But with only one file for training it will raise a StopIteration again.
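If it helps, here is a standalone sketch of that failure mode in plain PyTorch (not SpeechSplit's own loader; I'm assuming the training loader drops incomplete batches): with fewer samples than the batch size there are zero full batches, so recreating the iterator doesn't help.

import torch
from torch.utils.data import DataLoader, TensorDataset

# Standalone sketch of the failure mode, not SpeechSplit's own code.
# Assumption: the training loader drops incomplete batches, so a dataset with
# fewer samples than batch_size yields zero batches per epoch.
dataset = TensorDataset(torch.zeros(1, 80))            # a single training sample
loader = DataLoader(dataset, batch_size=16, drop_last=True)

data_iter = iter(loader)
try:
    batch = next(data_iter)                            # StopIteration: no full batch exists
except StopIteration:
    data_iter = iter(loader)                           # same recovery as solver.py
    try:
        batch = next(data_iter)                        # fails again for the same reason
    except StopIteration:
        print('StopIteration again: this is the crash seen in solver.py')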
So to solve this, just use more than one file. You can, for example, cut your 2-hour-long wav file into pieces and put them all in the p285 folder (it's important that the same voice goes in the same folder).
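If you don't want to cut the file by hand, here is a rough sketch that splits one long 16 kHz mono wav into fixed-length pieces (soundfile is my choice here; the 10-second chunk length and the paths are just example values, not project requirements):

import os
import soundfile as sf

# Rough sketch: split one long wav into fixed-length chunks so the training set
# contains many utterances instead of a single huge one. Any trailing remainder
# shorter than one chunk is dropped.
src = 'myfile.wav'                       # placeholder input path
out_dir = 'wavs/p285'                    # placeholder output folder
os.makedirs(out_dir, exist_ok=True)

audio, sr = sf.read(src)                 # expects a mono 16 kHz file
chunk = 10 * sr                          # 10-second pieces
for i in range(0, len(audio) - chunk + 1, chunk):
    piece = audio[i:i + chunk]
    sf.write(os.path.join(out_dir, f'myfile_{i // chunk:04d}.wav'), piece, sr)

With the file cut into pieces, the loader should see many full batches per epoch and the StopIteration should go away.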
@yenebeb so basically if I duplicate my training file it should work.
I will test, thank you.