Train_network.py can't handle smaller mel spectrogram shapes
Hi Dr. Hawley,
I noticed a small problem in the code in both train_network and eval_network: there is no error handling for files that produce spectrograms with a width smaller than 1293. This causes trouble when the training data is filled in from the mel spectrograms (X_train[train_count,:,:] = melgram, around line 140).
You have written code to chop off the extra width if a melgram is too long (melgram = melgram[:,:,:,0:mel_dims[3]]), but nothing to account for melgrams that are too short.
I was able to get around it by filling the empty space with zeros, but I thought it would be helpful to let you know!
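For anyone hitting the same thing, here is a rough sketch of the zero-fill workaround described above. It is not the code from the repo: the helper name fit_melgram_width and the example shapes are made up for illustration, and whether 0 is a sensible fill value depends on how the spectrogram is scaled.

```python
import numpy as np

def fit_melgram_width(melgram, target_width):
    """Return melgram with its last (time) axis exactly target_width long.

    Hypothetical helper, not part of the original scripts.
    """
    width = melgram.shape[-1]
    if width >= target_width:
        # Too long: keep the first target_width frames, like the existing slice
        return melgram[..., :target_width]
    # Too short: pad only the end of the last axis with zeros
    pad = [(0, 0)] * (melgram.ndim - 1) + [(0, target_width - width)]
    return np.pad(melgram, pad, mode="constant", constant_values=0)

short = np.ones((1, 1, 96, 431))   # assumed layout: (batch, channel, mels, frames)
print(fit_melgram_width(short, 1293).shape)
```

In the training loop this would be called on each melgram before it is assigned into X_train, with target_width taken from mel_dims[3].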
Also, if you are interested, I would love to connect with you sometime to talk about potential ways to extend this example/model into a system that works in real time, making predictions on songs as it hears them through a computer microphone rather than from an uploaded mp3.
My email is aaronopp@gmail.com if you want to connect!
Thanks,
Aaron
Sir,
I got this error. What should I do?
Negative dimension size caused by subtracting 3 from 1 for 'conv2d_12/convolution' (op: 'Conv2D') with input shapes: [?,1,96,431], [3,3,431,32].
@MrNakum You have to add data_format="channels_first" to your model.add(Convolution2D(...)) calls, like so:
    model.add(Convolution2D(nb_filters, kernel_size[0], strides=2,
                            border_mode='valid', input_shape=input_shape,
                            data_format="channels_first"))
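To see why the setting matters, here is a small pure-Python sketch (no Keras needed) of how the same input shape is read under the two layouts. The function interpret is a made-up helper, and the shape is taken from the error message above, where the filter shape [3,3,431,32] shows that 431 was being treated as the channel count.

```python
def interpret(shape, data_format):
    """Split a per-sample 3-D shape into channels, height and width.

    Hypothetical helper for illustration only.
    """
    if data_format == "channels_first":
        channels, height, width = shape
    else:  # "channels_last"
        height, width, channels = shape
    return {"channels": channels, "height": height, "width": width}

shape = (1, 96, 431)   # per-sample shape from the error message
kernel = 3             # 3x3 convolution kernel

for fmt in ("channels_last", "channels_first"):
    dims = interpret(shape, fmt)
    fits = dims["height"] >= kernel and dims["width"] >= kernel
    print(fmt, dims, "kernel fits" if fits else "kernel does not fit")
```

Read as channels_last, the input has height 1, so a 3x3 kernel with 'valid' padding cannot fit, which matches the "subtracting 3 from 1" in the error.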
In shuffle_XY_paths, you make a shallow copy of the path names, which leads to incorrect results. Use deepcopy instead (this needs import copy at the top of the file), like so:
    def shuffle_XY_paths(X,Y,paths):
        newpaths = copy.deepcopy(paths)
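Here is a stripped-down sketch of the failure mode, under the assumption that the original function assigned newpaths = paths and then filled newpaths in a loop. The two function names, the file names and the fixed index order are invented for the demo and are not from the repo.

```python
import copy

def shuffle_paths_aliased(paths, idx):
    newpaths = paths                  # same list object, not a copy (assumed original behavior)
    for i, j in enumerate(idx):
        newpaths[i] = paths[j]        # overwrites entries that are still to be read
    return newpaths

def shuffle_paths_copied(paths, idx):
    newpaths = copy.deepcopy(paths)   # independent list, so paths stays intact
    for i, j in enumerate(idx):
        newpaths[i] = paths[j]
    return newpaths

idx = [2, 0, 1]                       # fixed permutation so the result is reproducible
print(shuffle_paths_aliased(["a.npy", "b.npy", "c.npy"], idx))  # ['c.npy', 'c.npy', 'c.npy']
print(shuffle_paths_copied(["a.npy", "b.npy", "c.npy"], idx))   # ['c.npy', 'a.npy', 'b.npy']
```

For a flat list of strings, list(paths) would give an independent list as well; deepcopy is simply the safe choice if the entries could themselves be mutable.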
Resolved in https://github.com/drscotthawley/panotti
Check there instead.