Train_network.py can't handle smaller mel spectrogram shapes

Question

Closed this issue 6 years ago · 4 comments

Hi Dr. Hawley,

I noticed a small problem in the code in both train_network and eval_network: there is no handling for files that produce spectrograms narrower than width 1293. The failure shows up when the training data is filled in from the mel spectrograms (X_train[train_count,:,:] = melgram, around line 140).
You have written code to chop off the extra width if it is too long ( melgram = melgram[:,:,:,0:mel_dims[3]] ) but nothing to account for melgrams being too short.
I was able to get around it by filling the empty space with 0's, but I thought it would be helpful to let you know!
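
For anyone hitting the same thing, here is a minimal sketch of that zero-fill workaround. The helper name fit_melgram_width and the example shapes are my own, not from the repository; it only assumes that the time axis is the last axis of the melgram array, as in the cropping line above.

```python
import numpy as np

def fit_melgram_width(melgram, target_width):
    """Crop or zero-pad the last (time) axis of melgram to target_width.

    Hypothetical helper: not part of train_network.py or eval_network.py.
    """
    width = melgram.shape[-1]
    if width >= target_width:
        # Too long (or exact): keep the first target_width frames.
        return melgram[..., :target_width]
    # Too short: pad only the end of the last axis with zeros.
    pad_widths = [(0, 0)] * (melgram.ndim - 1) + [(0, target_width - width)]
    return np.pad(melgram, pad_widths, mode="constant", constant_values=0)

# Example with an assumed (1, 1, 96, 431) melgram padded out to width 1293.
short = np.ones((1, 1, 96, 431))
padded = fit_melgram_width(short, 1293)
print(padded.shape)
```

Calling this right before the X_train assignment would make both the too-long and too-short cases land on the same shape.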

Also- if you are interested, I would love to connect with you sometime to talk about potential ways to extend this example/model to a system that works in real time, and makes predictions on songs as it hears them through a computer microphone versus an uploaded mp3.
My email is aaronopp@gmail.com if you want to connect!

Thanks,

Aaron

Answer 1 · 2017-07-03T17:04:57.000Z

Sir,

I got this error. What should I do?

Negative dimension size caused by subtracting 3 from 1 for 'conv2d_12/convolution' (op: 'Conv2D') with input shapes: [?,1,96,431], [3,3,431,32].

Answer 2 · 2017-08-05T11:10:56.000Z

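@MrNakum The filter shape [3,3,431,32] in that message shows that the last axis (431) is being treated as the channel axis, which leaves a height of 1 for a 3x3 kernel. You have to add data_format="channels_first" to your model.add(Convolution2D(...)) calls, like so: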

model.add(Convolution2D(nb_filters, kernel_size[0], strides=2,
                        border_mode='valid', input_shape=input_shape,
                        data_format="channels_first"))
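
To see why the data format matters here, the arithmetic behind the error can be sketched in plain Python. This is not Keras code: conv_output_hw is a hypothetical helper that applies the usual 'valid' convolution size formula, floor((in - kernel) / stride) + 1, to the (1, 96, 431) input from the error message under each interpretation of the axes.

```python
def conv_output_hw(input_shape, kernel_size, stride, data_format):
    """Return (height, width) after a 'valid' convolution.

    input_shape excludes the batch axis. Hypothetical helper for
    illustration only; it mirrors the standard size formula.
    """
    if data_format == "channels_first":
        _, height, width = input_shape
    elif data_format == "channels_last":
        height, width, _ = input_shape
    else:
        raise ValueError("unknown data_format: %r" % (data_format,))

    out = []
    for size, k in zip((height, width), kernel_size):
        if size < k:
            # Same situation as "subtracting 3 from 1" in the error above.
            raise ValueError("kernel %d does not fit in dimension %d" % (k, size))
        out.append((size - k) // stride + 1)
    return tuple(out)

shape = (1, 96, 431)  # taken from the error message, batch axis dropped
print(conv_output_hw(shape, (3, 3), 2, "channels_first"))
try:
    conv_output_hw(shape, (3, 3), 2, "channels_last")
except ValueError as err:
    print("channels_last fails:", err)
```

Read as channels_first, the input is 1 channel of 96 x 431 and the kernel fits; read as channels_last, it is 431 channels of 1 x 96 and the kernel cannot fit in the height.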

Answer 3 · 2017-08-05T11:14:30.000Z

In shuffle_XY_paths, you make a shallow copy of the path names, which leads to incorrect results. Use deepcopy instead, like so:

import copy

def shuffle_XY_paths(X,Y,paths):
    newpaths = copy.deepcopy(paths)  # independent copy of the path list
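
The rest of the function body is not shown in this thread, so as an alternative, here is a rough sketch of a shuffle that avoids the copying question altogether by building new outputs from a single permutation. The name shuffle_in_unison and the seed parameter are my own, and it assumes X and Y are NumPy arrays whose first axis lines up with the paths list.

```python
import numpy as np

def shuffle_in_unison(X, Y, paths, seed=None):
    """Shuffle X, Y and paths with the same permutation.

    Hypothetical alternative to shuffle_XY_paths: the inputs are left
    untouched and new arrays and a new list are returned.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(paths))
    # Fancy indexing returns new arrays; the list is rebuilt element by element.
    new_paths = [paths[i] for i in idx]
    return X[idx], Y[idx], new_paths

# Tiny example where row i of X, label i of Y and path i all carry the value i.
X = np.arange(5).reshape(5, 1)
Y = np.arange(5)
paths = ["clip%d.npy" % i for i in range(5)]
newX, newY, newpaths = shuffle_in_unison(X, Y, paths, seed=0)
```

Because every output is indexed by the same idx, each path stays paired with its own row and label after the shuffle.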

Answer 4 · 2018-04-19T22:24:13.000Z

Resolved in https://github.com/drscotthawley/panotti

Check there instead.