keunwoochoi/kapre

How to use 1-Dimensional (Conv1D, MaxPool1D) layers?

PrabakarSundar opened this issue · 1 comments

I'm trying to replace my audio preprocessing (Logfbank) with Kapre to improve efficiency, and I have managed to replicate the function I wanted. However, my old model uses 1D convolution and pooling layers, which do not accept Kapre's output because it has an extra dimension. The shape I need is (None, 200, 26), but what I get is (None, 200, 26, 1). This results in the error:

Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 200, 26, 1]

So I tried to squeeze the output of the Kapre layer using tf.squeeze(), which results in another error:

tld.op_callbacks, input, "squeeze_dims", axis)
tensorflow.python.eager.core._FallbackException: This function does not handle the case of the path where all inputs are not already EagerTensors.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPool1D, BatchNormalization, Dropout, ReLU, GlobalAveragePooling1D, Dense, Softmax
from kapre.composed import get_melspectrogram_layer, get_log_frequency_spectrogram_layer

# input_shape, sample_rate and numclass are defined elsewhere in my script
model = Sequential()
logfreq_stft_mag = get_log_frequency_spectrogram_layer(input_shape=input_shape, n_fft=512,
                                                       win_length=int(sample_rate * 0.025),
                                                       hop_length=int(sample_rate * 0.01), pad_end=True,
                                                       log_n_bins=26)
# Using squeeze to reduce the shape from (None, 200, 26, 1) to (None, 200, 26); this line raises the second error
logfreq_stft_mag = tf.squeeze(logfreq_stft_mag)
model.add(logfreq_stft_mag)
model.add(Dropout(0.1))
model.add(Conv1D(filters=32, kernel_size=3, strides=1, activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPool1D(strides=2))
model.add(Conv1D(filters=32, kernel_size=3, strides=1, activation='relu', padding='same'))
model.add(GlobalAveragePooling1D())
model.add(Dense(numclass, activation='softmax'))
model.summary()

How can I use 1D convolution and 1D pooling layers here? My constraint is that I must use Conv1D instead of Conv2D, as my end device has very limited resources. Thanks.

tf.squeeze is an operation, but what you need is a layer that performs that operation. In your case specifically, you can just use a Reshape layer instead: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Reshape

model.add(logfreq_stft_mag)
model.add(tf.keras.layers.Reshape((200, 26)))  # target shape excludes the batch axis
...
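
For anyone who wants to check the shapes without installing Kapre, here is a minimal sketch of the same idea. It assumes a plain Keras `Input` of shape (200, 26, 1) as a stand-in for the Kapre front end, and the 10-class output is an arbitrary placeholder, not something from the original model. It only illustrates that a Reshape layer drops the trailing channel axis so that Conv1D and MaxPool1D accept the tensor.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Stand-in for the Kapre output: a 4-D tensor (batch, time, freq, channel=1).
# This Input replaces the real front end purely for illustration.
inputs = tf.keras.Input(shape=(200, 26, 1))

# Reshape is a layer, so it can sit inside the model graph.
# The target shape leaves out the batch axis.
x = layers.Reshape((200, 26))(inputs)

# 1D layers now receive the 3-D tensor they expect: (batch, 200, 26).
x = layers.Conv1D(filters=32, kernel_size=3, padding="same", activation="relu")(x)
x = layers.MaxPool1D(pool_size=2, strides=2)(x)
x = layers.GlobalAveragePooling1D()(x)

# Placeholder class count (hypothetical; use your own numclass).
outputs = layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# Run a random batch through the model to confirm the shapes line up.
batch = np.random.rand(4, 200, 26, 1).astype("float32")
probs = model.predict(batch, verbose=0)
```

In the Sequential version from the question, the equivalent change is to add the Kapre layer unchanged and follow it with the Reshape layer, rather than calling tf.squeeze on the layer object itself.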