philipperemy/keras-tcn

Low accuracy in keras-tuner based TCN model for audio classification

gehaniaarti opened this issue · 6 comments

I am trying to classify audio signals using a TCN. To obtain the TCN hyperparameters, I am using keras-tuner. The tuner runs fine, but I am getting low training and testing accuracies (roughly 26% and 17%, respectively). The code is as follows:

from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.models import Model
from tcn import TCN
from keras_tuner import RandomSearch

num_classes = Y.shape[1]
batch_size = X_train.shape[0]  # unused below; batch_size=32 is passed to tuner.search
num_rows = X_train.shape[1]      # 99
num_columns = X_train.shape[2]   # 32
time_stamp = num_rows * num_columns  # flatten 99 x 32 into one long sequence
input_dim = 1

X_train = X_train.reshape(X_train.shape[0], time_stamp, input_dim)
X_test = X_test.reshape(X_test.shape[0], time_stamp, input_dim)
 
def build_model(hp):
    i = Input(batch_shape=(None, time_stamp, input_dim))
    o = TCN(hp.Int('nb_filters', min_value=32, max_value=512, step=32),
            kernel_size=5, nb_stacks=1,
            dilations=[1, 2, 4], return_sequences=False, padding='causal',
            use_skip_connections=True)(i)  # the TCN layer
    o = Dense(hp.Choice('nodes', values=[32, 64, 128, 256]), activation='relu')(o)
    o = Dropout(hp.Choice('dropout', values=[0.1, 0.2]))(o)
    o = Dense(num_classes, activation='softmax')(o)

    model = Model(inputs=[i], outputs=[o])
    model.summary()
    model.compile(loss='categorical_crossentropy',
                  metrics=['accuracy'],
                  optimizer='adam')
    return model
  
tuner = RandomSearch(build_model, objective='val_accuracy',
                     max_trials=3, executions_per_trial=1, overwrite=True)
tuner.search_space_summary()

tuner.search(x=X_train, y=Y_train, epochs=50, batch_size=32,
             validation_data=(X_test, Y_test))
tuner.results_summary()

best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hps.values)

model = tuner.hypermodel.build(best_hps)
history = model.fit(X_train, Y_train, epochs=50, validation_split=0.2)

val_acc_per_epoch = history.history['val_accuracy']
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))
  
hypermodel = tuner.hypermodel.build(best_hps)

hypermodel.fit(X_train, Y_train, epochs=best_epoch)
  
eval_result = hypermodel.evaluate(X_test, Y_test)
print("[test loss, test accuracy]:", eval_result)

The shapes of the training, validation, and testing datasets are: X_train (219, 99, 32), X_Val (27, 99, 32), X_Test (28, 99, 32), Y_train (219, 5), Y_Val (27, 5), Y_Test (28, 5).
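As an aside (not part of the original thread): the reshape above flattens each 99 x 32 frame into a single 3168-step sequence of scalars. An alternative worth trying is keeping 99 timesteps with 32 features each, which gives the TCN much shorter sequences. A minimal NumPy sketch of the two layouts, using a dummy array with the same shape as X_train:

```python
import numpy as np

# Dummy batch with the same shape as X_train: (219, 99, 32).
x = np.zeros((219, 99, 32), dtype=np.float32)

# Layout used in the issue: flatten rows x columns into one long
# univariate sequence -> 3168 timesteps, 1 feature per step.
flat = x.reshape(x.shape[0], 99 * 32, 1)
print(flat.shape)   # (219, 3168, 1)

# Alternative layout: treat the 99 rows as timesteps and the
# 32 columns as per-step features -> much shorter sequences.
multi = x.reshape(x.shape[0], 99, 32)
print(multi.shape)  # (219, 99, 32)
```

With the second layout, the TCN input would be `Input(batch_shape=(None, 99, 32))` instead of `(None, 3168, 1)`.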

A similar approach gave me good results with CNN, CRNN, and MLP models, but it is not working here.
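One concrete thing to check (my own aside, using the receptive-field formula from the keras-tcn README): with kernel_size=5, nb_stacks=1, and dilations [1, 2, 4], the TCN's receptive field covers only a tiny window of the 3168-step flattened sequence, so most of each input can never influence the prediction:

```python
def receptive_field(kernel_size, nb_stacks, dilations):
    # Receptive-field formula from the keras-tcn README:
    # 1 + 2 * (kernel_size - 1) * nb_stacks * sum(dilations)
    return 1 + 2 * (kernel_size - 1) * nb_stacks * sum(dilations)

rf = receptive_field(kernel_size=5, nb_stacks=1, dilations=[1, 2, 4])
print(rf)  # 57 -- far smaller than the 3168-step flattened sequence

# Widening the dilations covers much more of the sequence:
rf_wide = receptive_field(kernel_size=5, nb_stacks=1,
                          dilations=[1, 2, 4, 8, 16, 32, 64, 128])
print(rf_wide)  # 2041
```

So either widening the dilations or shortening the sequences (e.g. 99 timesteps of 32 features) would be worth trying before tuning other hyperparameters.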

@gehaniaarti have you tried the default parameters? It looks like your dataset is very small. TCNs are harder to train than CNNs and MLPs. If you look at the examples, TCNs need tens of thousands of examples to perform well. Also, what data are you using for audio classification? Spectrograms? Waveforms?

Default parameters means calling TCN() with no arguments, e.g. o = TCN()(i).
Yes, I think your dataset is too small for this to work reasonably well.

Yes, the dataset is too small.