DeepConvLSTM training
aurotripathy opened this issue · 4 comments
Hi:
Your paper is exemplary in laying out the problem as a sequence-learning task and, to top it off, the accompanying "model deploy" script works!
I took a shot at the training script (attached). Being new to LSTMs and Lasagne, I may be missing many of the hyper-parameters needed to recreate your model.
Could you please share your training script?
Alternatively, could you kindly suggest improvements to the attached training script?
Thank you,
Auro
Hi @aurotripathy, thanks for your comments! Indeed, I'd like to include more code in the repository in the near future, once I find the time. Currently, the only pieces missing for a complete training script are the dropout regularizers in the network description:
net = {}
net['input'] = lasagne.layers.InputLayer((BATCH_SIZE, 1, SLIDING_WINDOW_LENGTH, NB_SENSOR_CHANNELS), input_var=input_var)
net['conv1/5x1'] = lasagne.layers.Conv2DLayer(net['input'], NUM_FILTERS, (FILTER_SIZE, 1))
net['conv2/5x1'] = lasagne.layers.Conv2DLayer(net['conv1/5x1'], NUM_FILTERS, (FILTER_SIZE, 1))
net['conv3/5x1'] = lasagne.layers.Conv2DLayer(net['conv2/5x1'], NUM_FILTERS, (FILTER_SIZE, 1))
net['conv4/5x1'] = lasagne.layers.Conv2DLayer(net['conv3/5x1'], NUM_FILTERS, (FILTER_SIZE, 1))
net['shuff'] = lasagne.layers.DimshuffleLayer(net['conv4/5x1'], (0, 2, 1, 3))
net['lstm1'] = lasagne.layers.LSTMLayer(lasagne.layers.dropout(net['shuff'], p=.5), NUM_UNITS_LSTM)
net['lstm2'] = lasagne.layers.LSTMLayer(lasagne.layers.dropout(net['lstm1'], p=.5), NUM_UNITS_LSTM)
# In order to connect a recurrent layer to a dense layer, it is necessary to flatten the first two dimensions
# to cause each time step of each sequence to be processed independently (see Lasagne docs for further information)
net['shp1'] = lasagne.layers.ReshapeLayer(net['lstm2'], (-1, NUM_UNITS_LSTM))
net['prob'] = lasagne.layers.DenseLayer(lasagne.layers.dropout(net['shp1'], p=.5), NUM_CLASSES, nonlinearity=lasagne.nonlinearities.softmax)
# Tensors reshaped back to the original shape
net['shp2'] = lasagne.layers.ReshapeLayer(net['prob'], (BATCH_SIZE, FINAL_SEQUENCE_LENGTH, NUM_CLASSES))
# Last sample in the sequence is considered
net['output'] = lasagne.layers.SliceLayer(net['shp2'], -1, 1)
and the training code itself. In Lasagne:
# Symbolic loss: categorical cross-entropy over the network output, plus a small
# L2 penalty on all parameters; input_var and target_var are the Theano variables
# used when building the network above
train_prediction_rnn = lasagne.layers.get_output(net['output'], input_var, deterministic=False)
train_loss_rnn = lasagne.objectives.categorical_crossentropy(train_prediction_rnn, target_var)
train_loss_rnn = train_loss_rnn.mean()
train_loss_rnn += .0001 * lasagne.regularization.regularize_network_params(net['output'], lasagne.regularization.l2)
# RMSProp updates on all trainable parameters
params_rnn = lasagne.layers.get_all_params(net['output'], trainable=True)
lr = 0.0001
rho = 0.9
updates_rnn = lasagne.updates.rmsprop(train_loss_rnn, params_rnn, learning_rate=lr, rho=rho)
# Compiled training function: each call performs one update step on a mini-batch
train_fn_rnn = theano.function([input_var, target_var], train_loss_rnn, updates=updates_rnn)
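For reference, a minimal sketch of how the compiled function could be used; this is an untested illustration rather than code from the repository, and it assumes X_train/y_train are sliding-window arrays of shape (num_windows, 1, SLIDING_WINDOW_LENGTH, NB_SENSOR_CHANNELS) with labels already in the dtype expected by target_var:
import numpy as np

NUM_EPOCHS = 50  # illustrative value
for epoch in range(NUM_EPOCHS):
    epoch_loss, n_batches = 0.0, 0
    # The network is built with a fixed BATCH_SIZE, so the last incomplete batch is dropped
    for start in range(0, len(X_train) - BATCH_SIZE + 1, BATCH_SIZE):
        X_batch = X_train[start:start + BATCH_SIZE].astype(np.float32)
        y_batch = y_train[start:start + BATCH_SIZE]
        epoch_loss += train_fn_rnn(X_batch, y_batch)
        n_batches += 1
    print("Epoch {}/{}: mean training loss {:.4f}".format(epoch + 1, NUM_EPOCHS, epoch_loss / n_batches))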
When working with the rmsprop updater I'd recommend starting with very small learning rates, such as 0.0001, since it is an update rule that moves the parameters very quickly.
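Also note that, since the network above contains dropout layers, evaluation should use a separate function compiled with deterministic=True, which turns the dropout layers into the identity. A sketch along these lines (untested, using the same input_var/target_var as above):
test_prediction_rnn = lasagne.layers.get_output(net['output'], input_var, deterministic=True)
test_loss_rnn = lasagne.objectives.categorical_crossentropy(test_prediction_rnn, target_var).mean()
test_acc_rnn = lasagne.objectives.categorical_accuracy(test_prediction_rnn, target_var).mean()
val_fn_rnn = theano.function([input_var, target_var], [test_loss_rnn, test_acc_rnn])
Without deterministic=True, the stochastic dropout masks stay active at test time and the reported accuracy will be both noisy and pessimistic.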
Thank you for the helpful tips on training!
I look forward to your additional code.
Hi:
I downloaded this project and ran it successfully; it's very cool! I've seen this issue, but I don't know how to add the code above into DeepConvLSTM.ipynb.
Could you please share your training code?
Thank you,
xiaomoer
Hi
Thanks for your code, but I have a real problem when I combine the training code with the testing part: with the parameters you mention, the model cannot learn anything. Your code works for me only with your pretrained DeepConvLSTM_oppChallenge_gestures.pkl weights, but I need to train the model myself and then test it. Please provide your complete code, with both training and testing.
Thanks,
Roghayeh