Help with converting to Keras model
oveddan opened this issue · 7 comments
First of all, thank you for creating this wonderful repository. I'm the one who ported PoseNet into tensorflow.js and would love to feed the output of that into this rnn model in real-time. The easiest way to do this would be to convert it to a keras model and run the tensorflow.js converter on it.
Anyways, I've attempted to convert the tensorflow graph from this jupyter notebook into a Keras model, but the model is not properly training - it stops converging and gets stuck at around 20% validation accuracy.
Here is my Keras model building and training code:
X_train = load_X(X_train_path)
X_test = load_X(X_test_path)
y_train = load_y(y_train_path)
y_test = load_y(y_test_path)
set_parameters(X_train, X_test, y_test)
n_input = len(X_train[0][0])
model = Sequential([
    # relu activation
    layers.Dense(n_hidden, activation='relu',
        kernel_initializer='random_normal',
        bias_initializer='random_normal',
        batch_input_shape=(batch_size, n_steps, n_input)
    ),
    layers.LSTM(n_hidden, return_sequences=True, stateful=True, unit_forget_bias=1.0, batch_input_shape=(batch_size, n_steps, n_input)),
    layers.LSTM(n_hidden, return_sequences=True, stateful=True, unit_forget_bias=1.0),
    layers.LSTM(n_hidden),
    layers.Dense(n_classes, kernel_initializer='random_normal',
        bias_initializer='random_normal',
        kernel_regularizer=regularizers.l2(lambda_loss_amount),
        bias_regularizer=regularizers.l2(lambda_loss_amount),
        activation='softmax'
    )
])
model.compile(
    optimizer=optimizers.Adam(lr=learning_rate, decay=decay_rate),
    metrics=['accuracy'],
    loss='categorical_crossentropy'
)
y_train_one_hot = keras.utils.to_categorical(y_train, 6)
y_test_one_hot = keras.utils.to_categorical(y_test, 6)
train_size = X_train.shape[0] - X_train.shape[0] % batch_size
test_size = X_test.shape[0] - X_test.shape[0] % batch_size
model.fit(
    X_train[:train_size,:,:],
    y_train_one_hot[:train_size,:],
    epochs=50,
    batch_size=batch_size,
    shuffle=False,
    validation_data=(X_test[:test_size,:,:], y_test_one_hot[:test_size,:])
)
What am I doing wrong?
Hey Dan,
Sounds like a cool project, happy to help out.
I haven't used Keras much, but from your code it looks like you're adding an additional LSTM layer without specifying the same params. Maybe it isn't returning the entire sequence?
I'll have a look at training it myself
The main issue I found was that decay in Keras is defined differently from TensorFlow: 0.96 in tf would be 0.04 in Keras.
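To make the difference concrete, here is a minimal sketch comparing the two schedules. It assumes the TensorFlow version used exponential decay, lr0 * decay_rate ** (step / decay_steps), and that the `decay` argument of the older Keras optimizers applies time-based decay, lr0 / (1 + decay * iterations). The function names and the `decay_steps` value are illustrative and not taken from this thread.

```python
def tf_exponential_decay(lr0, decay_rate, step, decay_steps):
    """Exponential decay: decay_rate is the base of the exponential."""
    return lr0 * decay_rate ** (step / decay_steps)


def keras_time_based_decay(lr0, decay, iterations):
    """Time-based decay: decay is a coefficient on the batch counter."""
    return lr0 / (1.0 + decay * iterations)


lr0 = 0.0025
decay_steps = 100  # hypothetical value, chosen only for illustration

for step in (0, 100, 1000):
    print(
        step,
        tf_exponential_decay(lr0, 0.96, step, decay_steps),  # tf-style 0.96
        keras_time_based_decay(lr0, 0.96, step),              # 0.96 reused in Keras
        keras_time_based_decay(lr0, 0.04, step),              # 0.04 in Keras
    )
```

Under these assumptions, reusing 0.96 as the Keras `decay` divides the learning rate by 97 after only 100 batches, which would be consistent with training stalling early. The two parameters mean different things, so the values are not interchangeable.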
Also, as each sample in my X_train was already of length n_steps (i.e., I'm not grabbing the next window from one large timeseries), I didn't use stateful LSTMs, and I used shuffle=True in model.fit().
I also removed the extra LSTM layer you had, although I tested with it in and it still converged.
I found that using a smaller batch size converged in fewer epochs, but it would converge at most sizes.
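One plausible reason for this (an assumption, not something measured in this thread) is that a smaller batch size gives more weight updates per epoch and, because the data is truncated to a multiple of batch_size, drops fewer samples. A rough worked example using the 4519 training series mentioned in the training code; the helper names are hypothetical:

```python
def usable_samples(n_samples, batch_size):
    """Samples kept after truncating to a whole number of batches."""
    return n_samples - n_samples % batch_size


def updates_per_epoch(n_samples, batch_size):
    """Number of gradient updates in one pass over the truncated data."""
    return usable_samples(n_samples, batch_size) // batch_size


n_train = 4519  # training series count from the code comments

for batch_size in (1024, 256, 64):
    used = usable_samples(n_train, batch_size)
    print(
        f"batch_size={batch_size}: uses {used} samples, "
        f"drops {n_train - used}, "
        f"{updates_per_epoch(n_train, batch_size)} updates per epoch"
    )
```

With batch_size = 1024 only 4 updates happen per epoch and 423 series are dropped, while batch_size = 64 gives 70 updates and drops 39.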
Training code:
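# Imports (assumed: standalone Keras; swap in the tensorflow.keras equivalents if you use those)
import keras
from keras import layers, optimizers, regularizers
from keras.models import Sequential

# Input Data
training_data_count = len(X_train)  # 4519 training series (with 50% overlap between each series)
test_data_count = len(X_test)  # 1197 test series
n_steps = len(X_train[0])  # timesteps per series
n_input = len(X_train[0][0])  # features per timestep
n_hidden = 34  # Hidden layer num of features
n_classes = 6
learning_rate = 0.0025  # initial learning rate passed to Adam
decay_rate = 0.02  # Keras `decay` argument, not the base of the exponential used in tf
lambda_loss_amount = 0.0015
training_epochs = 100
batch_size = 1024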
model = Sequential([
    # relu activation
    layers.Dense(n_hidden, activation='relu',
        kernel_initializer='random_normal',
        bias_initializer='random_normal',
        batch_input_shape=(batch_size, n_steps, n_input)
    ),
    # unit_forget_bias is a boolean flag (adds 1 to the forget-gate bias at initialization)
    layers.LSTM(n_hidden, return_sequences=True, unit_forget_bias=True),
    layers.LSTM(n_hidden, unit_forget_bias=True),
    layers.Dense(n_classes, kernel_initializer='random_normal',
        bias_initializer='random_normal',
        kernel_regularizer=regularizers.l2(lambda_loss_amount),
        bias_regularizer=regularizers.l2(lambda_loss_amount),
        activation='softmax'
    )
])
model.compile(
    optimizer=optimizers.Adam(lr=learning_rate, decay=decay_rate),
    metrics=['accuracy'],
    loss='categorical_crossentropy'
)
y_train_one_hot = keras.utils.to_categorical(y_train, 6)
y_test_one_hot = keras.utils.to_categorical(y_test, 6)
train_size = X_train.shape[0] - X_train.shape[0] % batch_size
test_size = X_test.shape[0] - X_test.shape[0] % batch_size
history = model.fit(
    X_train[:train_size,:,:],
    y_train_one_hot[:train_size,:],
    epochs=training_epochs,
    batch_size=batch_size,
    validation_data=(X_test[:test_size,:,:], y_test_one_hot[:test_size,:])
)
Training output:
Wow, this is amazing, thanks so much! Can't wait to try this out.
Can you please share the full code of the Keras model? Also, how can I predict on a video? Please give me an example.
Hi likhithakarusala,
That's pretty much the entire code. The only missing part is the data preprocessing (see the 'Preparing dataset' section). See issue #2 about online inference from a camera in real time.
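As a rough illustration of the buffering side of online inference (not the code from issue #2), here is a minimal sketch of a rolling window that keeps the most recent n_steps pose frames and exposes them with shape (1, n_steps, n_input). The class name `PoseWindow`, its methods, and the toy sizes are all hypothetical, and the call to the trained model is left as a comment because the exact setup depends on how the model was built. A model defined with a fixed batch_input_shape may also need to be rebuilt with a flexible batch dimension before it accepts a single window.

```python
from collections import deque


class PoseWindow:
    """Rolling buffer holding the most recent n_steps pose frames."""

    def __init__(self, n_steps, n_input):
        self.n_steps = n_steps
        self.n_input = n_input
        self.frames = deque(maxlen=n_steps)  # oldest frame is dropped automatically

    def push(self, keypoints):
        """Append one frame of flattened keypoint coordinates."""
        if len(keypoints) != self.n_input:
            raise ValueError(f"expected {self.n_input} values, got {len(keypoints)}")
        self.frames.append(list(keypoints))

    def ready(self):
        """True once a full window of n_steps frames is available."""
        return len(self.frames) == self.n_steps

    def as_batch(self):
        """Return the window as a nested list of shape (1, n_steps, n_input)."""
        if not self.ready():
            raise ValueError("window is not full yet")
        return [list(self.frames)]


# Toy usage: 4 timesteps, 3 features per frame (illustrative sizes only).
window = PoseWindow(n_steps=4, n_input=3)
for t in range(6):
    window.push([t, t, t])
    if window.ready():
        batch = window.as_batch()
        # With a trained model, something like model.predict(np.asarray(batch))
        # would go here (assumption; not shown in this thread).
```

Each new frame shifts the window by one step, so a prediction can be made per frame once the buffer is full.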
Can anyone help me with the inference code for the Keras model?
Hello there,
I would like to ask whether the training set is the same as the original one.
Should the shape of X_train be (total_number_of_samples / length_of_sequence, length_of_sequence, number_of_keypoints)?
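Judging only from the training code in this thread, which reads n_input from len(X_train[0][0]) and feeds batches of shape (batch_size, n_steps, n_input), X_train is indexed as (number_of_series, n_steps, n_input). Since the code comments mention 50% overlap between series, the first dimension would be larger than total frames divided by the sequence length. The sketch below shows that counting on toy data; `make_windows` and the sizes are hypothetical and this is not the repository's preprocessing code.

```python
def make_windows(frames, n_steps, overlap=0.5):
    """Split a list of frames into fixed-length windows with fractional overlap."""
    stride = max(1, int(n_steps * (1 - overlap)))
    last_start = len(frames) - n_steps
    return [frames[start:start + n_steps] for start in range(0, last_start + 1, stride)]


# Toy data: 20 frames with 3 features each (illustrative sizes only).
frames = [[float(t)] * 3 for t in range(20)]

windows = make_windows(frames, n_steps=8)
shape = (len(windows), len(windows[0]), len(windows[0][0]))
print(shape)  # (number_of_series, n_steps, n_input)

# Without overlap the count falls back to total_frames // n_steps.
print(len(make_windows(frames, n_steps=8, overlap=0.0)))
```

In this toy case 20 frames yield 4 overlapping windows of 8 steps, versus 2 windows without overlap.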