Initial state in model
Nicolabo opened this issue · 3 comments
Hi Garett,
Great article and GitHub repo. I have a question regarding the initial state in your LSTM model. During training you initialise the state once per epoch, but within each epoch you implement a stateful LSTM, carrying `initial_state` over from batch to batch:
```python
state = sess.run(initial_state)
train_acc = []
for ii, (x, y) in enumerate(utl.get_batches(train_x, train_y, batch_size), 1):
    feed = {inputs_: x,
            labels_: y[:, None],
            keep_prob_: keep_prob,
            initial_state: state}
    loss_, state, _, batch_acc = sess.run([loss, final_state, optimizer, accuracy],
                                          feed_dict=feed)
```
But in your graph, `initial_state` is not a placeholder:
```python
initial_state = cell.zero_state(batch_size, tf.float32)
```
My understanding is that this is not going to work as you expect: `initial_state` will always stay the zero state. Please correct me if I am wrong.
Hi @Nicolabo!
Any `tf.placeholder` must be fed into your network, but in fact any tensor in your graph can be fed through `feed_dict` if you choose to. Check out the example in the Google Colab notebook linked below and let me know if you have any other questions!
https://colab.research.google.com/drive/1JSpCLxmYuAPslH4Ixt12RvLzLYh8sPKR
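To make that concrete, here is a minimal sketch (not taken from the repo, just standard TF 1.x behaviour) showing that `feed_dict` can override the value of any feedable tensor, not only placeholders. That is why feeding `state` into `initial_state` works even though `initial_state` comes from `cell.zero_state`:

```python
import tensorflow as tf  # TF 1.x API, as used in the repo

tf.reset_default_graph()

# `a` is an ordinary tensor in the graph, not a tf.placeholder
a = tf.constant(3.0)
b = a * 2.0

with tf.Session() as sess:
    # Without feeding anything, `a` keeps its graph-defined value
    print(sess.run(b))                       # 6.0
    # Feeding `a` through feed_dict overrides that value for this run,
    # just like feeding the carried-over LSTM state into initial_state
    print(sess.run(b, feed_dict={a: 5.0}))   # 10.0
```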
Thanks, I didn't know that!
One more thing when it comes to `initial_state`. During prediction you also split the data into batches and carry the state over from batch to batch. I am not quite sure why that is; it seems that some predicted values then influence other predicted values. Am I wrong? I sketched below what I would have expected instead.
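Just to illustrate, here is a rough sketch of what I had in mind at prediction time. It reuses the tensor names from the training snippet above; the `test_x`, `test_y` and `keep_prob_: 1.0` parts are my own assumptions, not code from the repo. The state is re-initialised for every batch so predictions stay independent of earlier batches:

```python
# Sketch only, reusing the tensor names from the training loop above.
test_acc = []
for ii, (x, y) in enumerate(utl.get_batches(test_x, test_y, batch_size), 1):
    state = sess.run(initial_state)      # fresh zero state for every batch
    feed = {inputs_: x,
            labels_: y[:, None],
            keep_prob_: 1.0,             # assuming no dropout at prediction time
            initial_state: state}
    batch_acc = sess.run(accuracy, feed_dict=feed)
    test_acc.append(batch_acc)
```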
Hmm, that's a really good point and something I may have overlooked when putting this together. Let me look into this and get back to you.