titu1994/neural-architecture-search

RNN Controller is implemented differently from the original paper

Kipsora opened this issue · 1 comment

In the paper, the RNN is described as: "Every prediction is carried out by a softmax classifier and then fed into the next time step as input". However, in the implementation I found that the controller's input is the set of actions from the previous trial rather than the prediction from the previous time step. I am also confused about the RNN's initial state; the paper does not state what it is.
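For reference, here is a minimal sketch (not this repository's code) of the sampling scheme as the paper describes it: each softmax prediction is embedded and fed back as the input to the next controller time step, starting from a zero hidden state. All names here (num_steps, num_tokens, lstm_cell, classifiers, etc.) are hypothetical.

import tensorflow as tf

num_steps = 4      # number of architecture decisions to make
num_tokens = 8     # choices per decision (size of each softmax)
hidden_dim = 32

lstm_cell = tf.keras.layers.LSTMCell(hidden_dim)
embedding = tf.keras.layers.Embedding(num_tokens, hidden_dim)
classifiers = [tf.keras.layers.Dense(num_tokens, activation='softmax')
               for _ in range(num_steps)]

# zero initial hidden/cell state and a fixed "start" token as the first input
state = [tf.zeros((1, hidden_dim)), tf.zeros((1, hidden_dim))]
inputs = embedding(tf.zeros((1,), dtype=tf.int32))

actions = []
for t in range(num_steps):
    output, state = lstm_cell(inputs, state)
    probs = classifiers[t](output)                       # softmax over decision t
    action = tf.random.categorical(tf.math.log(probs), 1)[:, 0]
    actions.append(int(action.numpy()[0]))
    inputs = embedding(action)                           # prediction fed into the next time step
print("sampled architecture decisions:", actions)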

Related code:

state = state_space.get_random_state_space(NUM_LAYERS)
print("Initial Random State : ", state_space.parse_state_space_list(state))
print()

# clear the previous files
controller.remove_files()

# train for the given number of trials
for trial in range(MAX_TRIALS):
    with policy_sess.as_default():
        K.set_session(policy_sess)
        # get an action for the previous state
        actions = controller.get_action(state)

        # print the action probabilities
        state_space.print_actions(actions)
        print("Predicted actions : ", state_space.parse_state_space_list(actions))

    # build a model, train and get reward and accuracy from the network manager
    reward, previous_acc = manager.get_rewards(model_fn, state_space.parse_state_space_list(actions))
    print("Rewards : ", reward, "Accuracy : ", previous_acc)

    with policy_sess.as_default():
        K.set_session(policy_sess)

        total_reward += reward
        print("Total reward : ", total_reward)

        # actions and states are equivalent, save the state and reward
        state = actions

The graph of the RNN is built such that, in one pass, all of the different classifiers produce their outputs sequentially, each conditioned on the cell state left by the previous classifier.

For the first state, it is common to use a zero state vector.
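A minimal sketch of that behaviour (not the repository code) is below: the RNN is unrolled over all decisions in a single pass, each classifier steps the same cell and sees the state from the previous classifier, the hidden state starts at zero, and the previous trial's actions are fed as the step inputs, analogous to state = actions in the training loop above. Names such as sample_architecture and tokens_per_layer are hypothetical.

import tensorflow as tf

num_layers = 2
tokens_per_layer = [4, 6]          # e.g. kernel-size choices and filter-count choices
hidden_dim = 32

cell = tf.keras.layers.LSTMCell(hidden_dim)
classifiers = [tf.keras.layers.Dense(n, activation='softmax')
               for n in tokens_per_layer * num_layers]
embeddings = [tf.keras.layers.Embedding(n, hidden_dim)
              for n in tokens_per_layer * num_layers]

def sample_architecture(prev_trial_actions):
    # one pass over all classifiers; the previous trial's actions are the inputs
    h = [tf.zeros((1, hidden_dim)), tf.zeros((1, hidden_dim))]   # zero initial state
    actions = []
    for i, (clf, emb) in enumerate(zip(classifiers, embeddings)):
        x = emb(tf.constant([prev_trial_actions[i]]))            # embed previous trial's action
        out, h = cell(x, h)                                      # state carried across classifiers
        probs = clf(out)
        action = int(tf.random.categorical(tf.math.log(probs), 1)[0, 0])
        actions.append(action)
    return actions

# the first trial starts from an arbitrary architecture, later trials reuse the last one
actions = [0] * (2 * num_layers)
for trial in range(3):
    actions = sample_architecture(actions)
    print("trial", trial, "->", actions)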