RNN Controller is implemented differently from the original paper
Kipsora opened this issue · 1 comment
Kipsora commented
In the paper, the RNN is described as: "Every prediction is carried out by a softmax classifier and then fed into the next time step as input". However, in this implementation the input appears to be the output from the previous epoch rather than the output of the previous time step. I am also confused about the RNN's initial state; the paper does not state what it should be.
Related code:
neural-architecture-search/train.py
Lines 67 to 95 in d5f5c9d
titu1994 commented
The graph of the RNN is built so that, in a single forward pass, all of the classifiers produce their outputs sequentially, each conditioned on the state produced by the previous step.
For the initial state, it is common to use a zero state vector.
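For reference, here is a minimal NumPy sketch of that idea (not the repository's code; all names and sizes are illustrative): the controller starts from a zero hidden state and, within one pass, each softmax prediction is embedded and fed back as the input to the next time step.

```python
# Minimal sketch (assumed, illustrative code) of a controller RNN that unrolls
# over all architecture decisions in one pass: each softmax prediction is
# embedded and fed into the next time step, and the hidden state starts at zero.
import numpy as np

rng = np.random.default_rng(0)

num_steps   = 4    # number of architecture decisions to predict
num_classes = 3    # choices per decision (assumed equal for simplicity)
hidden_size = 8
embed_size  = 8

# Illustrative parameters for a simple Elman RNN cell plus a shared classifier.
W_xh = rng.normal(scale=0.1, size=(embed_size, hidden_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_hy = rng.normal(scale=0.1, size=(hidden_size, num_classes))
embedding = rng.normal(scale=0.1, size=(num_classes, embed_size))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

h = np.zeros(hidden_size)   # zero initial state, as suggested above
x = np.zeros(embed_size)    # zero "start" input for the first time step
actions = []

for t in range(num_steps):
    h = np.tanh(x @ W_xh + h @ W_hh)       # recurrent state update
    probs = softmax(h @ W_hy)              # softmax classifier at step t
    a = rng.choice(num_classes, p=probs)   # sample this step's decision
    actions.append(int(a))
    x = embedding[a]                       # feed the prediction into the next step

print("sampled architecture decisions:", actions)
```

The point of the sketch is only the data flow: within one pass the steps depend on one another through the hidden state, while across training iterations the previously sampled actions can additionally be reused as inputs, which may be what looks like "output of the previous epoch" in the linked code.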