State and Return preds input
backpropper opened this issue · 1 comments
backpropper commented
The comments on the following line and the line after it say that the return and state predictions are computed using both the state and the action as inputs. However, the equation seems to use only the action information (index 2). Am I missing something, or are the comments ambiguous? I know this won't affect learning, since only the action predictions are used.
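To make the question concrete, here is a minimal sketch of one possible reading. It assumes, without confirmation from the code in question, that each timestep contributes three interleaved tokens (return, state, action) and that the transformer applies a causal mask. The helper names (`interleaved_tokens`, `visible_inputs`, `position_of`) are hypothetical and exist only for illustration. Under those assumptions, the output at the action slot (index 2) could already depend on the state token of the same timestep, even though only one index is selected.

```python
# Hypothetical token layout (an assumption, not taken from the repository):
# each timestep contributes three tokens in the order
# return (0), state (1), action (2).
TOKEN_KINDS = ("return", "state", "action")


def interleaved_tokens(seq_length):
    """List (kind, t) labels in the order the transformer would see them."""
    return [(kind, t) for t in range(seq_length) for kind in TOKEN_KINDS]


def visible_inputs(tokens, position):
    """Under a causal mask, position i can attend to positions 0..i."""
    return tokens[: position + 1]


def position_of(kind_index, t):
    """Flat position of token kind `kind_index` at timestep t."""
    return 3 * t + kind_index


tokens = interleaved_tokens(seq_length=2)

# Inputs visible to the output at the action slot (index 2) of timestep 1.
seen_from_action = visible_inputs(tokens, position_of(2, 1))
# Inputs visible to the output at the state slot (index 1) of timestep 1.
seen_from_state = visible_inputs(tokens, position_of(1, 1))

# The action slot sees both the state and the action of its timestep.
print(("state", 1) in seen_from_action, ("action", 1) in seen_from_action)
# The state slot sees the state but not yet the action.
print(("state", 1) in seen_from_state, ("action", 1) in seen_from_state)
```

If the model does work this way, the comments and the indexing would not contradict each other: selecting index 2 picks a hidden state that has attended to both the state and the action. Whether that matches the actual implementation is exactly what I'd like confirmed.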