pat-coady/trpo

add command line arguments for network sizing and initial policy variance

pat-coady opened this issue · 1 comment

  1. Make the size of hidden layer 1 adjustable from the command line, specified as a multiple of the observation dimension. The present code hard-codes it as 10x the observation dimension. The same size will be used for both the value function NN and the policy NN.

  2. Make the initial policy variance configurable. Presently, each action dimension starts with a variance of 0.1.
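A minimal sketch of how the two options above could be exposed with `argparse`. The flag names `--hid1_mult` and `--init_policy_var`, and the helper `network_config`, are hypothetical names chosen for illustration and are not taken from the repository; the defaults (10 and 0.1) mirror the currently hard-coded values described above.

```python
import argparse


def build_parser():
    """Build a parser with the two proposed options (flag names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Network sizing and initial policy variance options")
    parser.add_argument(
        "--hid1_mult", type=int, default=10,
        help="size of hidden layer 1 as a multiple of the observation dimension")
    parser.add_argument(
        "--init_policy_var", type=float, default=0.1,
        help="initial policy variance for each action dimension")
    return parser


def network_config(obs_dim, act_dim, hid1_mult, init_policy_var):
    """Derive the hidden layer 1 size and the per-action initial variances.

    The same hidden layer 1 size is meant to be shared by the value
    function network and the policy network.
    """
    if hid1_mult < 1:
        raise ValueError("hid1_mult must be a positive integer")
    if init_policy_var <= 0:
        raise ValueError("init_policy_var must be positive")
    hid1_size = obs_dim * hid1_mult
    init_vars = [init_policy_var] * act_dim  # one variance per action dimension
    return hid1_size, init_vars


if __name__ == "__main__":
    # Example: override both defaults for an environment with
    # 17 observation dimensions and 6 action dimensions.
    args = build_parser().parse_args(
        ["--hid1_mult", "5", "--init_policy_var", "0.5"])
    print(network_config(17, 6, args.hid1_mult, args.init_policy_var))
```

Expressing the layer size as a multiplier rather than an absolute width keeps one setting usable across environments with different observation dimensions.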

Done.

commit 7c61906