Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
@inproceedings{chetarget,
  title={Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation},
  author={Che, Fengdi and Xiao, Chenjun and Mei, Jincheng and Dai, Bo and Gummadi, Ramki and Ramirez, Oscar A and Harris, Christopher K and Mahmood, A Rupam and Schuurmans, Dale},
  booktitle={Forty-first International Conference on Machine Learning}
}
The first figure can be reproduced by running baird.py. The feature and transition information is defined in that file.
The second figure can be reproduced by running learn_four_room.py. The environment is defined in four_room.py.