torcs-reinforcement-learning

RL for path planning

Q-learning with a fixed intra-option policy. Things to try:

1. Try different neural network sizes.
2. Use more complex training conditions.
3. Adjust the low-level controller for throttle.
4. Try different option durations (the number of steps each option lasts).
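
For orientation, here is a minimal sketch of Q-learning over options whose intra-option policy is fixed, with the option duration exposed as a parameter. It is not the code in this repository. It assumes a tabular Q-table instead of a neural network and a toy one-dimensional lane-offset world instead of TORCS, and every name in it (`step`, `run_option`, `train`, `N_STATES`, `OPTIONS`, `option_steps`) is hypothetical.

```python
import random

N_STATES = 7          # hypothetical discretised lateral offsets; index 3 is the lane centre
OPTIONS = (-1, 0, 1)  # hypothetical options: drift left, hold, drift right


def step(state, action):
    """Toy stand-in for the simulator: move one cell; reward is minus the distance to centre."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, -abs(nxt - N_STATES // 2)


def run_option(state, option, option_steps, gamma):
    """Run the fixed intra-option policy (always emit `option`) for `option_steps` steps."""
    total, discount = 0.0, 1.0
    for _ in range(option_steps):
        state, reward = step(state, option)
        total += discount * reward   # discounted reward collected while the option runs
        discount *= gamma
    return state, total, discount    # discount equals gamma ** option_steps


def train(episodes=500, option_steps=2, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning where each decision selects an option, not a primitive action."""
    rng = random.Random(seed)
    q = [[0.0] * len(OPTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(20):  # option-level decisions per episode
            if rng.random() < epsilon:
                o = rng.randrange(len(OPTIONS))
            else:
                o = max(range(len(OPTIONS)), key=lambda i: q[state][i])
            nxt, ret, disc = run_option(state, OPTIONS[o], option_steps, gamma)
            # Bootstrap from the state where the option ended, discounted by its duration.
            target = ret + disc * max(q[nxt])
            q[state][o] += alpha * (target - q[state][o])
            state = nxt
    return q
```

Calling `train(option_steps=1)` and `train(option_steps=4)` and comparing the resulting tables is one way to see how option duration changes what is learned. Only the Q-values over options are updated here, while the behaviour inside each option stays fixed.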