NotAnyMike/HRL

Train NWOO

Closed this issue · 0 comments

  • Change max_steps to 0
  • Train unbalanced version
  • Balance tracks with X and Y intersections
  • Create suboptimal policy
  • Train balanced version
  • Train it three times longer
  • Consider showing the directional earlier