POMDP-ac-cartpole

Solving a POMDP with an LSTM in the Gym CartPole environment, implemented in PyTorch.

Requirements

  • tensorflow (for tensorboard logging)
  • pytorch (>=1.0, 1.0.1 used in my experiment)
  • gym

POMDP setting

The idea of converting CartPole-v0 into a POMDP task comes from HaiyinPiao.

The full observation of CartPole in Gym has 4 dimensions:

  1. cart position (-4.8, 4.8)
  2. cart velocity (-inf, inf)
  3. pole angle (-24°, 24°)
  4. pole velocity at tip (-inf, inf)

Deleting one or more of these dimensions from the standard state turns the task into a partially observable Markov decision process (POMDP).
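As a minimal sketch of the deletion step, the snippet below drops chosen dimensions from a 4-dimensional observation. The index constants and the helper name `partial_observation` are hypothetical and used only for illustration; they are not taken from this repository, and the observation values are made up.

```python
# Indices follow the observation order listed above (assumed, for illustration).
CART_POSITION, CART_VELOCITY, POLE_ANGLE, POLE_VELOCITY = range(4)


def partial_observation(full_obs, removed):
    """Return a copy of the observation without the dimensions in `removed`."""
    removed = set(removed)
    return [x for i, x in enumerate(full_obs) if i not in removed]


# Made-up full observation: position, velocity, angle, pole velocity at tip.
full_obs = [0.01, -0.02, 0.03, 0.04]

# Setting 1: delete the cart velocity.
no_cart_vel = partial_observation(full_obs, [CART_VELOCITY])

# Setting 2: delete the cart velocity and the pole velocity.
no_vels = partial_observation(full_obs, [CART_VELOCITY, POLE_VELOCITY])
```

In practice the same function would be applied to every observation returned by the environment before it is passed to the agent.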

Delete the cart velocity

[Figure: results with LSTM and without LSTM]

Delete the cart velocity and pole velocity

[Figure: results with LSTM and without LSTM]

Conclusion

As the partial observability becomes more severe, the LSTM significantly improves the performance of the RL agent.
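One plausible reason the LSTM helps is that its hidden state can carry information from earlier observations, so the missing velocities can be inferred from how position and angle change over time. The sketch below shows a small recurrent actor-critic network in PyTorch that keeps its hidden state between steps. The class name `RecurrentActorCritic`, the layer sizes, and the input values are assumptions for illustration only; this is not the network used in this repository.

```python
import torch
import torch.nn as nn


class RecurrentActorCritic(nn.Module):
    """Hypothetical LSTM actor-critic; not the repository's actual network."""

    def __init__(self, obs_dim=2, hidden_dim=32, n_actions=2):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.actor = nn.Linear(hidden_dim, n_actions)  # action logits
        self.critic = nn.Linear(hidden_dim, 1)         # state-value estimate

    def forward(self, obs_seq, hidden=None):
        # obs_seq has shape (batch, time, obs_dim); hidden is (h, c) or None.
        features, hidden = self.lstm(obs_seq, hidden)
        return self.actor(features), self.critic(features), hidden


torch.manual_seed(0)
model = RecurrentActorCritic()

# Feed one partial observation (cart position, pole angle) per step and
# carry the hidden state forward so earlier steps inform later ones.
hidden = None
for obs in [[0.01, 0.03], [0.02, 0.02], [0.03, 0.01]]:  # made-up values
    step = torch.tensor([[obs]], dtype=torch.float32)   # shape (1, 1, 2)
    logits, value, hidden = model(step, hidden)

# Sample an action (0 or 1) from the policy at the latest step.
action = torch.distributions.Categorical(logits=logits[0, -1]).sample()
```

A feed-forward network given the same two-dimensional input sees only the current position and angle, which is why the comparison without an LSTM is expected to degrade when both velocities are removed.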