Can I apply OpenAI Baselines to this?
Closed this issue · 1 comment
Unimax commented
I was wondering if I can apply the OpenAI Baselines algorithms (DQN, PPO, etc.) to this directly?
genyrosk commented
The new `v1` env can be used directly with baselines as of version 0.3.2. Note that, for this to work out of the box, the environment returns a special reward when an invalid action is selected, so your agent has to learn the rules before it can learn an optimal policy.
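To illustrate the invalid-action idea, here is a minimal sketch using a toy environment, not the actual `v1` code. The class `ToyBoardEnv`, the constant `INVALID_ACTION_REWARD`, and its value of -1.0 are all made up for the example, and I am assuming the episode simply continues after an invalid move:

```python
# Hypothetical penalty value, chosen for illustration only.
INVALID_ACTION_REWARD = -1.0


class ToyBoardEnv:
    """Toy 1-D board: an action is valid only if the chosen cell is empty."""

    def __init__(self, size=5):
        self.size = size
        self.cells = [0] * size

    def reset(self):
        self.cells = [0] * self.size
        return tuple(self.cells)

    def step(self, action):
        # Invalid move: leave the state unchanged and return the special reward.
        if not 0 <= action < self.size or self.cells[action] != 0:
            return tuple(self.cells), INVALID_ACTION_REWARD, False, {"invalid": True}
        # Valid move: fill the cell; the episode ends once the board is full.
        self.cells[action] = 1
        done = all(self.cells)
        reward = 1.0 if done else 0.0
        return tuple(self.cells), reward, done, {"invalid": False}


env = ToyBoardEnv(size=3)
env.reset()
_, r_valid, _, _ = env.step(0)       # empty cell, so the move is valid
_, r_invalid, _, info = env.step(0)  # same cell again, so the move is invalid
```

Because the full action space is always exposed, an off-the-shelf agent needs no action masking, but it will spend early training mostly learning to avoid the penalty.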
The `v2` environment (compiled in Rust and over 100 times faster than `v1`) doesn't currently support baselines out of the box, because it returns a more complex state at each step.
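If you want to experiment anyway, one possible workaround is to flatten the state into a fixed-length vector before handing it to the agent. The sketch below is only an assumption about what such an adapter could look like: the keys `board` and `to_move` are hypothetical and do not describe the real `v2` state layout.

```python
def flatten_state(state):
    """Flatten a hypothetical dict state into a flat list of floats.

    Assumes state["board"] is a 2-D list of ints and state["to_move"]
    is 0 or 1. Both keys are made up for this example.
    """
    board = [float(x) for row in state["board"] for x in row]
    to_move = [float(state["to_move"])]
    return board + to_move


example_state = {"board": [[0, 1], [2, 0]], "to_move": 1}
flat = flatten_state(example_state)
```

The resulting list has a fixed length for a fixed board size, which is what most baseline implementations expect from an observation.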
Apologies for the delay in replying, I hope this helps 🙏