openai/procgen

Miner incorrectly wins game if the agent spawns on the exit due to incorrect diamond count initialization

jordan-schneider opened this issue · 4 comments

I don't know if this counts as something that won't be fixed for backwards compatability, but if the agent starts an episode on the exit (as in seed 1926574277), then

  1. miner.cpp initializes diamonds_remaining as 0
  2. Miner::game_step calls BasicAbstractGame::game_step() first before updating diamonds_remaining
  3. BasicAbstractGame::game_step() calls for collision handling
  4. Miner::handle_agent_collision checks the (unchanged) diamonds_remaining value, sees it's zero, and immediately exits the episode.

This is easily fixed by initializing diamonds_remaining to a sentinel value like -1. If this is something y'all are interested in fixing I'd be happy to write the PR.

Thanks for investigating and letting us know, but unfortunately we don't want to change the behavior of the published environments. I think there has been a couple more reports like this, so if you wanted to make a PR to start a KNOWN_ISSUES.md file with this issue, linked from the README, I'd be happy to approve that.

Could also note the bug in a code comment in miner.cpp so that people would be more likely to see it.

Here is the other issue I found: #59