Agent acts non-deterministically
hfeniser opened this issue · 0 comments
hfeniser commented
I trained an agent in the Maze environment of the Procgen benchmark, and I am now testing it on various game levels. However, I noticed that the agent acts non-deterministically. For example, I set up a game by specifying num_levels=1 and start_level=97, and I get the following sequences of actions taken by the agent in two different runs:
1st play: [7] [8] [5] [5] [5] [5] [2] [5] [2] [5] [5] [8] [8] [5] [5] [5]
2nd play: [8] [8] [5] [5] [5] [5] [2] [2] [5] [5] [8] [8] [5] [5] [5]
Note that the agent is able to get the cheese in every run, although it takes different actions in some steps.
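To make the divergence easier to see, the two sequences above can be compared step by step. This is only a minimal sketch in plain Python: the helper name `differing_steps` is my own and is not part of Procgen, and the lists are copied by hand from the two runs.

```python
from itertools import zip_longest

# Action sequences copied from the two runs reported above.
run_1 = [7, 8, 5, 5, 5, 5, 2, 5, 2, 5, 5, 8, 8, 5, 5, 5]
run_2 = [8, 8, 5, 5, 5, 5, 2, 2, 5, 5, 8, 8, 5, 5, 5]


def differing_steps(a, b):
    """Return the step indices at which the two action sequences disagree.

    zip_longest pads the shorter run with None, so steps that exist in
    only one run are also reported as differences.
    """
    return [i for i, (x, y) in enumerate(zip_longest(a, b)) if x != y]


diff = differing_steps(run_1, run_2)
print("run lengths:", len(run_1), len(run_2))
print("steps that differ:", diff)
```

The runs already disagree at the very first step (index 0), and they also differ in length (16 actions versus 15), so the difference is not limited to a single late step.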