RPS is not deterministic and not seeded correctly.
MiladInk opened this issue · 1 comment
Describe the bug
I am running the rock-paper-scissors (RPS) example from PettingZoo. When I run the code multiple times, I get different plays, which looks like a bug. I think the cause is that the passed seed is not used anywhere in the code, but I am not sure how environments are seeded in PettingZoo. In Gym, IIRC, we seed the environment and then use env.np_random to generate random values.
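To make the expected behaviour concrete, here is a minimal sketch of the determinism check I have in mind. It does not use PettingZoo at all: `play_rps` and `MOVES` are hypothetical names, and the standard-library `random.Random` stands in for whatever generator the environment and the random policy are supposed to draw from. The point is only that every random choice comes from one generator built from the seed, so two runs with the same seed produce the same plays.

```python
import random

MOVES = ("rock", "paper", "scissors")


def play_rps(seed, rounds=5):
    """Return the moves of two random players driven by one seeded generator.

    Hypothetical helper for illustration; not part of PettingZoo.
    """
    # A private generator built from the seed; the global random state is not used.
    rng = random.Random(seed)
    return [(rng.choice(MOVES), rng.choice(MOVES)) for _ in range(rounds)]


# Same seed, same plays: this is the property the example should have.
first_run = play_rps(seed=42)
second_run = play_rps(seed=42)
assert first_run == second_run
```

If any sampling step bypasses the seeded generator (for example, by drawing from an unseeded global source), this equality check fails, which matches what I observe with the example.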
Code example
No response
System info
No response
Additional context
No response
Checklist
- I have checked that there is no similar issue in the repo
I am also finding that the RPS parallel example uses
observation, info = env.reset()
while reset() returns nothing.
I can fix this one and make a pull request, but I don't know where the tutorial resides in the repo.
Edit: I now understand what is going on. The example code actually converts the environment from a parallel environment to an AEC one. However, the code provided right after that in the tutorial assumes a parallel environment. This is scary! Is PettingZoo not well supported? I didn't expect to run into these problems in the tutorial 😨
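To illustrate the mismatch, here is a toy sketch of the two interaction styles. `ToyParallelEnv` and `ToyAECEnv` are hypothetical classes written for this comment, not PettingZoo's real classes or wrappers. The assumption is only what is described above: the parallel-style `reset()` hands back observations, while the turn-based (AEC-style) `reset()` returns nothing, so code written for one style breaks on the other.

```python
AGENTS = ["player_0", "player_1"]


class ToyParallelEnv:
    """Hypothetical parallel-style env: all agents act in a single step() call."""

    def reset(self):
        # Parallel style: reset hands back one observation per agent.
        return {agent: None for agent in AGENTS}

    def step(self, actions):
        # Each agent observes the move its opponent just played.
        return {a: actions[b] for a, b in zip(AGENTS, reversed(AGENTS))}


class ToyAECEnv:
    """Hypothetical turn-based view of the same game: one agent acts per step()."""

    def __init__(self, parallel_env):
        self._env = parallel_env

    def reset(self):
        self._observations = self._env.reset()
        self._pending = {}
        self.agent_selection = AGENTS[0]
        # Deliberately returns nothing, like the reset() described above.

    def last(self):
        # Observation for the agent whose turn it is.
        return self._observations[self.agent_selection]

    def step(self, action):
        self._pending[self.agent_selection] = action
        if len(self._pending) == len(AGENTS):
            # Everyone has moved: resolve the round in the parallel env.
            self._observations = self._env.step(self._pending)
            self._pending = {}
        idx = AGENTS.index(self.agent_selection)
        self.agent_selection = AGENTS[(idx + 1) % len(AGENTS)]


aec_env = ToyAECEnv(ToyParallelEnv())
result = aec_env.reset()
assert result is None  # so `observation, info = aec_env.reset()` raises TypeError

# Turn-based loop: read the current agent's observation, then act for that agent.
for move in ["rock", "paper"]:
    observation = aec_env.last()
    aec_env.step(move)
```

In this sketch, unpacking the return value of `reset()` only works on the parallel-style object, so a tutorial that converts to the turn-based style first and then uses the parallel-style loop fails in the same way I described.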