lcswillems/rl-starter-files

Data mixed from different parallel environments

asiffiiqbal opened this issue · 1 comment

https://github.com/lcswillems/torch-rl/blob/c33bf422aad70be89498fc712a7bed56aa2512aa/torch_rl/torch_rl/algos/base.py#L126

I think the data from different environments is getting mixed here.
"preprocessed_obs" seems to receive the observations from all the parallel environments, and it is then forwarded to the model.
My understanding was that only the observations from one specific environment should go to the model, and that the model's prediction would then be used to select an action for that specific environment. Instead, it seems that the observations from all the parallel environments are forwarded to the model together.
Please correct me if I am wrong.

Yes, all the observations from the parallel environments are forwarded to the model. What is the problem?

The model chooses an action from an observation and its previous state. So I can pass the model all the observations of the parallel envs as a single batch, along with all the corresponding previous states.
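To illustrate the point, here is a minimal sketch in plain Python, not the actual torch-rl code. It assumes a toy linear policy with a scalar recurrent state. The names `WEIGHTS`, `STATE_WEIGHTS`, and `forward_batch` are hypothetical and exist only for this example. It shows that when row i of the batch is computed only from row i of the observations and row i of the states, the batched call returns the same actions and states as calling the model once per environment.

```python
# Toy stand-in for the model (hypothetical, not torch-rl's ACModel).
# Two actions, three observation features, one scalar state per environment.
WEIGHTS = [
    [0.5, -1.0, 0.2],   # observation weights for action 0
    [-0.3, 0.8, 0.1],   # observation weights for action 1
]
STATE_WEIGHTS = [0.0, 0.5]  # how much the previous state shifts each action score


def forward_batch(batch_obs, batch_state):
    """Return (actions, new_states), one entry per row of the batch.

    Row i is computed from batch_obs[i] and batch_state[i] only,
    so nothing leaks between environments.
    """
    actions, new_states = [], []
    for obs, state in zip(batch_obs, batch_state):
        scores = [
            sum(w * x for w, x in zip(row, obs)) + sw * state
            for row, sw in zip(WEIGHTS, STATE_WEIGHTS)
        ]
        actions.append(scores.index(max(scores)))   # greedy action
        new_states.append(0.9 * state + sum(obs))   # toy state update
    return actions, new_states


# One observation and one previous state per parallel environment.
batch_obs = [
    [1.0, 0.0, 0.0],  # env 0
    [0.0, 1.0, 0.0],  # env 1
    [0.0, 0.0, 1.0],  # env 2
]
batch_state = [0.0, 0.0, 2.0]

# All environments in one call.
batched_actions, batched_states = forward_batch(batch_obs, batch_state)

# Each environment on its own, as a batch of size 1.
separate_actions, separate_states = [], []
for obs, state in zip(batch_obs, batch_state):
    a, s = forward_batch([obs], [state])
    separate_actions.append(a[0])
    separate_states.append(s[0])

assert batched_actions == separate_actions
assert batched_states == separate_states
```

The same reasoning carries over to a neural network as long as its layers treat the batch dimension as independent samples; a layer that computes statistics across the batch in training mode (for example batch normalization) would be an exception.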

I am closing this issue because I don't understand the problem and don't have enough information to identify it. However, we can continue the discussion, and I may reopen the issue.