emasquil/ppo

Replay Buffer


There are several ways to implement the replay buffer:

  • Parallel actors collect samples. Each actor collects one sample, like in the last TP. (Problem: Colab is not efficient with parallel computing.)
  • In a class, like in the TD1. I will go for this option.
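
A minimal sketch of what the class-based option could look like, using only the standard library. The names `Transition`, `ReplayBuffer`, `add`, `sample` and `clear`, as well as the fields stored per sample, are assumptions made for illustration; the issue does not say what the TD1 class looked like or what a sample contains.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple


class Transition(NamedTuple):
    # Hypothetical fields: the issue does not specify what a sample holds.
    observation: List[float]
    action: int
    reward: float
    next_observation: List[float]
    done: bool


class ReplayBuffer:
    """Fixed-capacity buffer that stores transitions and samples mini-batches."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # A deque with maxlen drops the oldest item once capacity is reached.
        self._storage: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self._storage)

    def add(self, transition: Transition) -> None:
        self._storage.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement.
        if batch_size > len(self._storage):
            raise ValueError("not enough transitions stored")
        return self._rng.sample(list(self._storage), batch_size)

    def clear(self) -> None:
        # Assumed use: empty the buffer after each policy update, since PPO
        # is usually trained on data from the current policy only.
        self._storage.clear()
```

A single actor would call `add` once per environment step, then `sample` to draw mini-batches for the update. This avoids the parallel-actor setup at the cost of collecting samples sequentially.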