google-deepmind/scalable_agent

Multi-task training

Maxwell2017 opened this issue · 1 comment

In multi-task training, how are rewards handled across different tasks? I see CLIPPED_REWARDS in the code. Are the rewards from different tasks added up and then backpropagated? Could you point me to where this happens in the code? @lespeholt

There is no adding. You just train on the games in a round-robin fashion. One can do much better, though; please see https://arxiv.org/abs/1809.04474
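
As a rough illustration of the answer above, here is a minimal sketch of round-robin task assignment with per-task reward clipping. It is not taken from the repository: the function names, the level names, and the clipping range of [-1, 1] are all assumptions made for the example. The point it shows is that each reward is clipped on its own and rewards from different tasks are never summed.

```python
def assign_levels_round_robin(level_names, num_actors):
    """Give actor i the level at index i modulo the number of levels.

    Hypothetical helper: cycles through the tasks so each one gets
    roughly the same number of actors.
    """
    return [level_names[i % len(level_names)] for i in range(num_actors)]


def clip_reward(reward, low=-1.0, high=1.0):
    """Clamp a single reward to [low, high] (assumed range)."""
    return max(low, min(high, reward))


# Placeholder task names, not real level names from the repository.
levels = ["task_a", "task_b", "task_c"]

# Five actors spread over three tasks in round-robin order.
assignment = assign_levels_round_robin(levels, num_actors=5)

# Raw rewards observed on different tasks. Each is clipped independently;
# nothing is added across tasks before the learner sees it.
raw_rewards = {"task_a": 7.5, "task_b": -0.3, "task_c": -12.0}
clipped = {task: clip_reward(r) for task, r in raw_rewards.items()}
```

In a setup like this, every trajectory carries rewards from exactly one task, so the learner's batches mix trajectories from different tasks without ever combining their rewards into a single signal.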