Weights computation for MAML in RL Setting

Question

smiler80 opened this issue 5 years ago · 0 comments

Thank you for this very interesting work.

I have a question regarding section 6.3 "MAML in Supervised Learning".
In the supervised learning setting, Step 3 (the inner loop) is straightforward, but I am still not sure how to implement it in the reinforcement learning setting. There, Di consists of K trajectories, each of horizon H. How should theta'i be computed?

A- Once per trajectory, i.e. updated after each of the K trajectories?
B- Once, after all K trajectories have been collected?

In either case, do you have any guidance on how the losses or gradient-descent steps should be computed (and possibly aggregated) to obtain theta'i?
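
To make option B concrete, here is a minimal sketch of one possible reading of it, not a confirmed implementation. It assumes a REINFORCE-style loss (negative total return weighted by the log-probabilities of the taken actions) and a linear softmax policy. The per-trajectory gradients are averaged over the K trajectories, and a single gradient-descent step gives theta'i. The names `reinforce_grad` and `inner_update`, the step size `alpha`, and the policy form are all hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def reinforce_grad(theta, trajectory):
    """Gradient of the loss -R(tau) * sum_t log pi(a_t | s_t) for one trajectory.

    theta has shape (n_actions, state_dim); the policy is softmax(theta @ s).
    trajectory is a tuple (states, actions, rewards) of length-H arrays.
    """
    states, actions, rewards = trajectory
    total_return = rewards.sum()
    grad_log_pi = np.zeros_like(theta)
    for s, a in zip(states, actions):
        probs = softmax(theta @ s)
        one_hot = np.zeros_like(probs)
        one_hot[a] = 1.0
        # Gradient of log softmax w.r.t. theta: (one_hot - probs) outer s
        grad_log_pi += np.outer(one_hot - probs, s)
    return -total_return * grad_log_pi


def inner_update(theta, trajectories, alpha=0.1):
    """Option B (assumed): average the K per-trajectory gradients, then take one step."""
    grads = [reinforce_grad(theta, tau) for tau in trajectories]
    return theta - alpha * np.mean(grads, axis=0)


# Toy usage: K = 3 identical trajectories of horizon H = 4.
theta = np.zeros((2, 2))
tau = (np.array([[1.0, 0.0]] * 4), np.array([0, 0, 0, 0]), np.ones(4))
theta_prime = inner_update(theta, [tau, tau, tau], alpha=0.1)
print(theta_prime)
```

Under this reading, option A would instead call `inner_update` K times, each time with a single trajectory, so that theta'i is the result of K sequential steps rather than one averaged step.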

Best Regards,