Weights computation for MAML in RL Setting
smiler80 opened this issue · 0 comments
Hello @sudharsan13296
Thank you for this very interesting work.
I have a question regarding section 6.3 "MAML in Supervised Learning".
In the supervised learning setting, Step 3 (the inner loop) is quite obvious, but I'm still not sure how to implement it in the reinforcement learning setting. There, Di consists of K trajectories, each of horizon H. How should theta'i be computed?
A- With one update for each of the K trajectories?
B- With a single update at the end, once all K trajectories are available?
In either case, do you have an idea of how the losses and gradient-descent steps should be computed (and possibly aggregated) to obtain theta'i?
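To make the two options concrete, here is a minimal NumPy sketch of what I have in mind. Everything in it is an assumption for illustration only: the linear softmax policy, the REINFORCE-style surrogate loss, the synthetic trajectories, and the hypothetical helper names `trajectory_loss_grad`, `adapt_per_trajectory` and `adapt_on_batch`. It is not taken from the book's code.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def trajectory_loss_grad(theta, states, actions, returns):
    """Gradient of the surrogate loss -sum_t log pi(a_t|s_t) * G_t for one trajectory.

    Assumes a linear softmax policy: pi(.|s) = softmax(s @ theta).
    """
    probs = softmax(states @ theta)               # shape (H, n_actions)
    onehot = np.eye(theta.shape[1])[actions]      # shape (H, n_actions)
    # d/dtheta log pi(a_t|s_t) = outer(s_t, onehot_t - probs_t)
    weighted = (onehot - probs) * returns[:, None]
    return -(states.T @ weighted)                 # shape (obs_dim, n_actions)

def adapt_per_trajectory(theta, trajectories, alpha):
    """Option A: one gradient step after each of the K trajectories."""
    theta_i = theta.copy()
    for states, actions, returns in trajectories:
        # Note: the trajectories were sampled under theta, so every step
        # after the first one uses data from a slightly stale policy.
        theta_i = theta_i - alpha * trajectory_loss_grad(theta_i, states, actions, returns)
    return theta_i

def adapt_on_batch(theta, trajectories, alpha):
    """Option B: average the loss gradient over all K trajectories, then one step."""
    grads = [trajectory_loss_grad(theta, s, a, g) for s, a, g in trajectories]
    return theta - alpha * np.mean(grads, axis=0)

# Synthetic stand-in for Di: K trajectories of horizon H (no real environment).
rng = np.random.default_rng(0)
K, H, obs_dim, n_actions = 4, 5, 3, 2
theta = np.zeros((obs_dim, n_actions))
trajectories = [
    (rng.normal(size=(H, obs_dim)),               # states
     rng.integers(0, n_actions, size=H),          # actions
     rng.normal(size=H))                          # returns G_t
    for _ in range(K)
]

theta_a = adapt_per_trajectory(theta, trajectories, alpha=0.1)
theta_b = adapt_on_batch(theta, trajectories, alpha=0.1)
```

In option A the K gradients are applied sequentially, so theta'i depends on the order of the trajectories. In option B they are aggregated by a simple mean before a single step, so the order does not matter.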
Best Regards,