Hotwaterman/Hotwaterman.github.io

one-shot high-fidelity imitation | Jiao's blog


https://hotwaterman.github.io/post/one-shot-high-fidelity-imitation/

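Humans are experts at high-fidelity imitation and can reproduce a demonstrated action after a single attempt. This paper introduces MetaMimic, an off-policy RL algorithm. MetaMimic can learn 1) policies for high-fidelity one-shot imitation of diverse novel skills, and 2) to make the agent more efficient than the demonstrator...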