Energy-Based-Hindsight-Experience-Prioritization

Exploring different buffer sampling techniques to improve Hindsight Experience Replay (HER) on continuous-control robotic tasks with continuous action spaces and sparse rewards.

Primary language: Python

Energy-Based-Prioritization

Comparison between Hindsight Experience Replay and Energy-Based Hindsight Experience Prioritization, each averaged across 5 random seeds and trained on 16 CPUs (1 epoch = 40,000 iterations).

The results improve on the state-of-the-art HER implementation by Andrychowicz et al. (2017).
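
For readers unfamiliar with the idea, the following is a minimal sketch of energy-based trajectory prioritization, not the code in this repository. It assumes a simplified energy measure (only the increases in an object's potential energy, computed from a list of heights) and samples trajectories with probability proportional to that energy. The function names, the `mass`, `g`, `clip`, and `eps` parameters, and the height-only energy definition are all illustrative assumptions.

```python
import random


def trajectory_energy(heights, mass=1.0, g=9.81, clip=None):
    """Sum the positive potential-energy changes along one trajectory.

    `heights` is a hypothetical list of object heights, one per timestep.
    Only energy gained between consecutive steps is counted (assumption).
    """
    total = 0.0
    for prev, curr in zip(heights, heights[1:]):
        delta = mass * g * (curr - prev)
        total += max(delta, 0.0)
    if clip is not None:
        total = min(total, clip)  # optional cap on very energetic trajectories
    return total


def sampling_probabilities(energies, eps=1e-6):
    """Turn trajectory energies into a probability distribution.

    `eps` keeps zero-energy trajectories sampleable (assumption).
    """
    shifted = [e + eps for e in energies]
    norm = sum(shifted)
    return [e / norm for e in shifted]


def sample_trajectory_indices(energies, batch_size, rng):
    """Draw trajectory indices with probability proportional to energy."""
    probs = sampling_probabilities(energies)
    return rng.choices(range(len(energies)), weights=probs, k=batch_size)


if __name__ == "__main__":
    # Two toy trajectories: the object never moves vs. the object is lifted.
    energies = [
        trajectory_energy([0.0, 0.0, 0.0]),
        trajectory_energy([0.0, 0.1, 0.05, 0.2]),
    ]
    batch = sample_trajectory_indices(energies, 8, random.Random(0))
    print(energies, batch)
```

In this sketch the trajectory in which the object is lifted dominates the sampled batch, which is the intended effect: replay concentrates on episodes where the robot actually moved the object. Hindsight goal relabeling would then be applied to the sampled trajectories as in standard HER.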

Supported by the AWS Cloud Credits for Research program.