DMP_baselines

Baselines and comparison algorithms for background work on "Learning Cost Functions From Demonstrations for Contact-Rich Robot Manipulation"

Primary language: Python

This repository contains baseline implementations of several methods, used to better understand and compare against a proposed method for imitation learning via diffusion.

Adversarial Motion Priors

Cross Entropy Method (for optimization)

Kobilarov M. Cross-entropy motion planning. The International Journal of Robotics Research. 2012 Jun;31(7):855-71.

Botev ZI, Kroese DP, Rubinstein RY, L’Ecuyer P. The cross-entropy method for optimization. In Handbook of Statistics. 2013 Jan 1 (Vol. 31, pp. 35-59). Elsevier.

CEM Algorithm

  1. Assuming that actions are conditioned on the current state and are normally distributed, choose initial parameters $\mu^{(0)}$ and $\sigma^{(0)}$; set $t = 0$.

  2. Sample $N$ actions $X_1, X_2, \ldots, X_N$ from a Gaussian distribution with mean $\mu^{(t)}$ and standard deviation $\sigma^{(t)}$.

  3. Select the best $N_e$ samples (the elites) and use them to compute the updated parameters $\mu^{(t+1)}, \sigma^{(t+1)}$ (this can also be done recursively).

  4. Stop if convergence criteria are satisfied; otherwise, increase $t$ by 1 and repeat from step 2.
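
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in `cem_cartpole.py`: it assumes a diagonal Gaussian, treats "best" as lowest cost, and uses a toy quadratic cost in place of a simulator rollout. The names `cem_minimize`, `cost_fn`, `n_samples`, `n_elite`, and `tol` are hypothetical and chosen only for this example.

```python
import numpy as np


def cem_minimize(cost_fn, mu0, sigma0, n_samples=64, n_elite=8,
                 max_iters=50, tol=1e-6, seed=0):
    """Minimize cost_fn with a diagonal-Gaussian cross-entropy method.

    Hypothetical helper for illustration; not part of this repository.
    """
    rng = np.random.default_rng(seed)
    # Step 1: initial mean and standard deviation of the sampling distribution.
    mu = np.atleast_1d(np.asarray(mu0, dtype=float))
    sigma = np.atleast_1d(np.asarray(sigma0, dtype=float))

    for _ in range(max_iters):
        # Step 2: sample N candidates from N(mu, diag(sigma^2)).
        samples = rng.normal(mu, sigma, size=(n_samples, mu.size))

        # Step 3: keep the N_e lowest-cost samples and refit the Gaussian.
        costs = np.array([cost_fn(x) for x in samples])
        elites = samples[np.argsort(costs)[:n_elite]]
        mu = elites.mean(axis=0)
        sigma = elites.std(axis=0)

        # Step 4: stop once the sampling distribution has collapsed.
        if np.all(sigma < tol):
            break

    return mu, sigma


# Toy usage: recover the minimizer of a quadratic cost (assumed example).
target = np.array([1.0, -2.0])
mu, sigma = cem_minimize(lambda x: np.sum((x - target) ** 2),
                         mu0=np.zeros(2), sigma0=2.0 * np.ones(2))
```

Refitting the standard deviation directly from the elites can shrink the distribution too quickly on harder problems. A common remedy is to blend the new parameters with the previous ones (a smoothed update), which may be what "done recursively" in step 3 refers to.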

CEM Implementation w/ IsaacGym

To run the Isaac Gym version of CEM for cartpole, execute `python cem_cartpole.py` within the `rlgpu` conda env.