
Baselines and comparison algorithms for background work on "Learning Cost Functions From Demonstrations for Contact-Rich Robot Manipulation"

This repository contains baseline implementations of several methods, used to better understand and compare against a proposed method for imitation learning via diffusion.

Adversarial Motion Priors

Cross Entropy Method (for optimization)

CEM Algorithm

  1. Assuming that actions are conditioned on the current state and are normally distributed, choose initial parameters $\mu^{(0)}$ and $\sigma^{(0)}$; set $t = 0$.

  2. Sample $N$ actions $X_1, X_2, \ldots, X_N$ from a Gaussian distribution with mean $\mu^{(t)}$ and standard deviation $\sigma^{(t)}$.

  3. Select the best $N_e$ samples (the elite set) and use them to compute the updated parameters $\mu^{(t+1)}, \sigma^{(t+1)}$ (this can also be done recursively).

  4. Stop if convergence criteria are satisfied; otherwise, increase $t$ by 1 and repeat from step 2.
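The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation in this repository: the function name `cem`, its parameters (`n_samples`, `n_elite`, `n_iters`, `tol`), the choice of "lowest cost is best", and the convergence test on $\sigma$ are all hypothetical choices made for the example. The state conditioning mentioned in step 1 is omitted, so the sketch optimizes a single action vector.

```python
import numpy as np


def cem(cost_fn, mu0, sigma0, n_samples=100, n_elite=10, n_iters=50, tol=1e-3, seed=0):
    """Minimize cost_fn over action vectors with a basic cross-entropy method.

    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Step 1: initial parameters of the sampling distribution.
    mu = np.atleast_1d(np.asarray(mu0, dtype=float))
    sigma = np.atleast_1d(np.asarray(sigma0, dtype=float))
    for _ in range(n_iters):
        # Step 2: sample N candidate actions from N(mu, sigma^2).
        samples = rng.normal(mu, sigma, size=(n_samples, mu.size))
        # Step 3: keep the N_e lowest-cost samples and refit the Gaussian to them.
        costs = np.array([cost_fn(x) for x in samples])
        elite = samples[np.argsort(costs)[:n_elite]]
        mu, sigma = elite.mean(axis=0), elite.std(axis=0)
        # Step 4: stop once the sampling distribution has collapsed (assumed criterion).
        if np.all(sigma < tol):
            break
    return mu, sigma


# Toy usage: a quadratic cost whose minimum is at `target`.
target = np.array([1.0, -2.0])
best_mu, best_sigma = cem(
    lambda a: float(np.sum((a - target) ** 2)),
    mu0=np.zeros(2),
    sigma0=np.full(2, 2.0),
)
```

The initial $\sigma^{(0)}$ should be wide enough to cover the region where good actions are expected, because the distribution shrinks at every iteration and can collapse before the mean reaches the optimum.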

CEM Implementation w/ IsaacGym

To run the IsaacGym version of CEM for cartpole, execute `python cem_cartpole.py` within the `rlgpu` conda environment.