
Example CEM implementation with ReLAx

This repository contains an implementation of the cross-entropy method (CEM) with ReLAx.
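For readers unfamiliar with CEM, the core loop can be sketched roughly as follows. This is a generic, standard-library illustration and does not use the ReLAx API; the names `cem_maximize` and `toy_score`, the hyperparameter values, and the toy objective are all hypothetical. In a model-based setting, a candidate would typically be a sequence of actions and the score would be the return predicted by a learned environment model, but that is an assumption about the setup rather than a description of the notebook's code.

```python
import random
import statistics


def cem_maximize(score_fn, dim, n_iters=20, pop_size=64, n_elites=8,
                 init_std=2.0, seed=0):
    """Maximize score_fn over R^dim with a diagonal-Gaussian CEM (illustrative)."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    for _ in range(n_iters):
        # 1. Sample a population of candidates from the current Gaussian.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(pop_size)
        ]
        # 2. Keep the highest-scoring candidates (the "elites").
        elites = sorted(candidates, key=score_fn, reverse=True)[:n_elites]
        # 3. Refit the Gaussian to the elites, one dimension at a time.
        columns = list(zip(*elites))
        mean = [statistics.fmean(col) for col in columns]
        # A small constant keeps the standard deviation strictly positive.
        std = [statistics.pstdev(col) + 1e-6 for col in columns]
    return mean


# Toy objective (hypothetical): negative squared distance to a fixed target.
target = [1.0, -2.0, 0.5]


def toy_score(x):
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, target))


best = cem_maximize(toy_score, dim=len(target))
print(best)
```

The sampling distribution narrows around high-scoring regions with each iteration, so the returned mean should end up near the maximizer of the toy objective.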

The CEM actor was trained on the HalfCheetah-v2 MuJoCo Gym environment for 50k environment steps.

The graph of average return vs training step is shown below (batch_size=5000):


The graph below compares actual rewards with the rewards predicted by the fitted environment model:


Resulting Policy:
