/trpo_schedule_kl

Scheduling TRPO's KL Divergence Constraint

Primary language: Jupyter Notebook

Scheduling Parameters with ReLax (TRPO step KL divergence)

This repository demonstrates parameter scheduling in ReLax, using the step KL divergence of TRPO as the scheduled parameter. The plot below compares the theoretical (scheduled) step KL divergence with the actual one (derived by estimating the Fisher-vector product) for the TRPO-GAE algorithm. The schedule is sub-optimal in terms of training performance and is built for demonstration purposes only.

[Figure: kl_div_plot, scheduled versus actual step KL divergence]
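
To make the relationship between the scheduled and the actual step KL divergence concrete, here is a minimal NumPy sketch. It does not use the ReLax API and is not the code from the notebook: `linear_kl_schedule`, `scale_step_to_kl`, `quadratic_kl`, the schedule endpoints, and the toy matrix `F` are all hypothetical and chosen for illustration only. It relies on the standard TRPO second-order approximation, in which the KL divergence of a step `s` is estimated as `0.5 * s^T F s`, with `F` the Fisher information matrix accessed through a Fisher-vector product.

```python
import numpy as np


def linear_kl_schedule(step, total_steps, kl_start=0.01, kl_end=0.001):
    """Hypothetical schedule: linearly interpolate the KL bound over training."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return kl_start + frac * (kl_end - kl_start)


def quadratic_kl(step_vec, fvp):
    """Second-order KL estimate 0.5 * s^T F s, using a Fisher-vector product."""
    return 0.5 * step_vec @ fvp(step_vec)


def scale_step_to_kl(direction, fvp, delta):
    """Rescale a search direction so its quadratic KL estimate equals delta."""
    shs = direction @ fvp(direction)
    return np.sqrt(2.0 * delta / shs) * direction


# Toy symmetric positive-definite matrix standing in for the Fisher matrix.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
F = A @ A.T + 1e-3 * np.eye(4)


def fvp(v):
    # Assumption: explicit matrix product; a real agent estimates F @ v
    # from sampled trajectories without forming F.
    return F @ v


g = rng.normal(size=4)             # stand-in for the policy gradient
direction = np.linalg.solve(F, g)  # TRPO usually approximates this with conjugate gradient

for iteration in (0, 50, 100):
    delta = linear_kl_schedule(iteration, total_steps=100)
    step = scale_step_to_kl(direction, fvp, delta)
    print(iteration, delta, quadratic_kl(step, fvp))
```

In this toy setting the quadratic estimate matches the scheduled value up to floating-point error, because the Fisher matrix is known exactly. In actual training the Fisher-vector product is estimated from samples and the step may be shortened by a line search, so the actual KL divergence can deviate from the schedule.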