Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling

This is a PyTorch implementation of SfBC: Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling.

* For diffusion-based offline RL, we recommend trying our subsequent work, QGPO (paper; GitHub). Compared with SfBC, QGPO is more computationally efficient and performs noticeably better.

Requirements

The conda requirements are listed in requirements.yml.

Quick Start

Train the behavior model:

$ python3 train_behavior.py

Train the critic model and plot evaluation scores with TensorBoard:

$ python3 train_critic.py

Evaluation only:

$ python3 evaluation.py
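
The three commands above can also be chained from a single Python driver. The following is only a minimal sketch, not part of the repository: it assumes the scripts are run from the repository root, in the order listed, with their default arguments, and that evaluation is run after training. The names build_commands and run_pipeline are hypothetical helpers introduced here for illustration.

```python
import subprocess
import sys

# Script names taken from the Quick Start section, in the order listed there.
STAGES = ["train_behavior.py", "train_critic.py", "evaluation.py"]


def build_commands(python=sys.executable, stages=STAGES):
    """Return one argv list per stage, e.g. ["python3", "train_behavior.py"]."""
    return [[python, script] for script in stages]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    With dry_run=False, check=True makes a failing stage raise
    CalledProcessError, so later stages do not run on top of a failed one.
    """
    for cmd in commands:
        print("$ " + " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default (an assumption made for safety in this sketch):
    # it only prints the commands. Pass dry_run=False to launch training.
    run_pipeline(build_commands(), dry_run=True)
```

Keeping the dry run as the default makes it possible to check the command list before starting a long training job.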

Citing

If you find this code release useful, please cite it in your paper:

@inproceedings{
chen2023offline,
title={Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling},
author={Huayu Chen and Cheng Lu and Chengyang Ying and Hang Su and Jun Zhu},
booktitle={The Eleventh International Conference on Learning Representations},
year={2023},
}
