
RLHF-Blender

Warning

RLHF-Blender is still in preview mode and may therefore contain bugs or fail to run out of the box on every system. We are working on a stable release.

Implementation for RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback. Paper: https://arxiv.org/abs/2308.04332 (presented at the ICML 2023 Interactive Learning from Implicit Human Feedback Workshop)

Website + Demo: https://sites.google.com/view/rlhfblender

Documentation: https://rlhfblender.readthedocs.io/en/latest/

Note

This repository is part of the RLHF-Blender project. The frontend lives in a separate repository, RLHF-Blender-UI; follow the installation instructions below to install the frontend as well.

Installation

  1. Clone the repository
git clone https://github.com/ymetz/rlhfblender.git
cd rlhfblender
git submodule update --init rlhfblender-ui

to get both the main repository and the user interface. If you want to download the repository together with the demo models, you can instead run git clone --recurse-submodules https://github.com/ymetz/rlhfblender.git.

  2. Docker installation
docker-compose up

  3. Optional: Local/dev install

pip install -e .
python rlhfblender/app.py

and

cd rlhfblender-ui
npm install
npm run start

The user interface is then available at http://localhost:3000

📦 Features

RLHF-Blender lets you configure experimental setups for RLHF experiments from several modular components:

  • A freely configurable user interface for different feedback-type interactions
  • Feedback processors that translate different types of feedback, including metadata, into a common format (a hypothetical sketch follows this list)
  • Adapters for different reward models (e.g. reward model ensembles, AIRL-style models)
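
The sketch below is a minimal illustration of the processor idea, not the actual RLHF-Blender API: all names (CommonFeedback, process_rating, process_comparison) are hypothetical and only show how heterogeneous feedback events could be normalized into a single record format.

# A minimal sketch, assuming nothing about the real API: these class and
# function names are hypothetical and only illustrate how a feedback
# processor might map diverse feedback types onto one common schema.
from dataclasses import dataclass, field
from typing import Any, Dict, List, Literal

FeedbackType = Literal["rating", "comparison", "correction"]

@dataclass
class CommonFeedback:
    """Normalized feedback record consumed by downstream reward models."""
    episode_id: str
    feedback_type: FeedbackType
    # Scalar target in [-1, 1], e.g. a rescaled rating or +1/-1 for the
    # preferred/rejected side of a pairwise comparison.
    target: float
    metadata: Dict[str, Any] = field(default_factory=dict)

def process_rating(episode_id: str, rating: int, scale: int = 5) -> CommonFeedback:
    """Map a 1..scale star rating to a scalar in [-1, 1]."""
    target = 2.0 * (rating - 1) / (scale - 1) - 1.0
    return CommonFeedback(episode_id, "rating", target, {"raw_rating": rating})

def process_comparison(preferred_id: str, rejected_id: str) -> List[CommonFeedback]:
    """Expand a pairwise preference into two labeled records."""
    return [
        CommonFeedback(preferred_id, "comparison", +1.0, {"pair": rejected_id}),
        CommonFeedback(rejected_id, "comparison", -1.0, {"pair": preferred_id}),
    ]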

📖 Example

RLHF-Blender allows you to quickly set up experiments with different types of feedback and reward models across different environments. The following example shows how to set up an experiment for the CartPole environment with a reward model ensemble and a textual feedback interface.
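
As a rough, hypothetical sketch (the actual configuration schema is described in the documentation linked above; every key below is an assumption, not the real API), such an experiment might be declared like this:

# Hypothetical experiment configuration -- the real schema is documented
# at https://rlhfblender.readthedocs.io/en/latest/. All keys are assumptions.
experiment_config = {
    "environment": "CartPole-v1",          # Gymnasium environment id
    "feedback_types": ["rating", "text"],  # feedback interactions enabled in the UI
    "reward_model": {
        "type": "ensemble",                # e.g. an ensemble of reward heads
        "num_members": 5,
    },
    "sampling": {
        "episodes_per_batch": 8,           # episodes shown per feedback round
    },
}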

🎯 What's next

We hope to extend the functionality of RLHF-Blender in the future. If you are interested, feel free to contribute. Planned features include:

  • Support of additional environments
  • Support of additional feedback types (e.g. textual feedback)
  • Further improvements of user interface, analysis capabilities
  • Improved model training support

🛡 License

This project is licensed under the terms of the MIT license. See LICENSE for more details.

📃 Citation

@article{metz2023rlhf,
  title={RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback},
  author={Metz, Yannick and Lindner, David and Baur, Rapha{\"e}l and Keim, Daniel A. and El-Assady, Mennatallah},
  journal={arXiv preprint arXiv:2308.04332},
  year={2023},
  howpublished={\url{https://github.com/ymetz/rlhfblender}}
}