cat-stack-boop/Stable-Alignment
An efficient, effective, and stable alternative to RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society".
Language: Python. License: NOASSERTION.
No issues in this repository yet.