learning-from-human-feedback
There are 5 repositories under the learning-from-human-feedback topic.
forhaoliu/chain-of-hindsight
Chain-of-Hindsight, A Scalable RLHF Method
haozheji/exact-optimization
ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment
junchenzhi/Neural-Hidden-CRF
Code for the KDD-2023 paper: Neural-Hidden-CRF: A Robust Weakly-Supervised Sequence Labeler
junchenzhi/Awesome-Weak-Supervision-Sequence-Labeling
A curated list of awesome Weak-Supervision-Sequence-Labeling (WSSL) papers, methods & resources.
ja2la/Learning-Behaviors-with-Uncertain-Human-Feedback-using-Speech-Recognition
Learning Behaviors with Uncertain Human Feedback using Speech Recognition