/Prometheus2_SFT_distillation

Prometheus 2 is an alternative to GPT-4 evaluation when performing fine-grained evaluation of an underlying LLM and a reward model for Reinforcement Learning from Human Feedback (RLHF).

Primary Language: Jupyter Notebook
