/MCTS-DPO

This repository contains the source code for Self-Evaluation Guided MCTS for online DPO.

Primary language: Jupyter Notebook
License: Apache License 2.0 (Apache-2.0)
