/DialCoT

DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models

Primary Language: Python