Pinned Repositories
Step-DPO
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
LUOMU17's Repositories
LUOMU17 doesn’t have any repositories yet.