share sft-dataset

Question

yyht opened this issue 5 months ago · 4 comments

yyht commented 5 months ago

Hello, nice work. Could you share the SFT dataset on Hugging Face?

Answer 1 · 2024-06-28T05:49:46.000Z

Sure, it will be released soon. Please stay tuned.

Answer 2 · 2024-07-10T23:41:03.000Z

Hi authors, following up on this thread to stay updated when the SFT datasets are released. Thanks and nice work!

Answer 3 · 2024-08-19T03:10:02.000Z

Hi authors, this is nice work on advancing off-policy methods for enhancing the reasoning ability of LLMs. I am following up on this thread to stay updated when the SFT datasets are released. Thanks!

Answer 4 · 2024-08-20T02:53:06.000Z

Hello everyone, the dataset is available here: https://huggingface.co/datasets/yingyingzhang/metamath-qwen2-math
I used qwen2-math-instruct and open-source datasets such as metamath-qa and numina-cot to construct a high-quality SFT dataset.
When qwen2-general-base or qwen2-math-base is fine-tuned on it, the SFT model can achieve results comparable to qwen2-instruct-7b/72b and qwen2-math-7b-instruct.
The full dataset consists of metamath-qwen2-math plus the non-synthetic datasets from https://huggingface.co/datasets/AI-MO/NuminaMath-CoT.
Please enjoy it.
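
For anyone who wants to try the mix described above, here is a minimal sketch of loading both parts with the Hugging Face `datasets` library. It is not the authors' pipeline. It assumes that NuminaMath-CoT rows carry a `source` field and that the synthetic subsets can be recognized by a `synthetic` prefix in that field; the helper names `is_non_synthetic` and `load_sft_mix` are hypothetical, and the splits and columns of metamath-qwen2-math are not stated in this thread, so check the dataset cards before relying on it.

```python
def is_non_synthetic(record: dict) -> bool:
    """Return True for rows that should be kept.

    Assumption: each NuminaMath-CoT row has a 'source' field, and
    synthetic subsets are named with a 'synthetic' prefix.
    """
    return not str(record.get("source", "")).startswith("synthetic")


def load_sft_mix():
    """Download both parts of the mix (needs network and `pip install datasets`)."""
    # Imported lazily so the filter helper can be used without the library.
    from datasets import load_dataset

    # No split is given here because the thread does not name one;
    # this returns every split the repository provides.
    metamath = load_dataset("yingyingzhang/metamath-qwen2-math")

    # Assumption: the 'train' split holds the NuminaMath-CoT training data.
    numina = load_dataset("AI-MO/NuminaMath-CoT", split="train")
    numina = numina.filter(is_non_synthetic)
    return metamath, numina


if __name__ == "__main__":
    metamath, numina = load_sft_mix()
    print(metamath)
    print(numina)
```

Running the file prints the structure of both datasets, which is a quick way to confirm the actual column names before writing a fine-tuning script.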