imoneoi/openchat

Why does the trained model not produce the answer provided in the training data?


I followed the C-RLFT format described in the README and created this data:
{"items":[{"role":"user","content":"Where is PingHu?","weight":0.0},{"role":"assistant","content":"PingHu is near Shanghai.","weight":1.0}],"condition":"GPT4","system":""}
{"items":[{"role":"user","content":"Where is PingHu?","weight":0.0},{"role":"assistant","content":"I don't know.","weight":0.1}],"condition":"GPT3","system":""}

I trained the model for 20 epochs, saving a checkpoint every 2 epochs, but every checkpoint I tested produced the same answer as
the original model (imone/Mistral_7B_with_EOT_token). None of the trained models gave the expected answer: PingHu is near Shanghai.