After fine-tuning, the same checkpoint gives very different results on the same validation set in evaluate mode vs. deployment mode
BirderEric opened this issue · 0 comments
BirderEric commented
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
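After fine-tuning, using the same validation set and the same checkpoint, the predictions from the evaluate script differ greatly from the predictions obtained through deployment: accuracy is 83% and 39% respectively. Has anyone run into the same situation?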
Expected Behavior
No response
Steps To Reproduce
- ptuning with my own train.json
- predict dev.json with evaluate.sh using the checkpoint
- predict dev.json with the model.chat function using the same checkpoint
- the two runs give different results and different accuracy (83% vs. 39%)
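
To narrow down where the two runs diverge, it may help to line up both sets of predictions against the labels and list the examples on which they disagree. The following is only a minimal sketch: the function names `accuracy` and `compare_runs` are hypothetical, and it assumes that both runs have already been dumped to plain lists of strings in dev.json order and that accuracy means exact match after stripping whitespace, which may not be how the original 83% and 39% were computed.

```python
from typing import Dict, List


def accuracy(preds: List[str], labels: List[str]) -> float:
    """Exact-match accuracy after stripping surrounding whitespace (an assumption)."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if not labels:
        return 0.0
    hits = sum(p.strip() == y.strip() for p, y in zip(preds, labels))
    return hits / len(labels)


def compare_runs(
    labels: List[str], eval_preds: List[str], chat_preds: List[str]
) -> Dict[str, object]:
    """Score both runs and collect the indices where their outputs differ."""
    if not (len(labels) == len(eval_preds) == len(chat_preds)):
        raise ValueError("all three lists must have the same length")
    disagreements = [
        i
        for i, (a, b) in enumerate(zip(eval_preds, chat_preds))
        if a.strip() != b.strip()
    ]
    return {
        "eval_accuracy": accuracy(eval_preds, labels),
        "chat_accuracy": accuracy(chat_preds, labels),
        "disagreements": disagreements,
    }


if __name__ == "__main__":
    # Toy data only; replace with the real labels and the two prediction dumps.
    labels = ["a", "b", "c", "d"]
    eval_preds = ["a", "b", "c", "x"]
    chat_preds = ["a ", "y", "z", "x"]
    print(compare_runs(labels, eval_preds, chat_preds))
```

With the disagreeing indices in hand, the exact input string and the generation settings used for those examples in each path can be compared side by side. A gap this large often comes from the two paths building the prompt differently or decoding with different sampling settings, though that is a guess and not something confirmed in this report.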
Environment
- OS: Ubuntu 20.04
- Python: 3.9
- Transformers: 4.33.1
- PyTorch: 2.0.1
- CUDA Support: true
Anything else?
No response