Issues
Normalization issue with packing loss
#12 opened by Chandler-Bing
Question about ShareGPT data
#10 opened by S1s-Z
Question about packing loss
#8 opened by mxjmtxrm
How should a CUDA OOM in Needel_test be resolved?
#11 opened by SefaZeng
Prediction Results of LongBench-Chat
#9 opened by guanzhchen
How to get the length value in the dataset
#7 opened by Geaming2002
Questions about ChatGLM3-6b-128k
#2 opened by KaiLv69
What is the difference in loss between packing and direct batching?
#3 opened by BitVoyage