princeton-nlp/ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
Python, MIT License
Issues
- 1
Data Recipe
#2 opened by lwang2070 - 5
S3 Bucket Access
#7 opened by gokulr-cerebras - 1
Data Processing
#8 opened by jialianwww - 2
Fine-tuning 64k OOM
#10 opened by xjwhy - 1
- 3
Streaming Dataset
#12 opened by JialianW - 2
Multi-node Training
#9 opened by chtmp223 - 1
What's the change to Llama model?
#6 opened by wtzhang99 - 2
- 2
Specifying epochs instead of steps
#4 opened by lilakk - 1
- 0
Code Release
#1 opened by wangxidong06