/Long-context-transformers

Exploring finetuning public checkpoints on filtered 8K sequences on the Pile

Primary Language: Python
