PoSE

Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024)

Primary language: Python. License: MIT.