Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024)
Primary language: Python. License: MIT.