Pinned Repositories
CLEX
[ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models
flash-attention
Fast and memory-efficient exact attention
guanzhchen.github.io
PETuning
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
RULER
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
LongAlign
LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation