LCF

This repository contains the code for the paper "Enhancing Large Language Model Inference Efficiency via Lookahead Cache Filtering".

Env

pip install -r requirements.txt

Attention code

modify_llama.py

GMM

matmul_ops.cpp