Official implementation of "Mixture of In-Context Experts Enhance LLMs' Long Context Awareness" (MoICE).
Let’s take Llama2-7b-chat as an example.
Create a virtual environment and install the dependencies from requirements.txt:
pip install -r requirements.txt
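A minimal setup sketch, assuming Python 3 with the built-in venv module; the environment name moice-env is just a placeholder.

```bash
# Create and activate a fresh virtual environment (the name is arbitrary),
# then install the pinned dependencies.
python3 -m venv moice-env
source moice-env/bin/activate
pip install -r requirements.txt
```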
Replace the original modeling_llama.py in your transformers installation with our modeling_llama.py, which implements MoICE.
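One way to perform the replacement, assuming the MoICE modeling_llama.py sits at the repository root and that transformers is installed in the active environment; backing up the original file first is recommended.

```bash
# Locate the installed transformers package and back up its original
# LLaMA modeling file before overwriting it with the MoICE version.
TRANSFORMERS_DIR=$(python -c "import transformers, os; print(os.path.dirname(transformers.__file__))")
cp "$TRANSFORMERS_DIR/models/llama/modeling_llama.py" "$TRANSFORMERS_DIR/models/llama/modeling_llama.py.bak"
cp modeling_llama.py "$TRANSFORMERS_DIR/models/llama/modeling_llama.py"
```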
Update the paths in train.sh, then train Llama2-7b-chat with MoICE:
bash train.sh
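The variable names below are hypothetical placeholders, not necessarily those used in train.sh; the point is that the script's model, data, and output paths must point at your local copies before launching.

```bash
# Hypothetical placeholders -- open train.sh and set its path variables
# to your local equivalents of the following before running it:
#   MODEL_PATH:  local Llama-2-7b-chat-hf weights
#   DATA_PATH:   training data
#   OUTPUT_DIR:  where MoICE checkpoints are saved
bash train.sh
```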
We use the open long-context benchmark L-Eval as our main evaluation.
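For reference, a sketch of fetching the L-Eval benchmark; this assumes the OpenLMLab/LEval GitHub repository, and the actual dataset download and evaluation steps follow that project's own instructions.

```bash
# Fetch the L-Eval benchmark; see its README for dataset preparation and
# for the scripts that score your model's generated outputs.
git clone https://github.com/OpenLMLab/LEval.git
```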
@article{lin2024mixture,
  title={Mixture of In-Context Experts Enhance LLMs' Long Context Awareness},
  author={Lin, Hongzhan and Lv, Ang and Chen, Yuhan and Zhu, Chen and Song, Yang and Zhu, Hengshu and Yan, Rui},
  journal={arXiv preprint arXiv:2406.19598},
  year={2024}
}