Time Travel in LLMs: Tracing Data Contamination in Large Language Models

This repository hosts the codebase for implementing all the methods proposed in the paper entitled "Time Travel in LLMs: Tracing Data Contamination in Large Language Models," authored by Shahriar Golchin* and Mihai Surdeanu.

Explore more resources related to this paper: video, poster, and media.

Overview

Our research is the first to systematically uncover and detect data contamination in fully black-box large language models (LLMs). The core idea is that if an LLM has seen a dataset instance during its pre-training phase, it is able to replicate it. This is supported by two observations: (1) LLMs have enough capacity to memorize data, and (2) LLMs are trained to follow instructions effectively. However, explicitly asking LLMs to reproduce these instances is ineffective, since the safety filters built into LLMs to prevent the generation of copyrighted content are triggered. Our method circumvents these filters by prompting the LLM with a random-length initial segment of an instance and asking it to complete the rest. Below is an example of our strategy in action, in which the subsequent segment of an instance from the train split of the IMDB dataset is exactly replicated by GPT-4.
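
To make the idea concrete, here is a minimal sketch of how such a guided completion can be elicited and compared against the reference. It is illustrative only: the instruction wording is paraphrased, the review text is invented, and the call assumes the openai>=1.0 Python client rather than the repository's own pipeline.

# Illustrative sketch only -- not the repository's exact prompt or pipeline.
# Assumes the openai>=1.0 Python client and that OPENAI_API_KEY is exported.
import random

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A hypothetical instance standing in for one from the IMDB train split.
instance = (
    "This movie was a complete waste of time. The plot was predictable "
    "and the acting felt forced from the very first scene."
)

# Split the instance at a random point: the first piece goes into the prompt;
# the second piece is the reference the model should reproduce if contaminated.
words = instance.split()
cut = random.randint(1, len(words) - 1)
first_piece = " ".join(words[:cut])
reference_completion = " ".join(words[cut:])

guided_prompt = (
    "Instruction: You are provided with the first piece of an instance from "
    "the train split of the IMDB dataset. Finish the second piece of the "
    "instance exactly as it appeared in the dataset.\n"
    f"First piece: {first_piece}\n"
    "Second piece:"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": guided_prompt}],
    temperature=0,
)
generated_completion = response.choices[0].message.content

# An exact or near-exact match between the generated and reference completions
# is evidence that the instance was seen during pre-training.
print(generated_completion)
print(reference_completion)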

Getting Started

Installation

Start the process by cloning the repository using the command below:

git clone https://github.com/shahriargolchin/time-travel-in-llms.git

Make sure you are in the project's root directory. If not, navigate to it by executing the following command:

cd time-travel-in-llms

Next, establish a virtual environment:

python3.11 -m venv time-travel-venv

Now, activate your environment:

source time-travel-venv/bin/activate

Lastly, use pip to install all the requisite packages:

pip install -r requirements.txt

Important

Note that the command above installs the packages needed to run evaluations via ROUGE-L and GPT-4 in-context learning (ICL). Evaluation with BLEURT requires an additional installation step, since BLEURT is used as a dependency of this project. To set it up, execute the following commands (or follow the BLEURT repository), making sure it ends up in the dependencies/bleurt_scorer directory of this project. You may skip these steps if you do not need to perform evaluation using BLEURT.

git clone https://github.com/google-research/bleurt.git dependencies/bleurt_scorer
cd dependencies/bleurt_scorer
pip install .

Then, download the BLEURT model checkpoint by running the commands below. We used the BLEURT-20 checkpoint in our study, and the provided command downloads this particular checkpoint; you can use any other checkpoint from the list available here.

wget https://storage.googleapis.com/bleurt-oss-21/BLEURT-20.zip
unzip BLEURT-20.zip

If you do not have wget installed, you can use the following curl command instead:

curl -O https://storage.googleapis.com/bleurt-oss-21/BLEURT-20.zip
unzip BLEURT-20.zip
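
Once BLEURT and a checkpoint are installed, the scorer can be loaded as in the sketch below. This follows the public BLEURT API and assumes the checkpoint was unzipped inside dependencies/bleurt_scorer; it is not the repository's own evaluation code.

# Illustrative sketch of scoring with BLEURT (public BLEURT API); not the
# repository's own evaluation code. The path is relative to the project root --
# adjust it if you unzipped the checkpoint elsewhere.
from bleurt import score

checkpoint = "dependencies/bleurt_scorer/BLEURT-20"  # assumed checkpoint location
scorer = score.BleurtScorer(checkpoint)

references = ["and the acting felt forced from the very first scene."]
candidates = ["and the acting felt forced from the very first scene."]

scores = scorer.score(references=references, candidates=candidates)
print(scores)  # one float per pair; higher means the candidate is closer to the reference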

Experiments

For all the settings discussed in the paper, we provide the corresponding bash files in the scripts directory. Running these scripts detects data contamination for the examined subset of data. In the results directory, an individual text file is generated for each evaluation method (ROUGE-L, BLEURT, or GPT-4 ICL) showing pass/fail results for the detected contamination. The input CSV files, along with all intermediate results, are also stored in the corresponding subdirectories under the results directory.
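
For reference, the sketch below shows a stand-alone ROUGE-L overlap check of the kind these evaluations perform, using the rouge_score package. It is illustrative only: the threshold is a placeholder, not the decision rule used in the paper or in the scripts.

# Illustrative ROUGE-L overlap check (rouge_score package); the threshold is a
# placeholder and not the decision rule used by the scripts in this repository.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

reference_completion = "and the acting felt forced from the very first scene."
generated_completion = "and the acting felt forced from the very first scene."

result = scorer.score(reference_completion, generated_completion)
f1 = result["rougeL"].fmeasure

print(f"ROUGE-L F1: {f1:.3f}")
if f1 >= 0.8:  # placeholder threshold for illustration
    print("High overlap: consistent with a replicated (contaminated) instance.")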

Usage

Before running experiments, you need to export your OpenAI API key so that OpenAI models are accessible. You can do so with the following command:

export OPENAI_API_KEY=your-api-key
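
If you want to confirm the key is visible before launching an experiment, the optional check below can be run from the same shell (it is a sanity check only, not part of the pipeline):

# Optional sanity check -- not part of the repository's pipeline.
import os

assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set in this shell."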

To run an experiment, first navigate to the scripts/dataset-name directory, where the bash scripts for each partition of a dataset (e.g., train, test/validation) are located. You can do this with the command below (assuming you are in the root directory):

cd scripts/dataset-name

Once in the respective directory, make the bash file executable by running the following command:

chmod +x bash-file-name.sh

Finally, run the experiment by executing:

./bash-file-name.sh

Citation

If you find our work useful, please use only the following standard format when citing our paper:
@article{DBLP:journals/corr/abs-2308-08493,
  author       = {Shahriar Golchin and
                  Mihai Surdeanu},
  title        = {Time Travel in LLMs: Tracing Data Contamination in Large Language
                  Models},
  journal      = {CoRR},
  volume       = {abs/2308.08493},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2308.08493},
  doi          = {10.48550/ARXIV.2308.08493},
  eprinttype    = {arXiv},
  eprint       = {2308.08493},
  timestamp    = {Thu, 24 Aug 2023 12:30:27 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2308-08493.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

Further Reading on Data Contamination

If you are interested in the field of data contamination detection in LLMs, you might find our second paper, Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models (repo available here), particularly useful. In that paper, we present a novel method that not only detects contamination in LLMs but also estimates its amount in fully black-box LLMs. For reference, you can cite this paper using the standard citation format provided below:

@article{DBLP:journals/corr/abs-2311-06233,
  author       = {Shahriar Golchin and
                  Mihai Surdeanu},
  title        = {Data Contamination Quiz: {A} Tool to Detect and Estimate Contamination
                  in Large Language Models},
  journal      = {CoRR},
  volume       = {abs/2311.06233},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2311.06233},
  doi          = {10.48550/ARXIV.2311.06233},
  eprinttype    = {arXiv},
  eprint       = {2311.06233},
  timestamp    = {Wed, 15 Nov 2023 16:23:10 +0100},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2311-06233.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}