Can you provide the processed CSV files again?
LZY-the-boys opened this issue · 3 comments
Thanks for your awesome work! I am very interested in it, but the repo currently seems to be over its Git LFS quota, so I cannot download the files via LFS:
batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.
error: failed to fetch some objects from 'https://github.com/eth-sri/language-model-arithmetic.git/info/lfs'
Although you have provided the sources for the raw dataset files, the Politically Incorrect 4chan Messages dataset is too large (24 GB) for me to download. Could you please share the processed CSV files?
Thanks for pointing this out. We will fix this. In the meantime, please download the datasets from this link: https://polybox.ethz.ch/index.php/s/WdPq20k5GVqrqGW. You can unzip the folder and place it in the data/ folder, such that the path to each dataset becomes data/datasets/benchmark_name.csv.
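In case it helps, the unzip step described above could be scripted roughly as follows. This is only a sketch: the function name `extract_datasets` and the archive name used in the usage note are hypothetical, and it assumes the downloaded zip contains a top-level `datasets/` folder, which this thread does not confirm.

```python
import zipfile
from pathlib import Path


def extract_datasets(archive_path, data_dir="data"):
    """Unzip the downloaded archive into data_dir and return the CSV files found.

    Assumption: the archive has a top-level `datasets/` folder, so that the
    files end up at data/datasets/<benchmark_name>.csv.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # Extract everything in the archive below data_dir.
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(data_dir)

    # Collect the CSV files at the expected location (empty list if none exist).
    return sorted((data_dir / "datasets").glob("*.csv"))
```

Calling `extract_datasets("datasets.zip")` from the repository root (with whatever name the downloaded file actually has) returns the list of CSV paths under data/datasets/. An empty list would suggest that the archive layout differs from the assumption above.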
We have updated the instructions so that the processed dataset files are downloaded via our webpage (https://files.sri.inf.ethz.ch/language-model-arithmetic/) instead of Git LFS. This should fix your issue :)
Thank you very much!