dataset argument for qa.py not specified
Closed this issue · 2 comments
vkaul11 commented
The sample command for qa.py does not include the dataset argument (https://github.com/hsiehjackson/RULER/blob/main/scripts/data/synthetic/qa.py#L58), and I am getting the error below. Can you let me know what dataset should be? I suppose you pass it somewhere when you run things end to end?
(long-context) vivekkaul@Viveks-MacBook-Pro synthetic % python qa.py \
--save_dir=./ \
--save_name=qa \
--tokenizer_path=tokenizer.model \
--tokenizer_type=hf \
--max_seq_length=4096 \
--tokens_to_generate=128 \
--num_samples=10 \
--template="Answer the question based on the given documents. Only give me the answer and do not output any other words.\n\nThe following are given documents.\n\n{context}\n\nAnswer the question based on the given documents. Only give me the answer and do not output any other words.\n\nQuestion: {query} Answer:"
usage: qa.py [-h] --save_dir SAVE_DIR --save_name SAVE_NAME [--subset SUBSET] --tokenizer_path TOKENIZER_PATH [--tokenizer_type TOKENIZER_TYPE]
--max_seq_length MAX_SEQ_LENGTH --tokens_to_generate TOKENS_TO_GENERATE --num_samples NUM_SAMPLES [--pre_samples PRE_SAMPLES]
[--random_seed RANDOM_SEED] --template TEMPLATE [--remove_newline_tab] --dataset DATASET
qa.py: error: the following arguments are required: --dataset
hsiehjackson commented
We generate our dataset using prepare.py in here. If you want to use qa.py directly, you can set --dataset squad or --dataset hotpotqa. We use both for RULER.
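To illustrate why the command above fails and how adding --dataset resolves it, here is a minimal sketch of an argparse parser reconstructed from the usage message in this thread. It is not the actual qa.py source: the argument types, the default for --tokenizer_type, the shortened template, and the helper names build_parser and try_parse are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction from the usage message in this thread;
    # the real qa.py may declare these arguments differently.
    parser = argparse.ArgumentParser(prog="qa.py")
    parser.add_argument("--save_dir", required=True)
    parser.add_argument("--save_name", required=True)
    parser.add_argument("--tokenizer_path", required=True)
    parser.add_argument("--tokenizer_type", default="hf")  # default is assumed
    parser.add_argument("--max_seq_length", type=int, required=True)
    parser.add_argument("--tokens_to_generate", type=int, required=True)
    parser.add_argument("--num_samples", type=int, required=True)
    parser.add_argument("--template", required=True)
    # The argument missing from the sample command.
    parser.add_argument("--dataset", required=True)
    return parser


# The arguments from the failing command (template shortened for brevity).
BASE_ARGS = [
    "--save_dir=./",
    "--save_name=qa",
    "--tokenizer_path=tokenizer.model",
    "--tokenizer_type=hf",
    "--max_seq_length=4096",
    "--tokens_to_generate=128",
    "--num_samples=10",
    "--template=Question: {query} Answer:",
]


def try_parse(argv):
    """Return the parsed namespace, or None if argparse rejects the arguments."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints the usage message and exits when a required
        # argument is missing.
        return None


if __name__ == "__main__":
    # Without --dataset the parser rejects the command.
    print(try_parse(BASE_ARGS))
    # With --dataset squad (or hotpotqa) parsing succeeds.
    print(try_parse(BASE_ARGS + ["--dataset", "squad"]).dataset)
```

In the same way, appending --dataset squad or --dataset hotpotqa as one more line to the original shell command should get past the argument error.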
vkaul11 commented
Thanks a lot!