HKUST-KnowComp/Knowledge-Constrained-Decoding

More instructions for train_t5_token_classifier?

Hi authors,

I really appreciate your great work. I am trying to run train_t5_token_classifier.sh for FUDGE.

Could you kindly add more instructions on how to generate $DATADIR/wow_dev_unseen_augmented?

train_data_path=$DATADIR/wow_train_augmented
validation_data_path=$DATADIR/wow_dev_unseen_augmented

I have run the commands below and got wow_train_augmented_neg_google-flan-t5-xl and wow_train_augmented_neg_random.

bash scripts/shell/data_process/partial_neg_gen.sh 0 wow 16
bash scripts/shell/data_process/random_neg.sh wow

Thanks!

In those two scripts, you can change the data_options variable from train.jsonl to dev_unseen.jsonl to produce the augmented dev dataset. Run preprocess.sh first so that the dev_unseen.jsonl file exists.
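
If you would rather not edit the scripts in place, the change above can be sketched as a small helper that writes a dev copy of a script. This is only a sketch: it assumes the split file name appears literally as `train.jsonl` inside each script, and `make_dev_variant` and the `*_dev.sh` file names are hypothetical, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a data-processing script, replacing every
# literal "train.jsonl" with "dev_unseen.jsonl". The original is untouched.
# Assumption: data_options names the split file literally as train.jsonl.
make_dev_variant() {
  local src="$1"   # existing script, e.g. scripts/shell/data_process/random_neg.sh
  local dst="$2"   # where to write the dev copy (hypothetical name)
  sed 's/train\.jsonl/dev_unseen.jsonl/g' "$src" > "$dst"
}

# Example usage (run from the repository root, after preprocess.sh):
#   make_dev_variant scripts/shell/data_process/random_neg.sh \
#                    scripts/shell/data_process/random_neg_dev.sh
#   bash scripts/shell/data_process/random_neg_dev.sh wow
```

Check the generated copy before running it, since the scripts may reference the split name in other places (for example, in output paths) that this substitution does not cover.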