RACo: A repository from wyu97

Code for Retrieval Augmentation for Commonsense Reasoning

Introduction of RACo

This repository contains the official resources for our EMNLP 2022 paper "Retrieval Augmentation for Commonsense Reasoning: A Unified Approach" [arXiv].

Step0 Download the Commonsense Corpus

Corpus (20M): Google drive [link]
Code: Official DPR code [link]
- First, run python merge-corpus.py to construct the corpus.
- Then modify the retrieval corpus path in the DPR code above.

Step1 Training: Commonsense Retriever

Training Data: Google drive [link]

Code: Official DPR code, same as above.

Modify the training data path in the DPR config as follows:

raco_train:
    _target_: dpr.data.biencoder_data.JsonQADataset
    file: {your folder path}/train.json

raco_dev:
    _target_: dpr.data.biencoder_data.JsonQADataset
    file: {your folder path}/dev.json
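Before launching training, it can help to sanity-check that train.json and dev.json match what JsonQADataset reads. The sketch below is not part of RACo or DPR; the helper names are hypothetical, and the field names (question, positive_ctxs, text) are assumed from the training-data format described in the DPR README. Adjust it if the downloaded files differ.

```python
import json

# Keys assumed to be required per example, following the DPR README format.
REQUIRED_KEYS = ("question", "positive_ctxs")


def check_dpr_examples(examples):
    """Return a list of (index, problem) pairs for malformed examples."""
    problems = []
    for i, ex in enumerate(examples):
        for key in REQUIRED_KEYS:
            if key not in ex:
                problems.append((i, f"missing key: {key}"))
        positives = ex.get("positive_ctxs", [])
        # Every positive context needs passage text for the bi-encoder.
        for ctx in positives:
            if "text" not in ctx:
                problems.append((i, "positive context without 'text'"))
        # Examples without positives cannot contribute to training.
        if "positive_ctxs" in ex and not positives:
            problems.append((i, "no positive contexts"))
    return problems


def check_dpr_file(path):
    """Load a JSON file (a list of examples) and check every example."""
    with open(path, encoding="utf-8") as f:
        return check_dpr_examples(json.load(f))
```

Calling check_dpr_file on the path you put in the config should return an empty list when every example is usable.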

Step1 Inference: Retrieve Documents

Inference Data: Google drive [link]

Code: Official DPR code, same as above.

Modify the inference data path in the DPR config as follows:

{dataset}_train:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: {your folder path}/{dataset}/train.tsv

{dataset}_dev:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: {your folder path}/{dataset}/dev.tsv

{dataset}_test:
    _target_: dpr.data.retriever_data.CsvQASrc
    file: {your folder path}/{dataset}/test.tsv
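To preview what the retriever will see, the TSV files can be read with a few lines of Python. This is a minimal sketch, assuming the files follow the layout that DPR's CsvQASrc reads by default: the question in the first column and a Python-style list of answers in the second. That layout is an assumption about the RACo downloads, and read_qa_tsv is a hypothetical helper, not a DPR function.

```python
import ast
import csv
import io


def read_qa_tsv(handle):
    """Yield (question, answers) pairs from a tab-separated file object."""
    for row in csv.reader(handle, delimiter="\t"):
        question = row[0]
        # The answers column is assumed to hold a list literal such as ['a', 'b'].
        answers = ast.literal_eval(row[1])
        yield question, answers


# In-memory example row; replace with open(path, encoding="utf-8", newline="").
sample = io.StringIO("where do you put a plate?\t['cupboard', 'table']\n")
pairs = list(read_qa_tsv(sample))
```

The sketch uses ast.literal_eval, which only parses literals, so a malformed answers column raises an error instead of being executed.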

Step2 Training and Inference: Commonsense Reader

Training Data: obtained from the last step
Code: Official FiD code [link]

Step2: FiD Outputs Evaluation

Accuracy is the same as exact match in FiD code.
BLEU and ROUGE are computed with the code from the CommonGen GitHub repo.
- Common issues when installing the library: [link]
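For reference, accuracy computed as exact match can be sketched as below. This follows the common SQuAD-style normalization (lowercase, strip punctuation, drop articles, collapse whitespace), which the FiD evaluation code is assumed to use. The function names here are illustrative, so use the FiD code itself for reported numbers.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, strip punctuation and articles, and collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, references):
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def accuracy(predictions, references_per_example):
    """Fraction of predictions that exactly match one of their references."""
    hits = [exact_match(p, refs)
            for p, refs in zip(predictions, references_per_example)]
    return sum(hits) / len(hits) if hits else 0.0
```

With predictions ["The cupboard.", "sink"] and references [["cupboard"], ["table"]], accuracy returns 0.5: the first prediction matches after normalization and the second does not.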

Citation

@inproceedings{yu2022retrieval,
  title={Retrieval Augmentation for Commonsense Reasoning: A Unified Approach},
  author={Yu, Wenhao and Zhu, Chenguang and Zhang, Zhihan and Wang, Shuohang and Zhang, Zhuosheng and Fang, Yuwei and Jiang, Meng},
  booktitle={Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2022}
}

Please cite our paper if you find the paper and the code helpful.
