This repo contains the code and data for the causality extraction task.
- Python 3.6
- PyTorch 1.7.1
- CUDA 10.1
- Transformers 3.5.0
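To confirm that an environment matches the versions listed above, a small check along these lines may help. It is only a sketch: the helper names `matches` and `check_environment` are made up for this example and are not part of the repo, and it assumes the packages are importable as `torch` and `transformers`. The CUDA version is not checked.

```python
import sys

# Versions listed in this README.
REQUIRED = {"torch": "1.7.1", "transformers": "3.5.0"}


def matches(installed, required):
    """Return True if `installed` equals `required`, ignoring a local
    build tag such as the "+cu101" in "1.7.1+cu101"."""
    return installed.split("+")[0] == required


def check_environment():
    """Return a list of human-readable mismatches (empty if none)."""
    problems = []
    if sys.version_info[:2] != (3, 6):
        problems.append("python: found %d.%d, expected 3.6" % sys.version_info[:2])
    for name, required in REQUIRED.items():
        try:
            module = __import__(name)
        except ImportError:
            problems.append("%s: not installed, expected %s" % (name, required))
            continue
        installed = getattr(module, "__version__", "unknown")
        if not matches(installed, required):
            problems.append("%s: found %s, expected %s" % (name, installed, required))
    return problems
```

Calling `check_environment()` and printing each returned line shows which components differ from the tested setup. Other versions may still work; the list above only records what the code was run with.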
https://drive.google.com/drive/folders/1DGmljtCkvNY2PUzZH94G74agPNyMoKPi?usp=sharing
The 5-fold splits for FinCausal2020 and FinCausal2021 are available at the link above. Use data_prep.py to convert them into the format our code expects.
python3.6 data_prep.py in_file_csv out_file_json out_bert_file_json bert_tokenizer_name
in_file_csv: a CSV file from the splits
out_file_json: intermediate non-BERT JSON file
out_bert_file_json: JSON file with BERT tokens. This file is used by cepn.py to train and test the model.
bert_tokenizer_name: bert-base-cased or bert-large-cased
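Since the splits contain several CSV files, the conversion can be scripted. The sketch below builds the data_prep.py command shown above for every CSV file under a directory. The output naming scheme (`name.json` and `name_bert.json` next to the input) and the helper names are assumptions made for this example; check what file names cepn.py and the config files expect before relying on them.

```python
import subprocess
from pathlib import Path

# Tokenizer names accepted according to this README.
TOKENIZERS = ("bert-base-cased", "bert-large-cased")


def data_prep_cmd(in_csv, tokenizer="bert-base-cased"):
    """Build the data_prep.py command for one CSV file.

    Output names are a hypothetical convention: <stem>.json and
    <stem>_bert.json in the same directory as the input file.
    """
    if tokenizer not in TOKENIZERS:
        raise ValueError("unexpected tokenizer: %s" % tokenizer)
    in_csv = Path(in_csv)
    out_json = in_csv.with_suffix(".json")
    out_bert_json = in_csv.with_name(in_csv.stem + "_bert.json")
    return ["python3.6", "data_prep.py", str(in_csv),
            str(out_json), str(out_bert_json), tokenizer]


def convert_all(split_dir, tokenizer="bert-base-cased", dry_run=True):
    """Convert every CSV file under split_dir; only print when dry_run."""
    for csv_path in sorted(Path(split_dir).glob("**/*.csv")):
        cmd = data_prep_cmd(csv_path, tokenizer)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Running `convert_all("splits")` with the default `dry_run=True` prints the commands without executing them, which makes it easy to review the paths first. Use `bert-large-cased` when preparing data for CEPN_Large.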
CEPN_Base
python3.6 cepn.py FinCausal2020 config.ini target_dir train5fold
python3.6 cepn.py FinCausal2021 config.ini target_dir train5fold
CEPN_Large
python3.6 cepn.py FinCausal2020 config_large.ini target_dir train5fold
python3.6 cepn.py FinCausal2021 config_large.ini target_dir train5fold
target_dir: directory where the model and other output files are saved.
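The four training commands above differ only in the dataset name and the config file, so they can be generated in a loop. This is a minimal sketch: `train_cmd`, `run_all`, and the `<output_root>/<dataset>_<size>` directory layout are hypothetical choices for this example, not something the repo prescribes.

```python
import subprocess

# Config files named in this README for each model size.
CONFIGS = {"base": "config.ini", "large": "config_large.ini"}
DATASETS = ("FinCausal2020", "FinCausal2021")


def train_cmd(dataset, size, target_dir):
    """Build the 5-fold training command for one dataset and model size."""
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: %s" % dataset)
    if size not in CONFIGS:
        raise ValueError("unknown model size: %s" % size)
    return ["python3.6", "cepn.py", dataset, CONFIGS[size],
            target_dir, "train5fold"]


def run_all(output_root, dry_run=True):
    """Train every dataset/size pair, each in its own target directory."""
    for size in sorted(CONFIGS):
        for dataset in DATASETS:
            # Hypothetical layout: one output directory per run.
            target_dir = "%s/%s_%s" % (output_root, dataset, size)
            cmd = train_cmd(dataset, size, target_dir)
            if dry_run:
                print(" ".join(cmd))
            else:
                subprocess.run(cmd, check=True)
```

Keeping a separate target directory per run avoids one model's output files overwriting another's. With `dry_run=True` the function only prints the commands.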
If you use the source code or models from this work, please cite our paper:
@inproceedings{nayak2022cepn,
author = {Tapas Nayak and Soumya Sharma and Yash Butala and Koustuv Dasgupta and Pawan Goyal and Niloy Ganguly},
title = {A Generative Approach for Financial Causality Extraction},
booktitle = {Proceedings of The 2nd Workshop on Financial Technology on the Web (FinWeb)},
year = {2022}
}