Re-implementation of the paper "Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words?" (ACL 2020).
$ poetry install
- The Yelp dataset is large, so we split it into training, validation, and test subsets in advance.
- After splitting, tng.jsonl, val.jsonl, and tst.jsonl are available in the data directory.
$ allennlp split-dataset \
--input-file data/yelp_academic_dataset_review.json \
--output-dir data/ \
--tng-ratio 0.8 \
--val-ratio 0.1 \
--tst_ratio 0.1
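For readers who want to see what a ratio-based split amounts to, here is a minimal sketch in plain Python. It is not the repository's split-dataset implementation: the function name split_jsonl, the fixed seed, and the shuffle-then-slice strategy are all assumptions made for illustration. It also reads the whole file into memory, which may be impractical for the full Yelp review file.

```python
import random
from pathlib import Path


def split_jsonl(input_file, output_dir, tng_ratio=0.8, val_ratio=0.1, tst_ratio=0.1, seed=13):
    """Hypothetical helper: shuffle a JSON-lines file and write three subsets."""
    if abs(tng_ratio + val_ratio + tst_ratio - 1.0) > 1e-6:
        raise ValueError("ratios must sum to 1")

    # Keep non-empty lines and make sure each one ends with a newline,
    # so shuffled lines never run together in the output files.
    with open(input_file, encoding="utf-8") as f:
        lines = [line.rstrip("\n") + "\n" for line in f if line.strip()]

    # Assumption: a seeded shuffle, so the split is reproducible.
    random.Random(seed).shuffle(lines)

    n_tng = int(len(lines) * tng_ratio)
    n_val = int(len(lines) * val_ratio)
    subsets = {
        "tng.jsonl": lines[:n_tng],
        "val.jsonl": lines[n_tng:n_tng + n_val],
        "tst.jsonl": lines[n_tng + n_val:],  # the test set takes the remainder
    }

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, subset in subsets.items():
        with open(out / name, "w", encoding="utf-8") as f:
            f.writelines(subset)

    # Return the number of examples written to each file.
    return {name: len(subset) for name, subset in subsets.items()}
```

Calling split_jsonl("data/yelp_academic_dataset_review.json", "data/") would mirror the paths used in the command above. Because the test set takes whatever is left after the first two slices, no line is dropped to rounding.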
$ allennlp preprocess-ham-dataset \
--ham-dataset-dir data/ham-dataset/raw_data/ \
--output-dir data/
$ CUDA_VISIBLE_DEVICES=0 allennlp train config/base.jsonnet -s outputs -o '{"trainer": {"cuda_device": 0}}'
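The -o argument is a JSON string, and hand-quoting it in the shell is easy to get wrong. The sketch below builds the same training command from Python, using only the standard library. The helper name build_train_command is hypothetical and not part of this repository or AllenNLP, and the CUDA_VISIBLE_DEVICES prefix is left out.

```python
import json
import shlex


def build_train_command(config_path, serialization_dir, cuda_device):
    """Hypothetical helper: assemble the training command as a shell-safe string."""
    # Serialize the overrides as JSON instead of typing the quotes by hand.
    overrides = json.dumps({"trainer": {"cuda_device": cuda_device}})
    args = ["allennlp", "train", config_path, "-s", serialization_dir, "-o", overrides]
    # shlex.quote wraps any argument containing shell-special characters.
    return " ".join(shlex.quote(arg) for arg in args)


print(build_train_command("config/base.jsonnet", "outputs", 0))
```

Changing cuda_device, or adding further keys to the overrides dictionary, keeps the quoting correct without editing the command by hand.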
- Sen, Cansu, et al. "Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words?" Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.