Codes and data for the ACL 2021-Findings paper: CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation
If you have any problem or suggestion, feel free to contact me: chujiezhengchn@gmail.com
If you use our codes or your research is related to our paper, please kindly cite our paper:
@inproceedings{zheng-etal-2021-comae,
title = "CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation",
author = "Zheng, Chujie and
Liu, Yong and
Chen, Wei and
Leng, Yongcai and
Huang, Minlie",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2021",
year = "2021",
}
You can download our propossed data here. However, the released data is a bit different from the used data in our paper.
- Data size. We found that a RoBERTa classifier may suffer from the unbalanced labels. Hence, for all the factors, we instead use BERT as classifiers. As a result, the filtered data based on CM have a larger size than that reported in our paper
- Taxonomies of DA and EM. We modified the adopted taxonomies of both DA and EM (please refer to the json files in this repo) because:
- For DA, we found that suggestion is not categorized as a dialog act of expressed empathy (see the paper of CM). To keep consistent with the CM paper, we merged suggestion with others
- For EM, we modified the taxonomies to reduce the overlaps between different emotions
- Nevertheless, we think you can also modify the taxonomies as needed, and then automatically annotate the utterances
Classifiers | # classes | Acc | F1-macro |
---|---|---|---|
CM-ER | 2 | 80.5 | 76.9 |
CM-IP | 2 | 84.7 | 84.7 |
CM-EX | 2 | 96.8 | 93.6 |
DA | 8 | 91.4 | 85.9 |
EM | 9 | 65.8 | 62.8 |
Train | Valid | Test-Happy | Test-Offmychest |
---|---|---|---|
154001 | 19940 | 13337 | 7827 |
Please enter codes
.