CoMAE

Code and data for the Findings of ACL 2021 paper "CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation".

If you have any problems or suggestions, feel free to contact me: chujiezhengchn@gmail.com

If you use our code or your research is related to our paper, please cite it:

@inproceedings{zheng-etal-2021-comae,
    title = "CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation",
    author = "Zheng, Chujie  and
      Liu, Yong  and
      Chen, Wei  and
      Leng, Yongcai  and
      Huang, Minlie",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2021",
    year = "2021",
}

Data

You can download our proposed data here. Note that the released data differ slightly from the data used in our paper:

  • Data size. We found that a RoBERTa classifier can suffer from unbalanced labels, so we use BERT classifiers for all the factors instead. As a result, the data filtered by communication mechanism (CM) are larger than reported in our paper.
  • Taxonomies of DA and EM. We modified the taxonomies of both dialog act (DA) and emotion (EM); please refer to the JSON files in this repo:
    • For DA, we found that suggestion is not categorized as a dialog act of expressed empathy (see the CM paper). To stay consistent with the CM paper, we merged suggestion into others.
    • For EM, we modified the taxonomy to reduce the overlap between different emotions.
    • You can also modify the taxonomies as needed and then automatically re-annotate the utterances.
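
The DA merge described above (folding suggestion into others) amounts to a label remapping. The following is a minimal sketch, not the code used to build the released data: the function name `merge_labels`, the `DA_MERGE` dictionary, and every example label other than `suggestion` and `others` are hypothetical.

```python
from typing import Dict, List

# Hypothetical mapping: only the suggestion -> others merge comes from this README.
DA_MERGE: Dict[str, str] = {"suggestion": "others"}


def merge_labels(labels: List[str], mapping: Dict[str, str]) -> List[str]:
    """Replace each label by its merged target; unmapped labels are kept unchanged."""
    return [mapping.get(label, label) for label in labels]


# Placeholder annotations; "questioning" and "agreeing" are illustrative names only.
annotated = ["questioning", "suggestion", "agreeing", "others"]
merged = merge_labels(annotated, DA_MERGE)
print(merged)
```

To try a different taxonomy, one would edit the mapping (or the JSON files in this repo) and re-run the classifiers to annotate the utterances again.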

Performance of BERT-based classifiers

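| Classifier | # Classes | Acc (%) | Macro-F1 (%) |
|------------|-----------|---------|--------------|
| CM-ER      | 2         | 80.5    | 76.9         |
| CM-IP      | 2         | 84.7    | 84.7         |
| CM-EX      | 2         | 96.8    | 93.6         |
| DA         | 8         | 91.4    | 85.9         |
| EM         | 9         | 65.8    | 62.8         |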
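
For readers unfamiliar with the two metrics, the sketch below computes accuracy and macro-averaged F1 on toy data. It is an illustration only and not the evaluation script behind the numbers above; the function names and the toy labels are assumptions. It shows why macro-F1 can fall below accuracy when labels are unbalanced, since every class contributes equally to the average.

```python
from typing import Sequence


def accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in gold or pred."""
    classes = sorted(set(gold) | set(pred))
    scores = []
    for c in classes:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, unbalanced binary data: 8 "neg" and 2 "pos" gold labels.
gold = ["neg"] * 8 + ["pos"] * 2
pred = ["neg"] * 8 + ["neg", "pos"]  # one "pos" example is misclassified

acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

On this toy data a single error on the minority class costs more in macro-F1 than in accuracy.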

Data Size

| Train  | Valid | Test-Happy | Test-Offmychest |
|--------|-------|------------|-----------------|
| 154001 | 19940 | 13337      | 7827            |
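
As a quick sanity check on the split sizes, the snippet below sums the four counts from the table and prints each split's share of the total. The dictionary is just a transcription of the table; the table does not say what unit is counted, so the numbers are treated as plain counts.

```python
# Split sizes copied from the table above.
sizes = {
    "Train": 154001,
    "Valid": 19940,
    "Test-Happy": 13337,
    "Test-Offmychest": 7827,
}

total = sum(sizes.values())
for name, count in sizes.items():
    # Share of each split relative to all released examples.
    print(f"{name}: {count} ({count / total:.1%})")
print("Total:", total)
```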

Model Implementation

Please see the codes directory.