JCommonsenseMorality is a dataset created through crowdsourcing that reflects the commonsense morality of Japanese annotators.
Dataset is available at data.
Code for fine-tuning is available at finetune.py.
You can read our paper (japanese) at here.
Examples
sentence | label |
---|---|
賽銭箱に石を投げ入れる。(throw a stone in to a offertory box) | 1 (wrong) |
賽銭箱にお賽銭を投げ入れる。(throw money in to a offertory box) | 0 (permissible) |
限定商品を買い占めて転売する。(buy up limited edition items and resell them) | 1 (wrong) |
限定商品を自分の分だけ買う。(buy limited edition items for myself) | 0 (permissible) |
All sentences are labeled either '1' or '0', indicating that the described action is clearly morally wrong or permissible, respectively.
Data Statistics
Train | Dev | Test | Total |
---|---|---|---|
13,975 | 1,996 | 3,992 | 19,963 |
All results are average scores of the model trained on five random seeds.
Model | acc | pre | rec | f1 |
---|---|---|---|---|
Tohoku BERT base | 0.7836 | 0.7740 | 0.7601 | 0.7664 |
Tohoku BERT large | 0.8033 | 0.8050 | 0.7691 | 0.7860 |
Waseda RoBERTa large | 0.8558 | 0.8453 | 0.8481 | 0.8461 |
This work is licensed under a MIT License. https://github.com/Language-Media-Lab/commonsense-moral-ja/blob/main/LICENSE
Japanese
@InProceedings{Takeshita_nlp2023,
author = "竹下昌志 and ジェプカラファウ and 荒木健治",
title = "JCommonsenseMorality: 常識道徳の理解度評価用日本語データセット",
booktitle = "言語処理学会第29回年次大会",
year = "2023",
pages = "357-362",
url = "https://www.anlp.jp/proceedings/annual_meeting/2023/pdf_dir/D2-1.pdf",
note= "in Japanese"
}
English (translated)
@InProceedings{Takeshita_nlp2023,
author = "Masashi Takeshita and Rafal Rzpeka and Kenji Araki",
title = "JCommonsenseMorality: Japanese Dataset for evaluating commonsense morality understanding",
booktitle = "In Proceedings of The Twenty Nineth Annual Meeting of The Association for Natural Language Processing (NLP2023)",
year = "2023",
pages = "357-362",
url = "https://www.anlp.jp/proceedings/annual_meeting/2023/pdf_dir/D2-1.pdf",
note= "in Japanese"
}
This work was supported by JSPS KAKENHI Grant Number JP22J21160.