Enhancing Offensive Language Detection with Data Augmentation and Knowledge Distillation
AugCOLD (Augmented Chinese Offensive Language Dataset) is a large-scale unsupervised dataset containing 1 million samples gathered through data crawling and model generation.
If you find the paper or the dataset helpful, please cite our paper:
@article{deng2023Augcold,
author = {Jiawen Deng and Zhuang Chen and Hao Sun and Zhexin Zhang and Jincenzi Wu and Satoshi Nakagawa and Fuji Ren and Minlie Huang},
title = {Enhancing Offensive Language Detection with Data Augmentation and Knowledge Distillation},
journal = {Research},
volume = {6},
number = {},
pages = {0189},
year = {2023},
doi = {10.34133/research.0189},
URL = {https://spj.science.org/doi/abs/10.34133/research.0189},
eprint = {https://spj.science.org/doi/pdf/10.34133/research.0189}
}