/ToxificationReversal

Code for the paper "Self-Detoxifying Language Models via Toxification Reversal" (EMNLP 2023)

Primary Language: Python
