yjw1029/Self-Reminder
Code for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" in NMI.
Python · GPL-3.0
Stargazers
- 2000ZRL (Imperial College London)
- 2003pro (Hong Kong University of Science and Technology)
- andyz245
- chrisyxue (UESTC)
- dahua966
- feizhihui (Tencent)
- fly51fly (PRIS)
- HeegyuKim (Seoul, Korea)
- Hustcw (VUL337-NISL@THU)
- hyoer0423 (ZJUI BA & UIUC MS)
- Isaac-theori
- Leezekun (Santa Barbara, CA)
- limacv (Netflix Eyeline Studios)
- mancevd
- meiling-fdu (Shanghai)
- MiZhenxing (HK, China)
- nurlanov-zh (University of Bonn)
- Opdoop (CASIA)
- pipilurj (Hongkong)
- prismformore (HKUST)
- qinliu9
- rookiehb (UIUC)
- shaojiawei07
- shizhediao (NVIDIA)
- tangminji
- tbozhong
- Tianjoker
- W-Ted
- weizeming (Peking University)
- xszheng2020
- xyq7 (HKUST)
- Yang-Yan-Yang-Yan
- yehui1234
- Yhzeve
- yjw1029 (USTC & MSRA)
- Zheng-Lu (@NullPointer-Network)