BeyonderXX/ShadowAlignment
Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
Python · Apache-2.0
Stargazers
- BeyonderXX (Fudan University)
- cmnfriend
- csHuangfdu
- huizhang-L
- IceSolitary (Fudan University)
- InvokerStark
- JeffCarpenter (Canada)
- jlwu002 (Washington University in St. Louis)
- LetheSec (University of Science and Technology of China)
- LuckySJTU (Shanghai Jiao Tong University)
- mexiQQ (Shadow LLM Guardians)
- Nevaeh7 (Wuhan University)
- ntudy (Nanyang Technological University)
- RorschachChen
- Spico197 (Soochow University)
- WalterSumbon
- wangruihui0429
- xiami2019 (Fudan University & Sun Yat-Sen University)
- Xianjun-Yang
- XiaoWangNLP
- zgjusst
- zhaohan-xi (Binghamton University)