Pinned Repositories
Alisa
ALiSa: Acrostic Linguistic Steganography Based on BERT and Gibbs Sampling
BadActs
BEAT
Revisiting-NLP-Backdoor
LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
I-GCG
Improved techniques for optimization-based jailbreaking on large language models (ICLR 2025)
AmpleGCG
AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM
learning_research
My research experience
BEEAR
This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models".
detection_logits
clearloveclearlove's Repositories
clearloveclearlove/Alisa
ALiSa: Acrostic Linguistic Steganography Based on BERT and Gibbs Sampling
clearloveclearlove/BadActs
clearloveclearlove/BEAT
clearloveclearlove/Revisiting-NLP-Backdoor