Pinned Repositories
redwyd
redwyd.github.io
TrustLLM
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
AutoPoison
The official repository for the paper "On the Exploitability of Instruction Tuning".
Poisoning-Instruction-Tuned-Models
llmprivacy
RAG-privacy
The code for the paper "The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)", which explores privacy risks in RAG.
Notable
Layer-Weight-Poison
Code for the EMNLP 2021 paper "Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning"