Pinned Repositories
attack-via-GCG
Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives
auto-dan
The official implementation of our ICLR 2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models".
instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
length-generalization
Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023
lost-in-the-middle
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
Math_LLM
Code for MIT 6.8610 project
probing
Program-of-Thoughts
Data and Code for Program of Thoughts (TMLR 2023)
rag-privacy
representation-engineering
Representation Engineering: A Top-Down Approach to AI Transparency
zhentingqi's Repositories
zhentingqi/ushape
zhentingqi/zhentingqi-past.github.io
Personal Webpage
zhentingqi/zhentingqi.github.io