Pinned Repositories
AutoDAN
The official implementation of our paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models".
Deeplearning-Lab4
LSTM, Text Classification, Chinese Word Embeddings, RNN Encoder-Decoder, Temperature Series Prediction
Gradient-Cuff
Code for our NeurIPS 2024 accepted paper: Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes
llm_attack
OpenMatch
An Open-Source Package for Information Retrieval.
promptbench
A robustness evaluation framework for large language models on adversarial prompts
Unified-Prompt-based-Framework-for-Multi-task-training
T5 Prompt Tuning
Gradient-Cuff
Repo for NeurIPS 2024 paper "Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes"
RADAR
Code for our NeurIPS 2023 accepted paper: RADAR: Robust AI-Text Detection via Adversarial Learning. We tested RADAR on 8 LLMs including Vicuna and LLaMA. The results show that RADAR can attain good detection performance on LLM-generated AI-text while being robust against paraphrasing.
P3Ranker
Code for our SIGIR 2022 accepted paper: P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning
GregxmHu's Repositories
GregxmHu/Unified-Prompt-based-Framework-for-Multi-task-training
T5 Prompt Tuning
GregxmHu/Gradient-Cuff
Code for our NeurIPS 2024 accepted paper: Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes
GregxmHu/Deeplearning-Lab4
LSTM, Text Classification, Chinese Word Embeddings, RNN Encoder-Decoder, Temperature Series Prediction
GregxmHu/llm_attack
GregxmHu/promptbench
A robustness evaluation framework for large language models on adversarial prompts
GregxmHu/AutoDAN
The official implementation of our paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models".
GregxmHu/ConvDR
Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"
GregxmHu/CSrankings-1
A web app for ranking computer science departments according to their research output in selective venues.
GregxmHu/Deeplearning-Lab3
VGG, ResNet, SE-Block.
GregxmHu/detect-gpt
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
GregxmHu/gpl
Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
GregxmHu/gpt-2-output-dataset
Dataset of GPT-2 outputs for research in detection, biases, and more
GregxmHu/GregxmHu.github.io
GregxmHu/IO-FP
Intelligent Optimization - Final Project
GregxmHu/JavaFamily
[Java Interview + Java Study Guide] A guide covering most of the core knowledge that Java programmers need to master.
GregxmHu/KB-QA
KB-QA, Knowledge Graph, Neo4j, Jena, Graph Database
GregxmHu/latent-jailbreak
GregxmHu/llm-sp
Papers and resources related to the security and privacy of LLMs 🤖
GregxmHu/loss-landscape
Code for visualizing the loss landscape of neural nets
GregxmHu/ml-visuals
🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.
GregxmHu/my_dr
Dense Retrieval pipeline grounding with sentence-transformers.
GregxmHu/P3Ranker
Code for our SIGIR 2022 accepted paper: P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning
GregxmHu/pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
GregxmHu/Quick-BM25
GregxmHu/sgpt
SGPT: GPT Sentence Embeddings for Semantic Search
GregxmHu/TAP
TAP: An automated jailbreaking method for black-box LLMs
GregxmHu/tevatron
Tevatron - A flexible toolkit for dense retrieval research and development.
GregxmHu/TextGAIL
GregxmHu/TransE
Implementation of TransE (NIPS 2013), and its simple usage
GregxmHu/VecJBDet