mo666666's Stars
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
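A minimal sketch of the zero-shot matching described above, assuming the `clip` package from this repo is installed and a local image file (the path and candidate captions below are placeholders):

```python
import torch
import clip
from PIL import Image

# Load the pretrained ViT-B/32 CLIP model and its image preprocessing pipeline.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# "image.png" and the candidate captions are illustrative placeholders.
image = preprocess(Image.open("image.png")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    # Similarity logits between the image and each text snippet.
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print("Label probabilities:", probs)  # highest probability = most relevant snippet
```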
Vision-CAIR/MiniGPT-4
Open-sourced code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
ytongbai/LVM
MadryLab/photoguard
Raising the Cost of Malicious AI-Powered Image Editing
psyker-team/mist
Watermark your artworks to protect them from unauthorized diffusion style mimicry!
JailbreakBench/jailbreakbench
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
LLM-Tuning-Safety/LLMs-Finetuning-Safety
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
chs20/RobustVLM
[ICML 2024] Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
arobey1/smooth-llm
chujiezheng/LLM-Safeguard
Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models"
huanranchen/DiffusionClassifier
Official code implementation of Robust Classification via a Single Diffusion Model
YuxinWenRick/diffusion_memorization
Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024)
ys-zong/VLGuard
[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.
sen-mao/SuppressEOT
Official implementation of "Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models" (ICLR 2024)
TreeLLi/APT
One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
erfanshayegani/Jailbreak-In-Pieces
[ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models
weizeming/SAM_AT
SchwinnL/LLM_Embedding_Attack
Code to conduct an embedding attack on LLMs
AISG-Technology-Team/GCSS-Track-1A-Submission-Guide
Submission Guide + Discussion Board for AI Singapore Global Challenge for Safe and Secure LLMs (Track 1A).
Jayfeather1024/Backdoor-Enhanced-Alignment
Robin-WZQ/T2IShield
[ECCV24] T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models
PKU-ML/Diffusion-PID-Protection
renjie3/MemAttn
PKU-ML/TMLlib
A Trustworthy Machine Learning Algorithm Library
Huang-yihao/Personalization-based_backdoor
PKU-ML/TERD
TERD: A Framework for Backdoor Detection on Diffusion Models
PKU-ML/ReBAT
Official PyTorch implementation of ReBAT (ReBalanced Adversarial Training) from the NeurIPS 2023 paper "Balance, Imbalance, and Rebalance: Understanding Robust Overfitting from a Minimax Game Perspective".
mo666666/TERD
TERD: A Framework for Backdoor Detection on Diffusion Models
SPIN-UMass/MeanSparse
tenghuilee/ContrastDiffPurification