Pinned Repositories
Certified-Robustness-SoK-Oldver
This repo tracks popular provable training and verification approaches toward robust neural networks, including leaderboards on popular datasets and a categorization of papers.
CRFL
CRFL: Certifiably Robust Federated Learning against Backdoor Attacks (ICML 2021)
DBA
DBA: Distributed Backdoor Attacks against Federated Learning (ICLR 2020)
DecodingTrust
A Comprehensive Assessment of Trustworthiness in GPT Models
FLBenchmark-toolkit
Federated Learning Framework Benchmark (UniFed)
InfoBERT
[ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu
Meta-Nerual-Trojan-Detection
multi-task-learning
Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation" by Haoxiang Wang, Han Zhao, Bo Li.
SemanticAdv
VeriGauge
A unified toolbox for running major robustness verification approaches for DNNs. [S&P 2023]
AI Secure's Repositories
AI-secure/DecodingTrust
A Comprehensive Assessment of Trustworthiness in GPT Models
AI-secure/Certified-Robustness-SoK-Oldver
This repo tracks popular provable training and verification approaches toward robust neural networks, including leaderboards on popular datasets and a categorization of papers.
AI-secure/VeriGauge
A unified toolbox for running major robustness verification approaches for DNNs. [S&P 2023]
AI-secure/InfoBERT
[ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu
AI-secure/FLBenchmark-toolkit
Federated Learning Framework Benchmark (UniFed)
AI-secure/Robustness-Against-Backdoor-Attacks
RAB: Provable Robustness Against Backdoor Attacks
AI-secure/aug-pe
[ICML 2024] Differentially Private Synthetic Data via Foundation Model APIs 2: Text
AI-secure/Transferability-Reduced-Smooth-Ensemble
AI-secure/semantic-randomized-smoothing
[CCS 2021] TSS: Transformation-specific smoothing for robustness certification
AI-secure/SemAttack
[NAACL 2022] "SemAttack: Natural Textual Attacks via Different Semantic Spaces" by Boxin Wang, Chejian Xu, Xiangyu Liu, Yu Cheng, Bo Li
AI-secure/adversarial-glue
[NeurIPS 2021] "Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models" by Boxin Wang*, Chejian Xu*, Shuohang Wang, Zhe Gan, Yu Cheng, Jianfeng Gao, Ahmed Hassan Awadallah, Bo Li.
AI-secure/CoPur
CoPur: Certifiably Robust Collaborative Inference via Feature Purification (NeurIPS 2022)
AI-secure/CROP
[ICLR 2022] CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing
AI-secure/COPA
[ICLR 2022] COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks
AI-secure/MMDT
Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models
AI-secure/AdvWeb
AI-secure/DPFL-Robustness
[CCS 2023] Unraveling the Connections between Privacy and Certified Robustness in Federated Learning Against Poisoning Attacks
AI-secure/TextGuard
TextGuard: Provable Defense against Backdoor Attacks on Text Classification
AI-secure/Certified-Fairness
[NeurIPS 2022] Code for Certifying Some Distributional Fairness with Subpopulation Decomposition
AI-secure/FedGame
Official implementation of the paper "FedGame: A Game-Theoretic Defense against Backdoor Attacks in Federated Learning" (NeurIPS 2023).
AI-secure/Layerwise-Orthogonal-Training
AI-secure/SecretGen
A general model inversion attack against large pre-trained models.
AI-secure/COPA_Atari
AI-secure/DMLW2022
AI-secure/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).
AI-secure/transferability-versus-robustness
AI-secure/COPA_Highway
AI-secure/DecodingTrust-Data-Legacy
AI-secure/hf-blog
Public repo for HF blog posts
AI-secure/VFL-ADMM
Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM (SaTML 2024)