Pinned Repositories
Adaptive-Finetuning-Attacks
alignment-attribution-code
Official Code for Paper: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
boyiwei.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
COS598D-Pruning
Assignments for COS598D: Systems and Machine Learning
cos598d_sp24
CoTaEval
Official code for the paper: Evaluating Copyright Takedown Methods for Language Models
ReG-NAS
RepNoise-Reproduce
tamper-resistance
Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
TAR-Reproduce
boyiwei's Repositories
boyiwei/alignment-attribution-code
Official Code for Paper: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
boyiwei/CoTaEval
Official code for the paper: Evaluating Copyright Takedown Methods for Language Models
boyiwei/Adaptive-Finetuning-Attacks
boyiwei/ReG-NAS
boyiwei/RepNoise-Reproduce
boyiwei/TAR-Reproduce
boyiwei/boyiwei.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
boyiwei/COS598D-Pruning
Assignments for COS598D: Systems and Machine Learning
boyiwei/cos598d_sp24
boyiwei/tamper-resistance
Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"