nli0

ML evaluations and safety @scaleapi @centerforaisafety, CS @ucberkeley. nli0.github.io

@ucberkeley
San Francisco, CA

Pinned Repositories

machiavelli
Language: Python 125 4 1223
Intro_to_ML_Safety
65 4 019
wmdp
WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method that reduces LLM performance on WMDP while retaining general capabilities.
Language: Jupyter Notebook 89 1 1226
AgentSocieties
Language: Python 00
ethics
Aligning AI With Shared Human Values (ICLR 2021)
Language: Python 263 9 744

nli0's Repositories

nli0/coup_environment
A pettingzoo environment for the card game "Coup".
Language: Python 00
nli0/ethics
Aligning AI With Shared Human Values (ICLR 2021)
Language: Python 0 0 00
nli0/Intro_to_ML_Safety
00
nli0/partisan-gerrymanders
One Way to Spot More Partisan Gerrymanders
Language: Jupyter Notebook 0 1 00