nli0
ML evaluations and safety @scaleapi @centerforaisafety, CS @ucberkeley. nli0.github.io
@ucberkeley · San Francisco, CA
nli0's Stars
jdholtz/auto-southwest-check-in
A Python script that automatically checks in to your Southwest flight 24 hours beforehand.
scaleapi/browser-art
huggingface/evaluation-guidebook
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
baceolus/BioLP-bench
A benchmark for evaluating AI models' ability to understand biological lab protocols
magic-wormhole/magic-wormhole
Get things from one computer to another, safely
rohan-paul/LLM-FineTuning-Large-Language-Models
LLM (Large Language Model) Fine-Tuning
UKGovernmentBEIS/inspect_ai
Inspect: A framework for large language model evaluations
GraySwanAI/circuit-breakers
Improving Alignment and Robustness with Circuit Breakers
andyzoujm/representation-engineering
Representation Engineering: A Top-Down Approach to AI Transparency
centerforaisafety/HarmBench
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
centerforaisafety/wmdp
WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method that reduces LLM performance on WMDP while retaining general capabilities.
fferflo/einx
Universal Tensor Operations in Einstein-Inspired Notation for Python.
aypan17/machiavelli
centerforaisafety/Intro_to_ML_Safety