danielmamay's Stars
tml-epfl/llm-adaptive-attacks
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024]
rmcelreath/stat_rethinking_2024
openai/sparse_autoencoder
callummcdougall/sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
jbloomAus/SAELens
Training Sparse Autoencoders on Language Models
microsoft/Samba
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
ridgerchu/matmulfreellm
Implementation for MatMul-free LM.
google-research/arco-era5
Recipes for reproducing Analysis-Ready & Cloud Optimized (ARCO) ERA5 datasets.
google-research/weatherbench2
A benchmark for the next generation of data-driven global weather models.
iterative/dvc
🦉 Data Versioning and ML Experiments
avehtari/ROS-Examples
Regression and Other Stories R examples
eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
tatsu-lab/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
ndif-team/nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
thestephencasper/latent_adversarial_training
openai/simple-evals
UKGovernmentBEIS/inspect_ai
Inspect: A framework for large language model evaluations
andyzoujm/representation-engineering
Representation Engineering: A Top-Down Approach to AI Transparency
meta-llama/llama3
The official Meta Llama 3 GitHub site
centerforaisafety/HarmBench
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
METR/task-standard
METR Task Standard
google-deepmind/graphcast
carlini/yet-another-applied-llm-benchmark
A benchmark to evaluate language models on questions I've previously asked them to solve.
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
dalibo/pev2
Postgres Explain Visualizer 2
jepsen-io/jepsen
A framework for distributed systems verification, with fault injection
nerfstudio-project/nerfstudio
A collaboration-friendly studio for NeRFs
pyro-ppl/pyro
Deep universal probabilistic programming with Python and PyTorch
openai/weak-to-strong