danielmamay

London

danielmamay's Stars

tml-epfl/llm-adaptive-attacks
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024]
Language: Shell | Stars: 228 | Forks: 22
rmcelreath/stat_rethinking_2024
Language: R | Stars: 1.3k | Forks: 133
openai/sparse_autoencoder
Language: Python | Stars: 359 | Forks: 38
callummcdougall/sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
Language: HTML | Stars: 168 | Forks: 34
jbloomAus/SAELens
Training Sparse Autoencoders on Language Models
Language: Jupyter Notebook | Stars: 507 | Forks: 129
microsoft/Samba
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
Language: Python | Stars: 809 | Forks: 49
ridgerchu/matmulfreellm
Implementation for MatMul-free LM.
Language: Python | Stars: 2.9k | Forks: 184
google-research/arco-era5
Recipes for reproducing Analysis-Ready & Cloud Optimized (ARCO) ERA5 datasets.
Language: Python | Stars: 332 | Forks: 23
google-research/weatherbench2
A benchmark for the next generation of data-driven global weather models.
Language: Python | Stars: 456 | Forks: 48
iterative/dvc
🦉 Data Versioning and ML Experiments
Language: Python | Stars: 14k | Forks: 1.2k
avehtari/ROS-Examples
Regression and other stories R examples
Language: HTML | Stars: 325 | Forks: 256
eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
Language: Python | Stars: 2.2k | Forks: 185
tatsu-lab/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Language: Jupyter Notebook | Stars: 1.6k | Forks: 245
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
Language: Jupyter Notebook | Stars: 10.1k | Forks: 824
ndif-team/nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
Language: Jupyter Notebook | Stars: 417 | Forks: 40
thestephencasper/latent_adversarial_training
Language: Python | Stars: 19 | Forks: 5
openai/simple-evals
Language: Python | Stars: 2k | Forks: 173
UKGovernmentBEIS/inspect_ai
Inspect: A framework for large language model evaluations
Language: Python | Stars: 647 | Forks: 128
andyzoujm/representation-engineering
Representation Engineering: A Top-Down Approach to AI Transparency
Language: Jupyter Notebook | Stars: 736 | Forks: 88
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python | Stars: 27.4k | Forks: 3.1k
centerforaisafety/HarmBench
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Language: Jupyter Notebook | Stars: 353 | Forks: 59
METR/task-standard
METR Task Standard
Language: TypeScript | Stars: 127 | Forks: 31
google-deepmind/graphcast
Language: Python | Stars: 4.8k | Forks: 596
carlini/yet-another-applied-llm-benchmark
A benchmark to evaluate language models on questions I've previously asked them to solve.
Language: Python | Stars: 918 | Forks: 66
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Language: Python | Stars: 9.2k | Forks: 870
dalibo/pev2
Postgres Explain Visualizer 2
Language: TypeScript | Stars: 2.7k | Forks: 131
jepsen-io/jepsen
A framework for distributed systems verification, with fault injection
Language: Clojure | Stars: 6.9k | Forks: 719
nerfstudio-project/nerfstudio
A collaboration friendly studio for NeRFs
Language: Python | Stars: 9.6k | Forks: 1.3k
pyro-ppl/pyro
Deep universal probabilistic programming with Python and PyTorch
Language: Python | Stars: 8.6k | Forks: 988
openai/weak-to-strong
Language: Python | Stars: 2.5k | Forks: 308
