danbraunai-apollo's Stars
noanabeshima/tinymodel
A TinyStories LM with SAEs and transcoders
timothee-chauvin/eyeballvul
Future-proof vulnerability detection benchmark, based on CVEs in open-source repos
hijohnnylin/automated-interpretability
jbloomAus/SAELens
Training Sparse Autoencoders on Language Models
ai-safety-foundation/sparse_autoencoder
Sparse Autoencoder for Mechanistic Interpretability