Pinned Repositories
automated-interpretability
axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
ForumMagnum
The development repository for LessWrong2 and the EA Forum, based on Vulcan JS
mats_sae_training
Training Sparse Autoencoders on Language Models
neuronpedia-docs
neuronpedia-python
Python library for the Neuronpedia API
neuronpedia-scorer
sae-auto-interp
sae_vis
sparse_autoencoder
Clone of the OpenAI Sparse Autoencoder, specifically to remove version requirements
hijohnnylin's Repositories
hijohnnylin/neuronpedia-scorer
hijohnnylin/automated-interpretability
hijohnnylin/mats_sae_training
Training Sparse Autoencoders on Language Models
hijohnnylin/neuronpedia-python
Python library for the Neuronpedia API
hijohnnylin/sae_vis
hijohnnylin/axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
hijohnnylin/ForumMagnum
The development repository for LessWrong2 and the EA Forum, based on Vulcan JS
hijohnnylin/neuronpedia-docs
hijohnnylin/sae-auto-interp
hijohnnylin/sparse_autoencoder
Clone of the OpenAI Sparse Autoencoder, specifically to remove version requirements
hijohnnylin/transcoder_circuits
hijohnnylin/TransformerLens
A library for mechanistic interpretability of GPT-style language models