A resource repository of sparse autoencoders for large language models

Apache License 2.0

Awesome Sparse Autoencoders

This repository tracks the latest research on sparse autoencoders, specifically as used for mechanistic interpretability. The goal is to offer a comprehensive list of papers and resources relevant to the topic.

Note

If your paper, blog post, or other resource on sparse autoencoders is missing, or if you find a mistake, typo, or outdated information, please open an issue or submit a pull request. I will be happy to update the list.

Papers

Blog Posts