/sparse_autoencoder

Sparse Autoencoder for Mechanistic Interpretability

Primary language: Python
License: MIT
