/elk

Keeping language models honest by directly eliciting knowledge encoded in their activations.

Primary Language: Python
License: MIT
