/honest_llama

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

Primary LanguagePythonMIT LicenseMIT

Stargazers

No one’s star this repository yet.