/CAA_hallucination

Public repository for the code and results of parts of "Steering Llama 2 via Contrastive Activation Addition" by Rimsky, Gabrieli, Schulz et al.

Primary Language: Python
