EleutherAI/delphi

[Experiments] - Correlate activation distributions and explanation scores.

Closed this issue · 0 comments

  • Check (as a vibe-check) whether features with low explanation scores have different activation distributions than features with high explanation scores
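
One way to run this vibe-check is sketched below. It is a rough illustration, not delphi's API: the inputs (`activations`, a mapping from feature name to a 1-D array of activations over a token sample, and `scores`, a mapping from feature name to an explanation score), the function names, and the choice of summary statistics are all assumptions made for the example. The idea is to reduce each feature's activation distribution to a few summary numbers and rank-correlate each of them with the explanation scores.

```python
import numpy as np


def activation_stats(acts: np.ndarray) -> dict[str, float]:
    """Summarize one feature's activations (hypothetical choice of statistics)."""
    active = acts[acts > 0]
    if active.size == 0:
        return {"density": 0.0, "mean_active": 0.0, "max": 0.0}
    return {
        "density": active.size / acts.size,   # fraction of tokens where the feature fires
        "mean_active": float(active.mean()),  # mean activation when it fires
        "max": float(active.max()),           # largest activation observed
    }


def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman rank correlation; ties are ranked arbitrarily, which is
    acceptable for a rough check but not for a careful analysis."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rx, ry)[0, 1])


def correlate(activations: dict[str, np.ndarray],
              scores: dict[str, float]) -> dict[str, float]:
    """Rank-correlate each activation statistic with the explanation score."""
    names = sorted(set(activations) & set(scores))
    stats = [activation_stats(activations[n]) for n in names]
    score_vec = np.array([scores[n] for n in names])
    return {
        key: spearman(np.array([s[key] for s in stats]), score_vec)
        for key in stats[0]
    }
```

A correlation near zero for every statistic would suggest the two groups of features do not differ on these summaries. A strong correlation would be a reason to look at the full distributions (for example, histograms of the low-score and high-score groups) rather than a conclusion in itself.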