IAAR-Shanghai/Awesome-Attention-Heads
An awesome repository & A comprehensive survey on interpretability of LLM attention heads.
TeX
Issues
Add new paper: Understanding Knowledge Hijack Mechanism in In-context Learning through Associative Memory
#37 opened by wyzh0912 - 0
Add new paper: Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
#31 opened by wyzh0912 - 0
Add new paper: Mechanism and Emergence of Stacked Attention Heads in Multi-Layer Transformers
#30 opened by wyzh0912 - 0
Add new paper: Algorithmic Phase Transitions in Language Models: A Mechanistic Case Study of Arithmetic
#35 opened by wyzh0912 - 0
Add new paper: Adaptive Circuit Behavior and Generalization in Mechanistic Interpretability
#34 opened by fan2goa1 - 0
Add new paper: Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models
#28 opened by wyzh0912 - 0
Add new paper: Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics
#23 opened by wyzh0912 - 0
Add new paper: Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers
#25 opened by wyzh0912 - 0
Add new paper: Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
#21 opened by wyzh0912 - 0
Add new paper: A Psycholinguistic Evaluation of Language Models’ Sensitivity to Argument Roles
#22 opened by wyzh0912 - 0
Add new paper: Round and Round We Go! What makes Rotary Positional Encodings useful?
#16 opened by wyzh0912 - 0
Add new paper: The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling
#17 opened by wyzh0912 - 0
Add new paper: DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
#18 opened by wyzh0912 - 0
Add new paper: How to think step-by-step: A mechanistic understanding of chain-of-thought reasoning
#15 opened by wyzh0912 - 0
Add new paper: Revisiting In-context Learning Inference Circuit in Large Language Models
#14 opened by wyzh0912 - 0
Add new paper: Interpreting and Improving Large Language Models in Arithmetic Calculation
#13 opened by wyzh0912 - 0
Add new paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
#12 opened by fan2goa1 - 1
Might be a better form of presentation.
#10 opened by Ki-Seki - 2
New Heads To Add
#2 opened by Ki-Seki