IAAR-Shanghai/Awesome-Attention-Heads

Add new paper: Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models


Title: Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models
Head: Detection Head
Published: arXiv
Summary:

  • Innovation: Demonstrated the existence of shared circuits for similar sequence continuation tasks.
  • Tasks: Analyzed and compared circuits across similar sequence continuation tasks, including increasing sequences of Arabic numerals, number words, and months (see the sketch below).
  • Significant Result: Semantically related sequences rely on shared circuit subgraphs with analogous roles, and similar sub-circuits with analogous functionality were found across different models.
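
To make the task families concrete, here is a minimal Python sketch of what the three sequence-continuation prompt types might look like as next-token prediction inputs. The exact prompt formats, sequence lengths, and starting offsets are illustrative assumptions, not the paper's actual setup.

```python
# Illustrative sketch only: example next-token prompts for the three
# sequence-continuation task families (Arabic numerals, number words, months).
# Prompt formats are assumed for illustration and may differ from the paper.

DIGITS = [str(i) for i in range(1, 13)]
NUMBER_WORDS = ["one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten", "eleven", "twelve"]
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

def make_prompt(sequence, start, length):
    """Return (prompt, expected_continuation) for an increasing subsequence."""
    items = sequence[start:start + length]
    return " ".join(items), sequence[start + length]

for seq in (DIGITS, NUMBER_WORDS, MONTHS):
    prompt, target = make_prompt(seq, start=0, length=4)
    print(f"{prompt!r} -> {target!r}")
    # e.g. '1 2 3 4' -> '5', 'one two three four' -> 'five',
    #      'January February March April' -> 'May'
```

Prompts of this shape are the kind of input on which the paper compares circuit subgraphs, looking for components that play analogous roles across the semantically related sequence types.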