Add new paper: Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models
Closed this issue · 0 comments
wyzh0912 commented
Title: Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models
Published: arXiv
Summary:
- Innovation: Demonstrated the existence of shared circuits for similar sequence continuation tasks.
- Tasks: Analyzed and compared circuits for similar sequence continuation tasks, including increasing sequences of Arabic numerals, number words, and months.
- Significant Result: Semantically related sequence tasks rely on shared circuit subgraphs whose components play analogous roles; similar sub-circuits with analogous functionality were also found across different models.
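To make the task setup concrete, here is a minimal sketch (not code from the paper; the helper name and prompt format are illustrative assumptions) of how continuation prompts for the three sequence types above could be generated:

```python
# Illustrative prompt construction for the three sequence continuation
# tasks discussed above: increasing sequences of Arabic numerals,
# number words, and months. (Not the paper's actual code.)

NUMBER_WORDS = ["one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten"]
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

def make_prompt(task: str, start: int, length: int) -> tuple[str, str]:
    """Return (prompt, expected_continuation) for one sequence task."""
    if task == "digits":
        items = [str(start + i) for i in range(length + 1)]
    elif task == "number_words":
        items = [NUMBER_WORDS[start + i] for i in range(length + 1)]
    elif task == "months":
        items = [MONTHS[start + i] for i in range(length + 1)]
    else:
        raise ValueError(f"unknown task: {task}")
    # The model sees the first `length` items and must continue the sequence.
    return " ".join(items[:-1]), items[-1]

if __name__ == "__main__":
    for task in ("digits", "number_words", "months"):
        prompt, answer = make_prompt(task, start=1, length=4)
        print(f"{task}: '{prompt}' -> '{answer}'")
```

Because the three tasks differ only in surface form while sharing the same "increment by one" structure, comparing the circuits a model uses on each is what lets the paper identify shared sub-circuits.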