IAAR-Shanghai/Awesome-Attention-Heads

Add new paper: Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models


Title: Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models
Head: Detection Head
Published: arXiv
Summary:

  • Innovation: Demonstrated the existence of shared circuits for similar sequence continuation tasks.
  • Tasks: Analyzed and compared circuits across similar sequence continuation tasks, including increasing sequences of Arabic numerals, number words, and months (see the sketch below).
  • Significant Result: Semantically related sequences rely on shared circuit subgraphs with analogous roles, and similar sub-circuits with analogous functionality were found across different models.
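
To make the task families concrete, here is a minimal Python sketch of what the three sequence-continuation prompt types might look like as next-token prediction inputs. The exact prompt formats, sequence lengths, and starting offsets are illustrative assumptions, not the paper's actual setup.

```python
# Illustrative sketch only: example next-token prompts for the three
# sequence-continuation task families (Arabic numerals, number words, months).
# Prompt formats are assumed for illustration and may differ from the paper.

DIGITS = [str(i) for i in range(1, 13)]
NUMBER_WORDS = ["one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten", "eleven", "twelve"]
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

def make_prompt(sequence, start, length):
    """Return (prompt, expected_continuation) for an increasing subsequence."""
    items = sequence[start:start + length]
    return " ".join(items), sequence[start + length]

for seq in (DIGITS, NUMBER_WORDS, MONTHS):
    prompt, target = make_prompt(seq, start=0, length=4)
    print(f"{prompt!r} -> {target!r}")
    # e.g. '1 2 3 4' -> '5', 'one two three four' -> 'five',
    #      'January February March April' -> 'May'
```

Prompts of this shape are the kind of input on which the paper compares circuit subgraphs, looking for components that play analogous roles across the semantically related sequence types.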