Add new paper: In-Context Language Learning: Architectures and Algorithms
wyzh0912 commented
Title: In-Context Language Learning: Architectures and Algorithms
Head: n-gram head (induction head)
Published: ICML 2024
Summary:
- Innovation: Provided evidence that Transformers' ability to solve ICLL tasks relies on specialized “n-gram heads” that compute input-conditional next-token distributions.
- Tasks: Analyzed the behavior of Transformers trained on ICLL tasks using three complementary strategies: attention visualization, probing of hidden representations, and black-box input–output analysis.
- Result: Demonstrated that inserting simple induction heads for n-gram modeling into neural architectures significantly improves both their ICLL performance and their language modeling performance on natural data (a sketch of the underlying statistic is given below).
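
For intuition, here is a minimal sketch, not from the paper (the function name and toy sequence are illustrative), of the statistic such an n-gram head approximates: given the current (n-1)-token context, find its earlier occurrences in the prompt and return the empirical distribution over the tokens that followed.

```python
from collections import Counter

def in_context_ngram_distribution(tokens, n=2):
    """Empirical next-token distribution conditioned on the current
    (n-1)-token context, estimated from earlier occurrences of that
    context in the same sequence. This is the input-conditional
    statistic an n-gram head is argued to approximate; the 1-token
    context (n=2) roughly corresponds to a classic induction head.
    """
    context = tuple(tokens[-(n - 1):]) if n > 1 else ()
    counts = Counter()
    # Scan the prefix for earlier matches of the context and tally
    # the token that immediately followed each match.
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n - 1]) == context:
            counts[tokens[i + n - 1]] += 1
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()} if total else {}

# Example: a bigram head on a toy sequence; the current context is ('b',),
# which was always followed by 'c' earlier in the sequence.
seq = list("abcabcab")
print(in_context_ngram_distribution(seq, n=2))  # {'c': 1.0}
```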
fan2goa1 commented
Already included