The conversion script doesn’t work
StellaAthena opened this issue · 2 comments
Describe the bug
Converting a GPT-Neo checkpoint and loading the result into the HuggingFace transformers library produces a model whose generations lose coherence after roughly 500 tokens, apparently because the converted model does not reproduce GPT-Neo's local attention.
To Reproduce
Steps to reproduce the behavior:
- Run the conversion script
- Load the results into the HuggingFace transformers library
- Feed it a context of 450 tokens and then have it generate another 200
- Observe that around the 500th token the coherence falls off a cliff (a minimal reproduction sketch follows below)
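A minimal reproduction sketch, assuming the converted weights sit in a local directory (`converted-gpt-neo` is a hypothetical path, `long_passage.txt` an arbitrary text file) and are loaded through the standard GPT-2 classes:

```python
# Hypothetical reproduction sketch; "converted-gpt-neo" stands in for whatever
# local directory the conversion script wrote its output to.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("converted-gpt-neo")
tokenizer = GPT2Tokenizer.from_pretrained("converted-gpt-neo")

# Any passage of roughly 450 tokens will do as the context.
context = open("long_passage.txt").read()
input_ids = tokenizer(context, return_tensors="pt").input_ids[:, :450]

# Generate 200 more tokens on top of the 450-token context.
output = model.generate(
    input_ids,
    max_length=input_ids.shape[1] + 200,
    do_sample=True,
)
print(tokenizer.decode(output[0]))
# Symptom: the continuation stays coherent for a while, then degrades
# sharply around the 500th token of the combined sequence.
```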
Expected behavior
Performance should not fall off a cliff around the 500th token.
Proposed solution
It appears that the problem is the lack of compatibility between the local attention function used in GPT-Neo and the transformers
library. While the transformers
library does include models with local attention (longformer, for example) it’s not consistent with how the GPT-2 model is defined in the transformers
library.
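To make the mismatch concrete, here is a rough sketch (not code from either library) contrasting the dense causal mask that the GPT-2 implementation computes with a banded local-attention mask of the kind GPT-Neo's local layers use; the window size of 256 is an assumed value for illustration:

```python
import torch

def dense_causal_mask(seq_len: int) -> torch.Tensor:
    # GPT-2 style: every position attends to all earlier positions.
    return torch.tril(torch.ones(seq_len, seq_len)).bool()

def local_causal_mask(seq_len: int, window: int = 256) -> torch.Tensor:
    # GPT-Neo local-attention style: each position attends only to the
    # previous `window` positions (window=256 is an assumption here).
    i = torch.arange(seq_len).unsqueeze(1)
    j = torch.arange(seq_len).unsqueeze(0)
    return (j <= i) & (j > i - window)

# The two masks agree while the sequence fits inside the window ...
print(torch.equal(dense_causal_mask(128), local_causal_mask(128)))  # True
# ... and diverge once it does not, roughly where the converted model's
# generations start to fall apart.
print(torch.equal(dense_causal_mask(512), local_causal_mask(512)))  # False
```

If that is indeed the cause, copying local-attention weights into the existing GPT-2 class cannot recover the original model's behavior; transformers needs a model class that implements local attention the same way GPT-Neo does.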
Screenshots
n/a
Environment (please complete the following information):
- GPUs: v3-8s, 1080 Tis, A100s
- Configs: any config that has local attention
Additional context
n/a
The amazing @patil-suraj and @LysandreJik have a preliminary PR for a HF implementation!
It's live on HF!