Issue loading gpt-neo-125M from checkpoint
ixn872 opened this issue · 0 comments
Describe the bug
The same error occurs both on Google Colab with a GPU and locally on CPU: the pre-trained model cannot be loaded from the repo
https://huggingface.co/EleutherAI/gpt-neo-125M
To Reproduce
Steps to reproduce the behaviour:
Clone locally: git clone https://huggingface.co/EleutherAI/gpt-neo-125M
Run the Python script:
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers import AutoConfig
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M", from_tf=True)
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M", from_tf=True)
Expected behavior
Load the pretrained model that was cloned with !git clone https://huggingface.co/EleutherAI/gpt-neo-125M
Proposed solution
The model weights are stored in the PyTorch .bin format. Perhaps also store them as a TensorFlow .h5 checkpoint?
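For context on why from_tf=True fails here: from_pretrained selects a weights file based on that flag, looking for tf_model.h5 when it is set and pytorch_model.bin otherwise, and the cloned repo contains only pytorch_model.bin. A rough stdlib-only sketch of that selection logic (find_weights is a hypothetical helper for illustration, not part of transformers):

```python
import os

def find_weights(model_dir, from_tf=False):
    """Loosely mimic how from_pretrained picks a weights file:
    from_tf=True looks for tf_model.h5, the default for pytorch_model.bin."""
    wanted = "tf_model.h5" if from_tf else "pytorch_model.bin"
    path = os.path.join(model_dir, wanted)
    if not os.path.exists(path):
        # mirrors the "no TF checkpoint found" style of failure reported here
        raise OSError(f"no {wanted} found in {model_dir}")
    return path
```

Run against a clone of gpt-neo-125M, find_weights(clone_dir) succeeds while find_weights(clone_dir, from_tf=True) raises, mirroring the reported failure.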
Environment:
- Configs: Default from GitHub repo
- transformers version: 4.10.0.dev0
- Platform: Linux-4.4.0-19041-Microsoft-x86_64-with-glibc2.29
- Python version: 3.8.5
- PyTorch version (GPU?): 1.9.0+cpu (False)
- Tensorflow version (GPU?): 2.5.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Additional context
In case you get the error
AttributeError: module transformers has no attribute TFGPTNeoForCausalLM
Remove the "TF" prefix from the line and file that the traceback points to; the correct class name is GPTNeoForCausalLM.
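That AttributeError arises because transformers (as of 4.10) ships only a PyTorch implementation of GPT-Neo, so from_tf=True ends up resolving a nonexistent TF class. A quick check, assuming transformers is installed:

```python
import transformers

# GPT-Neo exists only as a PyTorch class in transformers 4.x;
# there is no TF port, hence the missing TFGPTNeoForCausalLM attribute.
print(hasattr(transformers, "GPTNeoForCausalLM"))
print(hasattr(transformers, "TFGPTNeoForCausalLM"))
```

Dropping from_tf=True and loading the PyTorch checkpoint directly should therefore avoid the error entirely.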