Issue loading gpt-neo-125M from checkpoint
ixn872 opened this issue · 0 comments
Describe the bug
The same error occurs both on Google Colab with a GPU and locally on CPU: the pre-trained model cannot be loaded from the repo
https://huggingface.co/EleutherAI/gpt-neo-125M
To Reproduce
Steps to reproduce the behaviour:
Clone locally: git clone https://huggingface.co/EleutherAI/gpt-neo-125M
Run the Python script:
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers import AutoConfig
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M", from_tf=True)
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M", from_tf=True)
Expected behavior
Load the pretrained model that was cloned with !git clone https://huggingface.co/EleutherAI/gpt-neo-125M
Proposed solution
The model weights are stored in the PyTorch .bin format. Perhaps also store them as a TensorFlow .h5 checkpoint?
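For context on why from_tf=True fails here: from_pretrained selects a weights file based on that flag, looking for tf_model.h5 when it is set and pytorch_model.bin otherwise, and the cloned repo contains only pytorch_model.bin. A rough stdlib-only sketch of that selection logic (find_weights is a hypothetical helper for illustration, not part of transformers):

```python
import os

def find_weights(model_dir, from_tf=False):
    """Loosely mimic how from_pretrained picks a weights file:
    from_tf=True looks for tf_model.h5, the default for pytorch_model.bin."""
    wanted = "tf_model.h5" if from_tf else "pytorch_model.bin"
    path = os.path.join(model_dir, wanted)
    if not os.path.exists(path):
        # mirrors the "no TF checkpoint found" style of failure reported here
        raise OSError(f"no {wanted} found in {model_dir}")
    return path
```

Run against a clone of gpt-neo-125M, find_weights(clone_dir) succeeds while find_weights(clone_dir, from_tf=True) raises, mirroring the reported failure.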
Environment:
- Configs: Default from GitHub repo
- transformers version: 4.10.0.dev0
- Platform: Linux-4.4.0-19041-Microsoft-x86_64-with-glibc2.29
- Python version: 3.8.5
- PyTorch version (GPU?): 1.9.0+cpu (False)
- Tensorflow version (GPU?): 2.5.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Additional context
In case you get the error
AttributeError: module transformers has no attribute TFGPTNeoForCausalLM
Remove the "TF" prefix from the line and file that the traceback points to; the correct class name is GPTNeoForCausalLM.
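That AttributeError arises because transformers (as of 4.10) ships only a PyTorch implementation of GPT-Neo, so from_tf=True ends up resolving a nonexistent TF class. A quick check, assuming transformers is installed:

```python
import transformers

# GPT-Neo exists only as a PyTorch class in transformers 4.x;
# there is no TF port, hence the missing TFGPTNeoForCausalLM attribute.
print(hasattr(transformers, "GPTNeoForCausalLM"))
print(hasattr(transformers, "TFGPTNeoForCausalLM"))
```

Dropping from_tf=True and loading the PyTorch checkpoint directly should therefore avoid the error entirely.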