EleutherAI/gpt-neo

Issue loading gpt-neo-125M from checkpoint

ixn872 opened this issue · 0 comments

Describe the bug
This error is the same on both Google Colab with GPU and locally on CPU. The pre-trained model from the repo cannot be loaded:
https://huggingface.co/EleutherAI/gpt-neo-125M

To Reproduce
Steps to reproduce the behavior:

Clone the model locally: git clone https://huggingface.co/EleutherAI/gpt-neo-125M
Run the Python script:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Passing from_tf=True is what triggers the failure below
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M", from_tf=True)
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M", from_tf=True)

Expected behavior
The pretrained model cloned from https://huggingface.co/EleutherAI/gpt-neo-125M loads successfully.

Proposed solution
The model weights are stored in PyTorch .bin format. Perhaps also publish a TensorFlow .h5 checkpoint?
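Since the published checkpoint is a PyTorch state dict (pytorch_model.bin), a simpler workaround is to drop from_tf=True and let transformers load the PyTorch weights directly. A minimal sketch, assuming a standard PyTorch install of transformers:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the PyTorch checkpoint natively; from_tf=True would look for
# TensorFlow weights (and a TFGPTNeoForCausalLM class) that are not
# available for this model.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")
```

This avoids the TF conversion path entirely, so no .h5 file is needed.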

Screenshots
(Two screenshots of the error output were attached to the issue.)

Environment:

  • Configs: Default from GitHub repo
  • transformers version: 4.10.0.dev0
  • Platform: Linux-4.4.0-19041-Microsoft-x86_64-with-glibc2.29
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.9.0+cpu (False)
  • Tensorflow version (GPU?): 2.5.0 (False)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Additional context

In case you get the error

AttributeError: module transformers has no attribute TFGPTNeoForCausalLM

remove the "TF" prefix in the line and file that the traceback points to, since the proper class name is GPTNeoForCausalLM.