jsksxs360/bin2ckpt

magic_number = pickle_module.load(f, **pickle_load_args) _pickle.UnpicklingError: invalid load key, 'v'.

kusumlata123 opened this issue · 5 comments

python convert.py
Traceback (most recent call last):
  File "/home/dr/Desktop/Hindi-coref/convert.py", line 91, in <module>
    convert(bin_path, bin_model, ckpt_path, ckpt_model)
  File "/home/dr/Desktop/Hindi-coref/convert.py", line 80, in convert
    state_dict=torch.load(os.path.join(pytorch_bin_path, pytorch_bin_model), map_location='cpu')
  File "/home/dr/anaconda3/envs/trial/lib/python3.9/site-packages/torch/serialization.py", line 608, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/dr/anaconda3/envs/trial/lib/python3.9/site-packages/torch/serialization.py", line 777, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

It seems that you are converting the MuRIL model. I suggest you download the model files directly from Hugging Face's Model Hub (google/muril-base-cased or google/muril-large-cased), including

  • config.json,
  • pytorch_model.bin,
  • special_tokens_map.json,
  • tokenizer_config.json and
  • vocab.txt.

Note that you should download the corresponding bin file pytorch_model.bin, not the h5 file.
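For example, something along these lines fetches exactly those files with the huggingface_hub client. This is only a sketch, not part of this repo; the repo id and target directory are placeholders that you should adjust to match the bin_path your convert.py uses:

import os
import shutil
from huggingface_hub import hf_hub_download

repo_id = 'google/muril-base-cased'    # or 'google/muril-large-cased'
target_dir = './muril/pytorch_model/'  # the directory convert.py reads from (bin_path)
files = [
    'config.json',
    'pytorch_model.bin',               # the PyTorch weights, not the .h5 file
    'special_tokens_map.json',
    'tokenizer_config.json',
    'vocab.txt',
]

os.makedirs(target_dir, exist_ok=True)
for name in files:
    # hf_hub_download returns the path of the cached copy; copy it next to the script
    cached_path = hf_hub_download(repo_id=repo_id, filename=name)
    shutil.copy(cached_path, os.path.join(target_dir, name))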

But I have downloaded exactly that; the folder contains
config.json,
pytorch_model.bin,
special_tokens_map.json,
tokenizer_config.json and
vocab.txt,
and I am still getting this error.
Can you help me with this? Why is it happening?

I tried converting the pytorch_model.bin of google/muril-base-cased and the script worked fine:


if __name__ == '__main__':
    bin_path = './muril/pytorch_model/'      # directory with the downloaded Hugging Face files
    bin_model = 'pytorch_model.bin'          # PyTorch weights to convert
    ckpt_path = './muril/tensorflow_model/'  # output directory for the TensorFlow checkpoint
    ckpt_model = 'bert_model.ckpt'           # name of the checkpoint to write

    convert(bin_path, bin_model, ckpt_path, ckpt_model)
Successfully created bert/embeddings/position_ids: True
Successfully created bert/embeddings/word_embeddings: True
Successfully created bert/embeddings/position_embeddings: True
Successfully created bert/embeddings/token_type_embeddings: True
Successfully created bert/embeddings/LayerNorm/gamma: True
Successfully created bert/embeddings/LayerNorm/beta: True
...
Successfully created bert/encoder/layer_11/output/LayerNorm/gamma: True
Successfully created bert/encoder/layer_11/output/LayerNorm/beta: True
Successfully created bert/pooler/dense/kernel: True
Successfully created bert/pooler/dense/bias: True
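Before running the script, you could also sanity-check that the downloaded pytorch_model.bin is really a loadable PyTorch file, since torch.load is exactly the call that fails in your traceback. Here is a minimal check along those lines (my own suggestion, assuming the same paths as in the snippet above). One common cause of "invalid load key, 'v'" is that the downloaded file is actually a plain-text Git LFS pointer rather than the weights, and a pointer file starts with the word "version":

import os
import torch

bin_file = os.path.join('./muril/pytorch_model/', 'pytorch_model.bin')

# Peek at the first bytes: real PyTorch weights are binary, while a Git LFS
# pointer is a short text file beginning with b'version https://git-lfs...'.
with open(bin_file, 'rb') as f:
    print(f.read(32))

# Same call convert.py makes; if it fails here, the problem is the downloaded
# file itself, not the conversion script.
state_dict = torch.load(bin_file, map_location='cpu')
print(len(state_dict), 'tensors loaded')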


Could you describe the problem you ran into in more detail?

Could you tell me which Python version you are using (Python 2 or Python 3)? These are my settings:
bin_path = '/home/dr/Desktop/Hindi-coref/MuRLI/muril-large-cased'
bin_path='/home/dr/muril-large-cased'
bin_model = 'pytorch_model.bin'
ckpt_path = '/home/dr/Desktop/Hindi-coref/MuRLI/muril-base-casedtensorflow_model/'
ckpt_model = 'bert_model.ckpt'

convert(bin_path, bin_model, ckpt_path, ckpt_model)

When I run it, I get:
Traceback (most recent call last):
  File "/home/dr/Desktop/Hindi-coref/convert.py", line 93, in <module>
    convert(bin_path, bin_model, ckpt_path, ckpt_model)
  File "/home/dr/Desktop/Hindi-coref/convert.py", line 81, in convert
    state_dict=torch.load(os.path.join(pytorch_bin_path, pytorch_bin_model), map_location='cpu')
  File "/home/dr/anaconda3/envs/trial/lib/python3.9/site-packages/torch/serialization.py", line 608, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/dr/anaconda3/envs/trial/lib/python3.9/site-packages/torch/serialization.py", line 777, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

Python 3.8.2
tensorflow==2.7.0
torch==1.10.0
transformers==4.12.5