allenai/unifiedqa

Unable to load models in huggingface, tf throws DataLossError

tshrjn opened this issue · 1 comment

Using the model-loading code from the README (shown below), I am unable to load the models: TensorFlow raises a DataLossError.

Code:

from transformers import T5Config, T5Tokenizer, T5ForConditionalGeneration
from transformers.modeling_t5 import load_tf_weights_in_t5

base_model = "t5-small"
tokenizer = T5Tokenizer.from_pretrained(base_model)
model = T5ForConditionalGeneration(T5Config.from_pretrained(base_model))

load_tf_weights_in_t5(model, None, "./models/unifiedqa-small/")

Error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/py_checkpoint_reader.py in NewCheckpointReader(filepattern)
     94   try:
---> 95     return CheckpointReader(compat.as_bytes(filepattern))
     96   # TODO(b/143319754): Remove the RuntimeError casting logic once we resolve the

RuntimeError: Unable to open table file /content/base: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

During handling of the above exception, another exception occurred:

DataLossError                             Traceback (most recent call last)
5 frames
<ipython-input-27-b28dfb350abf> in <module>()
      7 
      8 model_path = './base/' #@param ['./unifiedqa-base/', './base/']
----> 9 load_tf_weights_in_t5(model, None, model_path)
     10 
     11 # tokenizer = T5Tokenizer.from_pretrained('t5-base')

/usr/local/lib/python3.6/dist-packages/transformers/modeling_t5.py in load_tf_weights_in_t5(model, config, tf_checkpoint_path)
     78     logger.info("Converting TensorFlow checkpoint from {}".format(tf_path))
     79     # Load weights from TF model
---> 80     init_vars = tf.train.list_variables(tf_path)
     81     names = []
     82     tf_weights = {}

/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/checkpoint_utils.py in list_variables(ckpt_dir_or_file)
     96     List of tuples `(name, shape)`.
     97   """
---> 98   reader = load_checkpoint(ckpt_dir_or_file)
     99   variable_map = reader.get_variable_to_shape_map()
    100   names = sorted(variable_map.keys())

/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/checkpoint_utils.py in load_checkpoint(ckpt_dir_or_file)
     65     raise ValueError("Couldn't find 'checkpoint' file or checkpoints in "
     66                      "given directory %s" % ckpt_dir_or_file)
---> 67   return py_checkpoint_reader.NewCheckpointReader(filename)
     68 
     69 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/py_checkpoint_reader.py in NewCheckpointReader(filepattern)
     97   # issue with throwing python exceptions from C++.
     98   except RuntimeError as e:
---> 99     error_translator(e)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/py_checkpoint_reader.py in error_translator(e)
     42     raise errors_impl.InvalidArgumentError(None, None, error_message)
     43   elif 'Unable to open table file' in error_message:
---> 44     raise errors_impl.DataLossError(None, None, error_message)
     45   elif 'Failed to find the saved tensor slices' in error_message:
     46     raise errors_impl.InternalError(None, None, error_message)

DataLossError: Unable to open table file /path/to/models/unifiedqa-small: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

This might be an issue in Hugging Face's code for loading TensorFlow checkpoints (see the related issue).

It looks like your path was incorrect. I hope the issue is resolved now.
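
For anyone hitting the same error, one way to sanity-check the path before calling load_tf_weights_in_t5 is to list which checkpoint prefixes actually exist on disk. The sketch below is not part of the README or of transformers. The helper name find_checkpoint_prefixes is hypothetical, and it assumes the standard TensorFlow V2 checkpoint layout, where each checkpoint is stored as a `<prefix>.index` file plus one or more `<prefix>.data-*` files.

```python
import glob
import os


def find_checkpoint_prefixes(model_dir):
    """Return the checkpoint prefixes found in model_dir (hypothetical helper).

    Assumes the TF V2 checkpoint layout: <prefix>.index plus <prefix>.data-* shards.
    """
    model_dir = os.path.abspath(model_dir)
    if not os.path.isdir(model_dir):
        raise FileNotFoundError("Not a directory: {}".format(model_dir))

    prefixes = []
    pattern = os.path.join(glob.escape(model_dir), "*.index")
    for index_file in sorted(glob.glob(pattern)):
        prefix = index_file[: -len(".index")]
        # Keep only prefixes that also have at least one data shard next to them.
        if glob.glob(glob.escape(prefix) + ".data-*"):
            prefixes.append(prefix)
    return prefixes


if __name__ == "__main__":
    # The directory name is taken from the snippet in this issue.
    try:
        print(find_checkpoint_prefixes("./models/unifiedqa-small/"))
    except FileNotFoundError as err:
        print(err)
```

If this raises or returns an empty list, the directory probably does not contain the downloaded checkpoint files, which would be consistent with an incorrect path. If it returns a prefix, passing that prefix (rather than the directory) as the third argument is one thing to try, since tf.train.list_variables accepts either a checkpoint directory or a checkpoint path.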