Error loading 1B model from Hugging Face
When I try to load `BBT-1-1B`, the tokenizer fails with `TypeError: not a string`.
After debugging, I found that the `vocab.txt` shipped with the model is not valid for loading.
If I do not pass the `vocab.txt` in the model directory to `T5Tokenizer.from_pretrained`, the error is `TypeError: not a string`.
If I do pass the `vocab.txt` in the model directory, the error changes to `RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]`.
However, the `BBT-2-12B-Text` model ships `spiece.model` and `spiece.vocab` for its tokenizer.
Providing `spiece.model` and `spiece.vocab` for `BBT-1-1B`, or showing an example of how to use `vocab.txt`, would be very helpful!
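For anyone reproducing this, here is a minimal sketch of how the two errors can be narrowed down before calling the tokenizer at all. As far as I understand it, `T5Tokenizer` looks for a SentencePiece model named `spiece.model`. When that file is missing, the path it hands to SentencePiece is empty, which would explain `TypeError: not a string`. A plain-text `vocab.txt` is not a serialized SentencePiece model, which would explain the `ParseFromArray` failure. The helper name `diagnose_tokenizer_files` and the dummy `vocab.txt` contents below are my own assumptions for illustration, not anything shipped with the model.

```python
import tempfile
from pathlib import Path


def diagnose_tokenizer_files(model_dir):
    """Hypothetical helper: report which tokenizer files a local model directory contains."""
    model_dir = Path(model_dir)
    # File names taken from this issue: BBT-2-12B-Text ships the spiece.* pair,
    # BBT-1-1B ships only vocab.txt.
    found = {
        name: (model_dir / name).is_file()
        for name in ("spiece.model", "spiece.vocab", "vocab.txt")
    }
    if found["spiece.model"]:
        hint = "spiece.model is present, so T5Tokenizer has a SentencePiece model to load."
    elif found["vocab.txt"]:
        hint = (
            "Only vocab.txt is present. It is not a serialized SentencePiece model, "
            "so T5Tokenizer is unlikely to load it."
        )
    else:
        hint = "No known tokenizer file was found in this directory."
    return found, hint


# Demo on a throwaway directory that mimics the BBT-1-1B layout described above.
# The vocab.txt contents here are made up; the real file's format is unknown to me.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "vocab.txt").write_text("[PAD]\n[UNK]\n", encoding="utf-8")
    found, hint = diagnose_tokenizer_files(tmp)
    print(found)
    print(hint)
```

Running this against the real model directory only confirms which files are there. It does not tell us which tokenizer class `vocab.txt` was built for, so that part still needs an answer from the maintainers.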
I encountered the same problem. How did you solve it? Thanks.