OSError: Can't load config for 'xlm-roberta-base'.
Hello everyone,
For the past few days I have been getting an error when creating a Pipeline.
I am using a fresh install of Python 3.8 with trankit 1.1.1.
Here is the code to reproduce it:
# test_trankit.py
from trankit import Pipeline
p = Pipeline(lang='english')
And here is the error I get:
Downloading: 100%|████████████████████████████████████████████████████████████████| 5.07M/5.07M [00:06<00:00, 733kB/s]
http://nlp.uoregon.edu/download/trankit/v1.0.0/xlm-roberta-base/english.zip
Downloading: 100%|██████████████████████████████████████████████████████████████| 47.9M/47.9M [00:03<00:00, 12.2MiB/s]
Loading pretrained XLM-Roberta, this may take a while...
Traceback (most recent call last):
File "/home/kirian/miniconda3/envs/venv38/lib/python3.8/site-packages/trankit/adapter_transformers/configuration_utils.py", line 234, in get_config_dict
resolved_config_file = cached_path(
File "/home/kirian/miniconda3/envs/venv38/lib/python3.8/site-packages/trankit/adapter_transformers/file_utils.py", line 267, in cached_path
raise EnvironmentError("file {} not found".format(url_or_filename))
OSError: file xlm-roberta-base/config.json not found
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test_trankit.py", line 23, in <module>
p = Pipeline(lang='english')
File "/home/kirian/miniconda3/envs/venv38/lib/python3.8/site-packages/trankit/pipeline.py", line 82, in __init__
self._embedding_layers = Multilingual_Embedding(self._config)
File "/home/kirian/miniconda3/envs/venv38/lib/python3.8/site-packages/trankit/models/base_models.py", line 55, in __init__
super(Multilingual_Embedding, self).__init__(config, task_name=model_name)
File "/home/kirian/miniconda3/envs/venv38/lib/python3.8/site-packages/trankit/models/base_models.py", line 13, in __init__
self.xlmr = XLMRobertaModel.from_pretrained(config.embedding_name,
File "/home/kirian/miniconda3/envs/venv38/lib/python3.8/site-packages/trankit/adapter_transformers/modeling_utils.py", line 578, in from_pretrained
config, model_kwargs = cls.config_class.from_pretrained(
File "/home/kirian/miniconda3/envs/venv38/lib/python3.8/site-packages/trankit/adapter_transformers/configuration_utils.py", line 202, in from_pretrained
config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/home/kirian/miniconda3/envs/venv38/lib/python3.8/site-packages/trankit/adapter_transformers/configuration_utils.py", line 253, in get_config_dict
raise EnvironmentError(msg)
OSError: Can't load config for 'xlm-roberta-base'. Make sure that:
- 'xlm-roberta-base' is a correct model identifier listed on 'https://huggingface.co/models'
- or 'xlm-roberta-base' is the correct path to a directory containing a config.json file
I tried adding logging to the trankit code (in the cached_path function), but I did not manage to debug it.
I suspect a change in the Hugging Face pretrained model config (for example, the config.json file being named differently), but I don't know enough of the context or history to debug this further.
Thanks for your help!
Have you found a solution to this problem? I'm facing the same one!
Hi @kirianguiller @peshmerge,
Thanks for letting us know.
This issue might be caused by Trankit getting confused about which folder contains the cached models.
It can usually be solved by deleting all cached model files and downloading the Trankit models again.
Please reopen this issue if you're still facing it.
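The clean-up suggested above can be sketched roughly as follows. This is only an illustration, not code from Trankit itself. The cache path `cache/trankit` is an assumption about the default location, so substitute whatever `cache_dir` you pass to `Pipeline`. The helper names (`clear_model_cache`, `has_incomplete_local_dir`, `rebuild_pipeline`) are hypothetical. The second helper reflects one possible reading of the traceback: the message "file xlm-roberta-base/config.json not found" looks as if the name was resolved as a local directory without a config.json, though that is a guess.

```python
import shutil
from pathlib import Path

# Assumption: Trankit's default cache location, relative to the working
# directory. Use your own path if you pass cache_dir to the Pipeline.
ASSUMED_CACHE_DIR = Path("cache/trankit")


def clear_model_cache(cache_dir=ASSUMED_CACHE_DIR):
    """Delete the cache directory and its contents; return True if it existed."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True


def has_incomplete_local_dir(name="xlm-roberta-base", base="."):
    """Return True if base/name is a directory that lacks a config.json.

    Hypothetical diagnostic: such a directory could be mistaken for a
    local model path instead of a Hugging Face model identifier.
    """
    candidate = Path(base) / name
    return candidate.is_dir() and not (candidate / "config.json").is_file()


def rebuild_pipeline(cache_dir=ASSUMED_CACHE_DIR):
    """Clear the cache, then let Trankit download the models again."""
    # Imported lazily so the helpers above work without trankit installed.
    from trankit import Pipeline

    clear_model_cache(cache_dir)
    return Pipeline(lang="english", cache_dir=str(cache_dir))
```

Calling `rebuild_pipeline()` from the directory where the script normally runs should trigger a fresh download. Running `has_incomplete_local_dir()` first may help confirm whether a stray `xlm-roberta-base` folder is involved.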