Full dependencies list for spacy? (a problem might be caused by pydantic)
Opened this issue · 2 comments
I wanted to train a Chinese model and ran the following command from EXPERIMENTS.md:
python src/main.py train \
--train-path "data/ctb_5.1/ctb.train" \
--dev-path "data/ctb_5.1/ctb.dev" \
--text-processing "chinese" \
--use-pretrained --pretrained-model "bert-base-chinese" \
--predict-tags \
--model-path-base models/Chinese_bert_base_chinese
However, I encountered this error:
Traceback (most recent call last):
File "D:\postgraduate\research\parsing\self-attentive-parser\src\main.py", line 11, in <module>
from benepar import char_lstm
File "D:\postgraduate\research\parsing\self-attentive-parser\src\benepar\__init__.py", line 20, in <module>
from .integrations.spacy_plugin import BeneparComponent, NonConstituentException
File "D:\postgraduate\research\parsing\self-attentive-parser\src\benepar\integrations\spacy_plugin.py", line 5, in <module>
from .spacy_extensions import ConstituentData, NonConstituentException
File "D:\postgraduate\research\parsing\self-attentive-parser\src\benepar\integrations\spacy_extensions.py", line 177, in <module>
install_spacy_extensions()
File "D:\postgraduate\research\parsing\self-attentive-parser\src\benepar\integrations\spacy_extensions.py", line 153, in install_spacy_extensions
from spacy.tokens import Doc, Span, Token
File "D:\anaconda\lib\site-packages\spacy\__init__.py", line 14, in <module>
from . import pipeline # noqa: F401
File "D:\anaconda\lib\site-packages\spacy\pipeline\__init__.py", line 1, in <module>
from .attributeruler import AttributeRuler
File "D:\anaconda\lib\site-packages\spacy\pipeline\attributeruler.py", line 6, in <module>
from .pipe import Pipe
File "spacy\pipeline\pipe.pyx", line 8, in init spacy.pipeline.pipe
File "D:\anaconda\lib\site-packages\spacy\training\__init__.py", line 11, in <module>
from .callbacks import create_copy_from_base_model # noqa: F401
File "D:\anaconda\lib\site-packages\spacy\training\callbacks.py", line 3, in <module>
from ..language import Language
File "D:\anaconda\lib\site-packages\spacy\language.py", line 25, in <module>
from .training.initialize import init_vocab, init_tok2vec
File "D:\anaconda\lib\site-packages\spacy\training\initialize.py", line 14, in <module>
from .pretrain import get_tok2vec_ref
File "D:\anaconda\lib\site-packages\spacy\training\pretrain.py", line 16, in <module>
from ..schemas import ConfigSchemaPretrain
File "D:\anaconda\lib\site-packages\spacy\schemas.py", line 216, in <module>
class TokenPattern(BaseModel):
File "pydantic\main.py", line 299, in pydantic.main.ModelMetaclass.__new__
File "pydantic\fields.py", line 411, in pydantic.fields.ModelField.infer
File "pydantic\fields.py", line 342, in pydantic.fields.ModelField.__init__
File "pydantic\fields.py", line 451, in pydantic.fields.ModelField.prepare
File "pydantic\fields.py", line 545, in pydantic.fields.ModelField._type_analysis
File "pydantic\fields.py", line 550, in pydantic.fields.ModelField._type_analysis
File "D:\anaconda\lib\typing.py", line 852, in __subclasscheck__
return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class
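For context, this TypeError is what Python raises when issubclass() receives a subscripted typing construct instead of a real class, which is the failure mode old pydantic hits with newer typing_extensions. A minimal, spacy-independent illustration (not the actual pydantic code path):

```python
# Minimal sketch of the failure mode: a subscripted typing construct is not
# a class, so passing it as the first argument to issubclass() raises the
# same TypeError seen in the traceback above.
from typing import Optional

try:
    issubclass(Optional[int], str)
except TypeError as exc:
    print(exc)  # e.g. "issubclass() arg 1 must be a class"
```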
This issue says installing two packages, chromadb and pydantic, will work, so I installed them. I ran:
python -m pip install -U pydantic spacy
python -m pip install -U chromadb spacy
Now, I have
pydantic == 2.9.2
pydantic-core == 2.23.4
spacy == 3.7.6
typing-extensions == 4.12.2
chromadb == 0.5.9
However, the problem still exists.
According to this issue, this problem should only exist for pydantic v1.10.7 and earlier, and is related to the release of typing_extensions v4.6.0. I installed newer versions, but that didn't solve the error.
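One thing worth ruling out is an environment mismatch (common with Anaconda on Windows, where pip and conda can target different interpreters). A quick check of what the failing interpreter actually sees, using only the standard library:

```python
# Report the installed versions of the relevant distributions as seen by
# the current interpreter (the names are the PyPI distribution names).
import importlib.metadata as metadata

for dist in ("pydantic", "spacy", "typing_extensions"):
    try:
        print(dist, metadata.version(dist))
    except metadata.PackageNotFoundError:
        print(dist, "not installed")
```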
If all you want to do is train a model, I recommend just disabling the spaCy integration by commenting out line 20 of src/benepar/__init__.py:

from .integrations.spacy_plugin import BeneparComponent, NonConstituentException

The spaCy integration is only used at inference time.
No idea what's going on with this error. It seems to originate inside the libraries themselves, and it didn't exist with the versions of those libraries I used back when benepar was released. It's disappointing if the underlying cause is that the libraries have introduced bugs or broken backwards compatibility. Here are the versions from one of my archived working setups:
pydantic 1.7.3
spacy 3.0.1
spacy-legacy 3.0.1
typing-extensions 3.7.4.3
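If you want to try reproducing that archived environment, a possible install command (untested here; the pins are simply the versions listed above) would be:

```shell
# Pin the four packages to the archived working versions (sketch, untested)
pip install "pydantic==1.7.3" "spacy==3.0.1" "spacy-legacy==3.0.1" "typing-extensions==3.7.4.3"
```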
Thank you for your response. I tried downgrading spacy to 3.5.0 and encountered the following error:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
en-core-web-md 3.7.1 requires spacy<3.8.0,>=3.7.2, but you have spacy 3.5.0 which is incompatible.
However, in your README, you say:
The recommended way of using benepar is through integration with spaCy. If using spaCy, you should install a spaCy model for your language. For English, the installation command is:
$ python -m spacy download en_core_web_md
It seems that en_core_web_md requires a higher version of spacy. Is that right?
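For reference, spaCy model packages are versioned in lockstep with the spaCy minor release, which is why en-core-web-md 3.7.1 requires spacy>=3.7.2,<3.8.0. If staying on spacy 3.5.0, one workaround (a sketch based on the usual spacy-models release URL pattern; confirm the asset exists before relying on it) is to install the matching model wheel directly:

```shell
# Install the en_core_web_md release that matches spaCy 3.5.x directly,
# bypassing the "spacy download" resolver (URL pattern is an assumption)
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_md-3.5.0/en_core_web_md-3.5.0-py3-none-any.whl
```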