dwadden/dygiepp

Installing Spacy model manually

jmicher opened this issue · 2 comments

Is there a way to install the 'en' spacy model manually? My organization is blocking access to raw.githubusercontent.com. Following instructions here: https://github.com/dwadden/dygiepp#creating-the-dataset-1, the command "python -m spacy download en" is throwing an error for me.

So far, from https://github.com/explosion/spacy-models/blob/master/compatibility.json, I figured out that I need model version 2.0.0 to be compatible with spacy==2.0.18.
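For anyone doing the same lookup by hand, here is a rough sketch of reading a locally saved copy of compatibility.json. It assumes the file is laid out as {"spacy": {spaCy version: {model name: [model versions]}}}. The helper name compatible_model_versions is made up for illustration, and the inline sample only mirrors the one pairing mentioned above rather than the real file.

```python
import json


def compatible_model_versions(compat, spacy_version, model_name):
    """Return the model versions listed for a given spaCy version.

    Assumes the layout {"spacy": {spacy_version: {model_name: [versions]}}};
    returns an empty list when any level is missing.
    """
    return compat.get("spacy", {}).get(spacy_version, {}).get(model_name, [])


# Illustrative sample only; in practice load your downloaded copy with
# json.load(open("compatibility.json")).
sample = json.loads('{"spacy": {"2.0.18": {"en_core_web_sm": ["2.0.0"]}}}')
print(compatible_model_versions(sample, "2.0.18", "en_core_web_sm"))
```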

I found it here: https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz, and then installed it via:

python3 -m pip install /path/to/en_core_web_sm-2.0.0.tar.gz
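A quick way to confirm that pip put the model somewhere Python can import it, before involving spaCy at all, might look like the sketch below. It only uses the standard library, and the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package with this name can be found."""
    return importlib.util.find_spec(package_name) is not None


# The model is installed as a regular Python package named en_core_web_sm.
print("en_core_web_sm importable:", is_installed("en_core_web_sm"))
```

If this prints False, the install went into a different interpreter or environment than the one running the script (note the python3 vs. python difference in the commands here).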

But now when I run:

python ./scripts/data/ace-event/parse_ace_event.py default-settings

I'm getting this error:

OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

I think this is because it's not finding shortcuts-v2.json. I'm not sure where to put that file, or whether that's even a good solution.

Any help would be greatly appreciated.

OK, I think I figured it out. I changed line 345 in parse_ace_event.py

from: nlp = spacy.load('en')
to: nlp = spacy.load('en_core_web_sm')

and it seems to be working now.
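If editing the script for one machine is undesirable, a fallback along these lines might work instead. This is only a sketch, not what parse_ace_event.py does: load_with_fallback is a hypothetical helper, and the loader is passed in as an argument so the snippet itself does not need spaCy installed.

```python
def load_with_fallback(load, names=("en", "en_core_web_sm")):
    """Try each model name in order and return the first one that loads.

    `load` is any callable that takes a model name (for example spacy.load).
    """
    last_error = None
    for name in names:
        try:
            return load(name)
        except OSError as err:  # the E050 error above is raised as OSError
            last_error = err
    if last_error is None:
        raise ValueError("no model names given")
    raise last_error
```

In the script this would be used as nlp = load_with_fallback(spacy.load), so the 'en' shortcut is still used where it exists and the full package name is used where it does not.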

Ah I see. From what I can tell, en is just an alias for en_core_web_sm anyhow. So, what you're doing should be fine.
Sorry for prematurely closing the previous issues. If you're all set, go ahead and close this one.