How to use spaCy's NER instead of EntityRuler for Extraction Model? [QUESTION]
Closed · 1 comment
First check
- I used the GitHub search to find a similar issue and didn't find it.
- I searched the Typer documentation, with the integrated search.
- I already searched in Google "How to X in Typer" and didn't find any information.
- I already searched in Google "How to X in Click" and didn't find any information.
Description
How can I replace the EntityRuler component currently used for entity extraction with spaCy's NER? The Local Entity Linker section of the docs mentions that a trained NER model can be used before the ann_linker pipeline component.
Additional context
I am currently using the ANN Linker to link aliases found in news articles to their corresponding Wikipedia pages. My use case requires linking every updated Wikipedia entity to its main wiki page, so the linker has to process the complete wiki dump to stay current and cover all entities. For this reason, training spaCy's entity linker model was infeasible: it took too long to train. spaCy's entity linker examples also use EntityRuler, so I couldn't figure out how to substitute NER for it. The EntityRuler is not a good choice for extracting entities in my case, so how can I use spaCy's NER instead? Any help would be appreciated.
This project only requires that the doc.ents property is set on the spaCy Doc. You can populate it with either an EntityRuler or the standard ner pipeline component that spaCy provides.