MartinoMensio/spacy-dbpedia-spotlight

interesting features to add

angelosalatino opened this issue · 2 comments

Hi Martino,

It would be nice if we could set the confidence level as a parameter.

Hi Angelo,
Thank you for the suggestion! As of version 0.2.1 you can control the confidence level. You can set it when you create the pipeline stage, or change it afterwards.

import spacy
# load your spacy pipeline
nlp = spacy.blank('en')
# add the pipeline stage with the configuration options:
nlp.add_pipe('dbpedia_spotlight', config={'confidence': 0.4})
# use it
text = 'US bought a lot of vaccines'
doc = nlp(text)
# this will print: [(US, 'http://dbpedia.org/resource/United_States')]
print([(ent, ent.kb_id_) for ent in doc.ents])

# you can change the confidence if you have already instantiated the pipeline stage
nlp.get_pipe('dbpedia_spotlight').confidence = 0.5
# now recompute the document
doc = nlp(text)
# this now won't have any results
print([(ent, ent.kb_id_) for ent in doc.ents])

The other parameters of the REST API can be changed too: confidence, support, types, sparql and policy. As with the confidence, you can set them in the config dict or through the corresponding attribute of the pipe stage. I will expand the documentation a bit, as at the moment this is only detailed in #3.
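To make the role of these five parameters more concrete, here is a minimal sketch of how they could end up in the query string of a DBpedia Spotlight annotate request. This is not the package's own code: the helper `build_annotate_url` is hypothetical, and the endpoint URL is an assumption (the public Spotlight service), since neither is stated in this thread. Only the standard library is used and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumption: endpoint of the public DBpedia Spotlight service (not given in this thread).
SPOTLIGHT_ANNOTATE_URL = "https://api.dbpedia-spotlight.org/en/annotate"


def build_annotate_url(text, confidence=None, support=None, types=None,
                       sparql=None, policy=None,
                       base_url=SPOTLIGHT_ANNOTATE_URL):
    """Hypothetical helper: build an annotate URL, skipping parameters left unset."""
    params = {"text": text}
    optional = {
        "confidence": confidence,
        "support": support,
        "types": types,
        "sparql": sparql,
        "policy": policy,
    }
    for name, value in optional.items():
        if value is not None:
            params[name] = value
    # urlencode percent-encodes the values and joins them with '&'
    return f"{base_url}?{urlencode(params)}"


url = build_annotate_url("US bought a lot of vaccines", confidence=0.4, support=20)
print(url)
```

Parameters that are not set are simply left out of the request, so the service falls back to its own defaults for them.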

Currently, you cannot change the configuration (e.g. confidence) for single docs; you have to do it at the nlp level.
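One possible workaround, sketched here under the assumption that the settings are plain attributes on the pipe stage (as with `confidence` above), is to change the attribute only for the duration of a single call and restore it afterwards. The helper `temporary_setting` is hypothetical and not part of the package; a stand-in object is used below so the snippet runs without the package or network access.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_setting(pipe, name, value):
    """Hypothetical helper: set pipe.<name> to value inside the block, then restore it."""
    previous = getattr(pipe, name)
    setattr(pipe, name, value)
    try:
        yield pipe
    finally:
        # Restore the old value even if processing raised an exception
        setattr(pipe, name, previous)


# Stand-in for nlp.get_pipe('dbpedia_spotlight'), used only for this demonstration
demo_pipe = SimpleNamespace(confidence=0.4)

with temporary_setting(demo_pipe, "confidence", 0.5):
    inside = demo_pipe.confidence  # value seen while the block is active
after = demo_pipe.confidence       # original value is back

print(inside, after)

# Intended use with the real pipeline from this thread:
# with temporary_setting(nlp.get_pipe('dbpedia_spotlight'), 'confidence', 0.5):
#     doc = nlp(text)
```

The setting still lives at the nlp level, so this only emulates per-doc configuration for documents processed one at a time; it is not safe if the same pipeline is used concurrently.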

Let me know if this works for you.

Best,
Martino

Very nice,
Angelo