Jcharis/Natural-Language-Processing-Tutorials

Entities are not being recognised

oldmonkABA opened this issue · 4 comments

When I run the code, I get the following output:

docx = nlp(u"I am looking for an Italian Restaurant where I can eat")
for word in docx.ents:
    print("value",word.text,"entity",word.label_,"start",word.start_char,"end",word.end_char)
('value', u'Italian', 'entity', u'NORP', 'start', 20, 'end', 27)

print(interpreter.parse(u"I am looking for an Italian Restaurant where I can eat"))
{u'entities': [], u'intent': {u'confidence': '0.7245936400661538', u'name': u'restaurant_search'}, 'text': u'I am looking for an Italian Restaurant where I can eat', u'intent_ranking': [{u'confidence': '0.7245936400661538', u'name': u'restaurant_search'}, {u'confidence': '0.16613318075824324', u'name': u'affirm'}, {u'confidence': '0.061131622985489784', u'name': u'greet'}, {u'confidence': '0.04814155619011318', u'name': u'goodbye'}]}

print(interpreter.parse(u"I want an African Spot to eat"))
{u'entities': [], u'intent': {u'confidence': '0.6742354477482855', u'name': u'restaurant_search'}, 'text': u'I want an African Spot to eat', u'intent_ranking': [{u'confidence': '0.6742354477482855', u'name': u'restaurant_search'}, {u'confidence': '0.12795773626363155', u'name': u'affirm'}, {u'confidence': '0.1248807660919913', u'name': u'goodbye'}, {u'confidence': '0.07292604989609185', u'name': u'greet'}]}

print(interpreter.parse(u"Good morning World"))
{u'entities': [], u'intent': {u'confidence': '0.3928691488396195', u'name': u'greet'}, 'text': u'Good morning World', u'intent_ranking': [{u'confidence': '0.3928691488396195', u'name': u'greet'}, {u'confidence': '0.2737002194915276', u'name': u'goodbye'}, {u'confidence': '0.17752522806694152', u'name': u'affirm'}, {u'confidence': '0.15590540360191174', u'name': u'restaurant_search'}]}

Below is the full code:
from rasa_nlu.training_data import load_data
from rasa_nlu.config import RasaNLUModelConfig
from rasa_nlu.model import Trainer
from rasa_nlu import config

# Loading DataSet

train_data = load_data('./data/data.json')

# Config Backend using Sklearn and Spacy

trainer = Trainer(config.load("config.yaml"))

# Training Data

trainer.train(train_data)

# Returns the directory the model is stored in (create a folder to store the model in)

model_directory = trainer.persist('./projects/')

import spacy
nlp = spacy.load('en')

docx = nlp(u"I am looking for an Italian Restaurant where I can eat")
for word in docx.ents:
    print("value",word.text,"entity",word.label_,"start",word.start_char,"end",word.end_char)

from rasa_nlu.model import Metadata, Interpreter

# where `model_directory` points to the folder the model is persisted in

interpreter = Interpreter.load(model_directory)

# Prediction of Intent

print(interpreter.parse(u"I am looking for an Italian Restaurant where I can eat"))
print(interpreter.parse(u"I want an African Spot to eat"))
print(interpreter.parse(u"Good morning World"))
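
A parse result is a plain dict, so the empty `entities` list can be checked programmatically rather than by eye. This is only a minimal sketch: `summarize` is a hypothetical helper, not part of rasa_nlu, and the sample dict is abbreviated from the first parse output in this issue (where the confidence is reported as a string).

```python
def summarize(result):
    """Reduce a parse result to its intent name, confidence, and entity pairs."""
    intent = result["intent"]
    return {
        "intent": intent["name"],
        # The confidence may arrive as a string or a float; normalise it.
        "confidence": float(intent["confidence"]),
        "entities": [(e["entity"], e["value"]) for e in result.get("entities", [])],
    }


# Abbreviated copy of the first parse result reported in this issue.
result = {
    "entities": [],
    "intent": {"confidence": "0.7245936400661538", "name": "restaurant_search"},
    "text": "I am looking for an Italian Restaurant where I can eat",
}

summary = summarize(result)
print(summary)
if not summary["entities"]:
    print("no entities extracted for:", result["text"])
```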

From what I can see, intent classification is working well; the missing entities are probably due to the spaCy/Rasa config or the training data. Please try these options:
#1 Add more examples to your training data
#2 Use a more detailed pipeline in the config.yml file (spacy_sklearn)
#3 Try the MITIE backend
Hope it helps
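
On option #1: a trained extractor can only learn entities that are annotated in the training examples, so it may be worth checking that data.json carries entity spans and not just intents. Below is a minimal sketch, assuming the rasa_nlu JSON training format (`rasa_nlu_data` / `common_examples`). The helper `make_example` and the entity name `cuisine` are hypothetical and only for illustration.

```python
import json


def make_example(text, intent, entities):
    """Build one training example; `entities` maps a substring of `text` to an entity name."""
    annotated = []
    for value, entity in entities.items():
        start = text.find(value)
        if start == -1:
            raise ValueError("%r not found in %r" % (value, text))
        # Character offsets: `end` is exclusive, like Python slicing.
        annotated.append({
            "start": start,
            "end": start + len(value),
            "value": value,
            "entity": entity,
        })
    return {"text": text, "intent": intent, "entities": annotated}


example = make_example(
    "I am looking for an Italian Restaurant where I can eat",
    "restaurant_search",
    {"Italian": "cuisine"},  # hypothetical entity name
)
training_data = {"rasa_nlu_data": {"common_examples": [example]}}
print(json.dumps(training_data, indent=2))
```

The computed offsets (20 and 27) match the span spaCy reports for `Italian` in the original post.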

Hi Jcharis, I have the same problem. When I run your code, I get a result like this:

{
  'intent': {
    'name': 'restaurant_search',
    'confidence': 0.6966200345414107
  },
  'entities': [],
  'intent_ranking': [
    {
      'name': 'restaurant_search',
      'confidence': 0.6966200345414107
    },
    {
      'name': 'affirm',
      'confidence': 0.19163192173218538
    },
    {
      'name': 'goodbye',
      'confidence': 0.05679537002111616
    },
    {
      'name': 'greet',
      'confidence': 0.054952673705287634
    }
  ],
  'text': 'I am looking for an Italian Restaurant where I can eat'
}

Entities are not recognized. Could you help me?

Thanks for your answer.
I found a solution: the problem came from my pipeline file (as you said).

Here is the pipeline I use:

language: "en"

pipeline:
- name: "tokenizer_whitespace"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  "epochs": 500