JohnSnowLabs/spark-nlp-workshop

Sentiment Analysis, misclassification

Chertushkin opened this issue · 1 comment

Pipeline from Sentiment_rb.ipynb misclassifies obvious sentences

Steps to Reproduce

  1. Open https://github.com/JohnSnowLabs/spark-nlp-workshop/tree/master/jupyter/annotation/english/dictionary-sentiment
  2. Load the pipeline analyze_sentiment_ml
  3. Try to annotate "Harry Potter is a good movie". You will see that the sentiment is positive. That's correct.
  4. Try to annotate "Harry Potter is a bad movie". You will see that the sentiment is still positive. That's a mistake.
  5. Also, try to annotate just "Harry Potter". The model will classify it as negative :)
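
The steps above can be sketched as a small script. This is only a rough reproduction under some assumptions: the helper names (`load_annotator`, `find_mismatches`, `reproduce`) are made up for illustration, the pipeline is assumed to expose its labels under the `sentiment` key of the dictionary returned by `annotate`, and the expected labels in `CASES` are my reading of what the result should be.

```python
from typing import Callable, List, Tuple

# Sentences from steps 3 and 4, paired with the label I would expect.
CASES: List[Tuple[str, str]] = [
    ("Harry Potter is a good movie", "positive"),
    ("Harry Potter is a bad movie", "negative"),
]


def load_annotator() -> Callable[[str], List[str]]:
    """Load analyze_sentiment_ml and return a text -> labels function.

    Imports are local so the rest of the script works without Spark NLP.
    Assumption: the pipeline's output dictionary has a "sentiment" key.
    """
    import sparknlp
    from sparknlp.pretrained import PretrainedPipeline

    sparknlp.start()
    pipeline = PretrainedPipeline("analyze_sentiment_ml", lang="en")
    return lambda text: pipeline.annotate(text)["sentiment"]


def find_mismatches(
    annotate: Callable[[str], List[str]],
    cases: List[Tuple[str, str]],
) -> List[Tuple[str, str, List[str]]]:
    """Return (text, expected, predicted) for every case predicted wrongly."""
    mismatches = []
    for text, expected in cases:
        predicted = annotate(text)
        # A case passes only if every returned label equals the expected one.
        if any(label != expected for label in predicted) or not predicted:
            mismatches.append((text, expected, predicted))
    return mismatches


def reproduce() -> None:
    """Run the reproduction; needs Spark NLP and access to the pretrained model."""
    annotate = load_annotator()
    for text, expected, predicted in find_mismatches(annotate, CASES):
        print(f"{text!r}: expected {expected}, got {predicted}")
    # Step 5 has no obviously correct label, so just print what comes back.
    print("'Harry Potter':", annotate("Harry Potter"))
```

Calling `reproduce()` inside the Docker container should print one line per misclassified sentence, followed by the labels returned for "Harry Potter".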

Your Environment

  • Spark-NLP version: 2.0.3
  • Apache Spark version: 2.4.1
  • Operating System and version: Docker
  • Deployment (Docker, Jupyter, Scala, pip, conda, etc.): I have tried your actual Docker container.

We have tried re-training the sentiment models in 2.1.0. If they still don't perform well on common sentences, please re-open this issue.