JohnSnowLabs/spark-nlp-workshop

Sentiment_rb.ipynb has issue with downloading pipeline

Chertushkin opened this issue · 7 comments

PretrainedPipeline("movies_sentiment_analysis")

Steps to Reproduce

  1. Pull and run the docker
  2. Run notebook https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/jupyter/annotation/english/dictionary-sentiment/sentiment_rb.ipynb
  3. Run cell #3. It raises an exception saying that the resource failed to download.

It looks like some dependency inside the Docker image is missing. Could you please check?

Your Environment

  • Spark-NLP version: 2.0.3
  • Apache Spark version: 2.4.1
  • Operating System and version: The latest Docker
  • Deployment (Docker, Jupyter, Scala, pip, conda, etc.): Docker. I pulled the latest image as described on the main page.

This has been fixed in the latest image.
Fixed the pre-trained pipeline's name.

Still not working in the latest docker image:

Py4JJavaError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline. : java.lang.IllegalArgumentException: requirement failed: Was not found appropriate resource to download for request: ResourceRequest(movies_sentiment_analysis,Some(en),public/models,2.0.4,2.4.0) with downloader: com.johnsnowlabs.nlp.pretrained.S3ResourceDownloader@3dbbf51b
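
For anyone reading the trace: the ResourceRequest(...) fragment shows what the downloader actually asked for. Below is a minimal Python sketch that pulls those fields out of the message. The helper name parse_resource_request and the field labels (name, language, folder, lib_version, spark_version) are my own assumptions based on how the message reads; they are not a Spark NLP API.

```python
import re

# Pattern for the ResourceRequest(...) fragment of the error text.
# The group names are assumed labels, chosen from how the message reads.
_REQUEST = re.compile(
    r"ResourceRequest\("
    r"(?P<name>[^,]+),"
    r"(?:Some\((?P<language>[^)]*)\)|None),"
    r"(?P<folder>[^,]+),"
    r"(?P<lib_version>[^,]+),"
    r"(?P<spark_version>[^)]+)\)"
)


def parse_resource_request(message):
    """Return the request fields as a dict, or None if no request is found."""
    match = _REQUEST.search(message)
    return match.groupdict() if match else None


error_text = (
    "requirement failed: Was not found appropriate resource to download for "
    "request: ResourceRequest(movies_sentiment_analysis,Some(en),public/models,"
    "2.0.4,2.4.0)"
)
request = parse_resource_request(error_text)
print(request["name"], request["lib_version"])
```

In this trace the requested name is still movies_sentiment_analysis, so the notebook being run is the one that asks for the old pipeline.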

@alinapetukhova
You are either not using the latest Docker image or you manually changed the name of the pipeline. The correct name is analyze_sentiment_ml, and I can see you are still trying to use movies_sentiment_analysis, which doesn't exist in the current Docker image.

@maziyarpanahi I'm downloading it with docker pull johnsnowlabs/spark-nlp-workshop:latest. Is there any other way to do it?

It's working correctly in /jupyter/annotation/english/dictionary-sentiment/sentiment_rb.ipynb, but there is still a wrong reference in /jupyter/training/english/dictionary-sentiment/sentiment_rb.ipynb
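
A quick way to check for other notebooks that still use the old name is to scan the .ipynb files for it. This is only a sketch: find_stale_references is a hypothetical helper, and it assumes the notebooks are standard Jupyter JSON with a top-level "cells" list.

```python
import json
from pathlib import Path

OLD_NAME = "movies_sentiment_analysis"


def find_stale_references(root, needle=OLD_NAME):
    """Return (notebook path, cell index) pairs whose cell source contains `needle`."""
    root = Path(root)
    if not root.is_dir():
        return []
    hits = []
    for path in sorted(root.rglob("*.ipynb")):
        try:
            notebook = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable file or invalid JSON: skip it
        if not isinstance(notebook, dict):
            continue
        for index, cell in enumerate(notebook.get("cells", [])):
            source = cell.get("source", "")
            if isinstance(source, list):  # the source is often stored as a list of lines
                source = "".join(source)
            if needle in source:
                hits.append((str(path), index))
    return hits


if __name__ == "__main__":
    # "jupyter" is the notebooks folder of a local checkout (assumed location).
    for notebook_path, cell_index in find_stale_references("jupyter"):
        print(f"{notebook_path}: cell {cell_index}")
```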

Either manually rename the pipeline to:

pipeline = PretrainedPipeline("analyze_sentiment_ml")

Or remove all the Docker containers/images and do the pull/run again. The image tagged latest comes from master, and as you can see, master has the right pipeline name:
https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/jupyter/annotation/english/dictionary-sentiment/sentiment_rb.ipynb

I'm going to take a look, but I did fix it:
a44f2fc
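
If a notebook has to run against both older and newer images, the manual rename can be made tolerant by trying the pipeline names in order. This is a rough sketch, not something from the workshop repo: load_first_available is a hypothetical helper, and the loader is passed in so the snippet itself does not need Spark NLP installed.

```python
def load_first_available(names, loader):
    """Return (name, pipeline) for the first name that `loader` manages to load."""
    failures = {}
    for name in names:
        try:
            return name, loader(name)
        except Exception as exc:  # a failed download surfaces as a Python exception
            failures[name] = exc
    raise LookupError(f"none of these pipelines could be loaded: {list(failures)}")


# Usage inside the notebook (assumes a Spark NLP session is already started):
# from sparknlp.pretrained import PretrainedPipeline
# name, pipeline = load_first_available(
#     ["analyze_sentiment_ml", "movies_sentiment_analysis"],
#     lambda n: PretrainedPipeline(n, lang="en"),
# )
```

The current name goes first, so the old one is only tried when the image does not know analyze_sentiment_ml.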