Error downloading pretrained pipeline
csyhuang opened this issue · 6 comments
I'm running the jupyter notebook with the Docker, but when executing
pipeline = PretrainedPipeline('explain_document_dl')
there is a No such file or directory
error. The error messages are included below:
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<timed exec> in <module>
/usr/lib/python3.6/site-packages/sparknlp/pretrained.py in __init__(self, name, lang, remote_loc)
28
29 def __init__(self, name, lang='en', remote_loc=None):
---> 30 self.model = ResourceDownloader().downloadPipeline(name, lang, remote_loc)
31 self.light_model = LightPipeline(self.model)
32
/usr/lib/python3.6/site-packages/sparknlp/pretrained.py in downloadPipeline(name, language, remote_loc)
16 @staticmethod
17 def downloadPipeline(name, language, remote_loc=None):
---> 18 j_obj = _internal._DownloadPipeline(name, language, remote_loc).apply()
19 jmodel = JavaModel(j_obj)
20 return jmodel
/usr/lib/python3.6/site-packages/sparknlp/internal.py in __init__(self, name, language, remote_loc)
63 def __init__(self, name, language, remote_loc):
64 super(_DownloadPipeline, self).__init__("com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline")
---> 65 self._java_obj = self._new_java_obj(self._java_obj, name, language, remote_loc)
66
67
/usr/lib/python3.6/site-packages/pyspark/ml/wrapper.py in _new_java_obj(java_class, *args)
65 java_obj = getattr(java_obj, name)
66 java_args = [_py2java(sc, arg) for arg in args]
---> 67 return java_obj(*java_args)
68
69 @staticmethod
/usr/lib/python3.6/site-packages/py4j/java_gateway.py in __call__(self, *args)
1255 answer = self.gateway_client.send_command(command)
1256 return_value = get_return_value(
-> 1257 answer, self.gateway_client, self.target_id, self.name)
1258
1259 for temp_arg in temp_args:
/usr/lib/python3.6/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
61 def deco(*a, **kw):
62 try:
---> 63 return f(*a, **kw)
64 except py4j.protocol.Py4JJavaError as e:
65 s = e.java_exception.toString()
/usr/lib/python3.6/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
--> 328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(
Py4JJavaError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline.
: java.lang.UnsatisfiedLinkError: /tmp/tensorflow_native_libraries-1553289340774-0/libtensorflow_jni.so: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /tmp/tensorflow_native_libraries-1553289340774-0/libtensorflow_jni.so)
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
at org.tensorflow.NativeLibrary.load(NativeLibrary.java:101)
at org.tensorflow.TensorFlow.init(TensorFlow.java:66)
at org.tensorflow.TensorFlow.<clinit>(TensorFlow.java:70)
at org.tensorflow.Graph.<clinit>(Graph.java:361)
at com.johnsnowlabs.ml.tensorflow.TensorflowWrapper$.readGraph(TensorflowWrapper.scala:98)
at com.johnsnowlabs.ml.tensorflow.TensorflowWrapper$.read(TensorflowWrapper.scala:172)
at com.johnsnowlabs.ml.tensorflow.ReadTensorflowModel$class.readTensorflowModel(TensorflowSerializeModel.scala:57)
at com.johnsnowlabs.nlp.annotators.ner.dl.NerDLModel$.readTensorflowModel(NerDLModel.scala:97)
at com.johnsnowlabs.nlp.annotators.ner.dl.ReadsNERGraph$class.readNerGraph(NerDLModel.scala:84)
at com.johnsnowlabs.nlp.annotators.ner.dl.NerDLModel$.readNerGraph(NerDLModel.scala:97)
at com.johnsnowlabs.nlp.annotators.ner.dl.ReadsNERGraph$$anonfun$2.apply(NerDLModel.scala:88)
at com.johnsnowlabs.nlp.annotators.ner.dl.ReadsNERGraph$$anonfun$2.apply(NerDLModel.scala:88)
at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$$anonfun$com$johnsnowlabs$nlp$ParamsAndFeaturesReadable$$onRead$1.apply(ParamsAndFeaturesReadable.scala:31)
at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$$anonfun$com$johnsnowlabs$nlp$ParamsAndFeaturesReadable$$onRead$1.apply(ParamsAndFeaturesReadable.scala:30)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$class.com$johnsnowlabs$nlp$ParamsAndFeaturesReadable$$onRead(ParamsAndFeaturesReadable.scala:30)
at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$$anonfun$read$1.apply(ParamsAndFeaturesReadable.scala:41)
at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$$anonfun$read$1.apply(ParamsAndFeaturesReadable.scala:41)
at com.johnsnowlabs.nlp.FeaturesReader.load(ParamsAndFeaturesReadable.scala:19)
at com.johnsnowlabs.nlp.FeaturesReader.load(ParamsAndFeaturesReadable.scala:8)
at org.apache.spark.ml.util.DefaultParamsReader$.loadParamsInstance(ReadWrite.scala:652)
at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$4.apply(Pipeline.scala:274)
at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$4.apply(Pipeline.scala:272)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at org.apache.spark.ml.Pipeline$SharedReadWrite$.load(Pipeline.scala:272)
at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:348)
at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:342)
at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.downloadPipeline(ResourceDownloader.scala:134)
at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.downloadPipeline(ResourceDownloader.scala:128)
at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader$.downloadPipeline(ResourceDownloader.scala:197)
at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline(ResourceDownloader.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Are the notebooks supposed to run without error in the docker? Thanks!
Hi @csyhuang and thanks for reporting this.
Just to have more details, could you please tell me about your Operating System and the process you took that lead to this error? (pull the image, run the image, and then you opened the Jupyter notebook and which example failed?)
Many thanks
Hi again @csyhuang, I apologize because the issue was our Docker image.
Now everything should work as expected if you can run the docker pull again to download the latest changes, please.
PS: Keep in mind some of the examples with POS()
and OCR
may not work until late Sunday when we release the hotfix.
Thanks for your report and please let us know if you have any other issue with the examples.
docker pull johnsnowlabs/spark-nlp-workshop:latest
docker run -it --rm -p 8888:8888 -p 4040:4040 johnsnowlabs/spark-nlp-workshop
Hi @maziyarpanahi , thanks for fixing the docker image. I tried running the notebook again but another error emerges:
Py4JError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline
Since this ticket is close, I'll open a new one and link it here.
Thanks for getting back to me. Can we try to see if these two commands are the ones you tried?
docker pull johnsnowlabs/spark-nlp-workshop:latest
docker run -it --rm -p 8888:8888 -p 4040:4040 johnsnowlabs/spark-nlp-workshop
PS: Could you paste paste the entire trace of errors?
@maziyarpanahi Yes. I used the two commands you mentioned to run the docker.
I opened the new ticket before seeing your reply... Shall I continue replying here or we get to #31 ?
No worries, we can continue in your new issue.