JohnSnowLabs/spark-nlp-workshop

Error when running notebooks: Answer from Java side is empty

Jaganatha opened this issue · 3 comments

Hello Team,

bert_model = BertEmbeddings.pretrained('bert_base_cased', 'en').setInputCols(["sentence",'token']).setOutputCol("bert").setCaseSensitive(False).setPoolingLayer(0)

df_bert_train = bert_model.transform(sparkNLP_transformed_full_train)

nerTagger = NerDLApproach().setInputCols(["sentence", "token", "bert"]).setLabelColumn("label").setOutputCol("ner") \
    .setMaxEpochs(1).setRandomSeed(0).setVerbose(1).setValidationSplit(0.2).setEvaluationLogExtended(True).setEnableOutputLogs(True).setIncludeConfidence(True)

# The code above ran successfully and I can see the BERT embeddings added to the training data. However, the line below raises an error:
ner_tag_model_final = nerTagger.fit(df_bert_train)

I am trying to train an NER DL model, and creating the pipeline succeeds.

However, when I pass the training data to fit(), I receive the following error:

Exception happened during processing of request from ('127.0.0.1', 47346)
ERROR:root:Exception while sending command.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/py4j/java_gateway.py", line 1159, in send_command
raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/py4j/java_gateway.py", line 985, in send_command
response = connection.send_command(command)
File "/usr/local/lib/python3.6/dist-packages/py4j/java_gateway.py", line 1164, in send_command
"Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
File "/usr/lib/python3.6/socketserver.py", line 320, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib/python3.6/socketserver.py", line 351, in process_request
self.finish_request(request, client_address)
File "/usr/lib/python3.6/socketserver.py", line 364, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python3.6/socketserver.py", line 724, in __init__
self.handle()
File "/usr/local/lib/python3.6/dist-packages/pyspark/accumulators.py", line 269, in handle
poll(accum_updates)
File "/usr/local/lib/python3.6/dist-packages/pyspark/accumulators.py", line 241, in poll
if func():
File "/usr/local/lib/python3.6/dist-packages/pyspark/accumulators.py", line 245, in accum_updates
num_updates = read_int(self.rfile)
File "/usr/local/lib/python3.6/dist-packages/pyspark/serializers.py", line 717, in read_int
raise EOFError
EOFError


Py4JError Traceback (most recent call last)
in ()
----> 1 ner_tag_model_final = nerTagger.fit(df_bert_train)

5 frames
/usr/local/lib/python3.6/dist-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
334 raise Py4JError(
335 "An error occurred while calling {0}{1}{2}".
--> 336 format(target_id, ".", name))
337 else:
338 type = answer[1]

Py4JError: An error occurred while calling o860.fit

Could you please complete this template (also, please be detailed about how and where you are running this code):

Description

Expected Behavior

Current Behavior

Possible Solution

Steps to Reproduce

Context

Your Environment

  • Spark NLP version:
  • Apache Spark version:
  • Java version (java -version):
  • Setup and installation (Pypi, Conda, Maven, etc.):
  • Operating System and version:
  • Link to your project (if any):
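
One way to gather most of these details from the same Python environment that runs the notebook is sketched below. It is only a sketch using the Python standard library (it needs Python 3.8+ for importlib.metadata). The helper names package_version, java_version and collect_environment are made up for illustration, and it assumes Spark NLP and Apache Spark were installed from PyPI under the distribution names spark-nlp and pyspark. Anything it cannot find is reported as "not found".

```python
import platform
import shutil
import subprocess
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def java_version():
    """Return the first line printed by `java -version`, or None if java is unavailable."""
    if shutil.which("java") is None:
        return None
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.SubprocessError):
        return None
    # Most JDKs write the version banner to stderr rather than stdout.
    output = (result.stderr or result.stdout).strip()
    return output.splitlines()[0] if output else None


def collect_environment():
    """Collect the environment details requested in the template (illustrative helper)."""
    return {
        # Assumption: both packages were installed from PyPI under these names.
        "Spark NLP version": package_version("spark-nlp"),
        "Apache Spark version": package_version("pyspark"),
        "Java version": java_version(),
        "Operating System and version": platform.platform(),
        "Python version": platform.python_version(),
    }


# Print the details in the same bullet style as the template.
for key, value in collect_environment().items():
    print(f"  • {key}: {value if value is not None else 'not found'}")
```

Run it in a notebook cell right before the failing fit() call so that it reports the environment the driver actually uses.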

The error doesn't point to Spark NLP; it looks more like an Apache Spark or Java problem. Without the completed template, I'm afraid I can't guess what is wrong: your code, your dataset, Apache Spark, or Spark NLP.
In addition to all the info requested in the template, please also provide the full code snippet, from the line where you read your dataset with the CoNLL() class through all the processing. You are getting an error in Apache Spark, which means it could originate in any of those lines.