Exception: Java gateway process exited before sending its port number
max-85 opened this issue · 6 comments
Description
Notebook:
import sparknlp
spark = sparknlp.start()
leads to the error "Exception: Java gateway process exited before sending its port number"
Best guess is JAVA_HOME is not set in the docker environment (see jupyter/notebook#743)
Steps to Reproduce
- Just use the docker image and start the notebook; run the cell
Your Environment
Provided docker setup mentioned in this repo
- Spark-NLP version:
- Apache Spark version:
- Operating System and version:
- Deployment (Docker, Jupyter, Scala, pip, conda, etc.):
This is strange; it is already set and working fine:
Line 46 in e1f6e90
What is your OS?
It's Windows 10 64-bit Enterprise, v1809.
I am using Docker Desktop 2.1.0.0 to execute the docker commands.
You are getting this error because the environment variable HADOOP_CONF_DIR has not been set.
Inside a Jupyter notebook, try:
%env HADOOP_CONF_DIR=/{path_to_hadoop}/etc/hadoop
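Outside a notebook, where the %env magic is not available, the same variable can be set from plain Python. This is a minimal sketch, assuming the variable has to be in place before the Spark session starts (the JVM is launched as a child process and inherits the environment at that point). The path is the placeholder from the line above and must be replaced with your actual Hadoop directory:

```python
import os

# Placeholder copied from the suggestion above; replace it with the real path.
hadoop_conf_dir = "/{path_to_hadoop}/etc/hadoop"

# Warn early if the directory does not exist (it will not, until the
# placeholder is replaced).
if not os.path.isdir(hadoop_conf_dir):
    print("Warning: {} is not an existing directory".format(hadoop_conf_dir))

# Set the variable before creating the Spark session so the child JVM sees it.
os.environ["HADOOP_CONF_DIR"] = hadoop_conf_dir
```

Run this before calling sparknlp.start(); setting it afterwards would not affect a JVM that is already running.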
Hi,
I am trying to use the Universal Sentence Encoder (USE) sentence embedding model, and I tried the "ClassifierDL_Train_multi_class_news_category_classifier" sample for testing. However, I get an error on the line "use = UniversalSentenceEncoder.pretrained()". The error is "TypeError: 'JavaPackage' object is not callable".
I set the following line in the Spark config:
.config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.11:2.4.5")
but now I get the error "Exception: Java gateway process exited before sending its port number".
Based on your comments I set my Java and Hadoop environment variables, but that didn't help.
I am working on a Mac and using IntelliJ.
Could you please help me fix this issue?
Thanks
@ntaherkhani Using spark-nlp in Python requires pip install spark-nlp in the same environment. If you are in PyCharm or IntelliJ, you have to make sure the project interpreter points to a valid Python environment in which the package is installed.
To test, you can follow this:
$ conda create -n sparknlp python=3.6 -y
$ conda activate sparknlp
$ pip install spark-nlp==2.4.5 pyspark==2.4.4
Then open a Python console by typing python and follow this example:
# Import Spark NLP
from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp.pretrained import PretrainedPipeline
import sparknlp
# Pipeline comes from Spark ML
from pyspark.ml import Pipeline

# Start Spark Session with Spark NLP
spark = sparknlp.start()

print("Spark NLP version")
sparknlp.version()
print("Apache Spark version")
spark.version

document = DocumentAssembler()\
    .setInputCol("description")\
    .setOutputCol("document")

use = UniversalSentenceEncoder.pretrained()\
    .setInputCols(["document"])\
    .setOutputCol("sentence_embeddings")

# the classes/labels/categories are in category column
classifierdl = ClassifierDLApproach()\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("class")\
    .setLabelColumn("category")\
    .setMaxEpochs(10)\
    .setEnableOutputLogs(True)

pipeline = Pipeline(
    stages = [
        document,
        use,
        classifierdl
    ])
This is not a complete example, but it will surface the relevant errors if there are any. If it passes, then the problem is your IntelliJ setup, which does not have correct access to JAVA_HOME and the right Python environment.
PS: The error "Exception: Java gateway process exited before sending its port number" happens when Java is not version 8, when Java is not on the default PATH, or when the PyPI spark-nlp package is not installed in the same environment.
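The causes listed above can be checked from the same Python interpreter that raises the exception. Here is a rough sketch using only the standard library; the check_environment helper and its report keys are made up for illustration and are not part of Spark NLP or PySpark:

```python
import importlib.util
import os
import shutil
import subprocess


def check_environment():
    """Collect the facts that commonly explain the gateway error."""
    report = {
        "JAVA_HOME": os.environ.get("JAVA_HOME"),
        "java_on_path": shutil.which("java"),
        "pyspark_installed": importlib.util.find_spec("pyspark") is not None,
        "sparknlp_installed": importlib.util.find_spec("sparknlp") is not None,
        "java_version": None,
    }
    if report["java_on_path"]:
        # `java -version` writes its banner to stderr, not stdout.
        result = subprocess.run(
            [report["java_on_path"], "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        report["java_version"] = (result.stderr or result.stdout).strip()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print("{}: {}".format(key, value))
```

If java_on_path is None, or java_version does not report a 1.8 release, Java is the likely culprit. If sparknlp_installed is False, the package is missing from the environment this interpreter uses.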
@ntaherkhani Please follow the instructions and the code above, and create a new issue with all the versions, your setup, the full code, and the full error.