aws-samples/spark-on-aws-lambda

Getting the following error: `"errorMessage": "An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext"`

Mridula-Juluri opened this issue · 3 comments

START RequestId: 62db9b5b-003f-491d-89af-87813284d79b Version: $LATEST
start...................
******* Input path
Warning: Ignoring non-Spark config property: hoodie.meta.sync.client.tool.class
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/03/15 20:31:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[ERROR] 2023-03-15T20:35:59.650Z 62db9b5b-003f-491d-89af-87813284d79b Exception while sending command.
Traceback (most recent call last):
File "/var/lang/lib/python3.8/site-packages/py4j/clientserver.py", line 516, in send_command
raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/lang/lib/python3.8/site-packages/py4j/java_gateway.py", line 1038, in send_command
response = connection.send_command(command)
File "/var/lang/lib/python3.8/site-packages/py4j/clientserver.py", line 539, in send_command
raise Py4JNetworkError(
py4j.protocol.Py4JNetworkError: Error while sending or receiving
[ERROR] Py4JError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext
Traceback (most recent call last):
  File "/var/task/sparkLambdaHandler.py", line 19, in lambda_handler
    spark = SparkSession.builder
  File "/var/lang/lib/python3.8/site-packages/pyspark/sql/session.py", line 269, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/var/lang/lib/python3.8/site-packages/pyspark/context.py", line 483, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/var/lang/lib/python3.8/site-packages/pyspark/context.py", line 197, in __init__
    self._do_init(
  File "/var/lang/lib/python3.8/site-packages/pyspark/context.py", line 282, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "/var/lang/lib/python3.8/site-packages/pyspark/context.py", line 402, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "/var/lang/lib/python3.8/site-packages/py4j/java_gateway.py", line 1585, in __call__
    return_value = get_return_value(
  File "/var/lang/lib/python3.8/site-packages/py4j/protocol.py", line 334, in get_return_value
    raise Py4JError(
END RequestId: 62db9b5b-003f-491d-89af-87813284d79b
REPORT RequestId: 62db9b5b-003f-491d-89af-87813284d79b Duration: 318317.42 ms Billed Duration: 318686 ms Memory Size: 128 MB Max Memory Used: 128 MB Init Duration: 368.48 ms
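One observation on the log above: the REPORT line shows the function used its entire 128 MB allocation (Max Memory Used equals Memory Size), and py4j's "Answer from Java side is empty" typically means the JVM process backing the SparkContext died. A hedged sketch of raising the function's memory with the AWS CLI (`spark-on-lambda` is a placeholder function name, and 3008 MB is an illustrative value, not a project recommendation):

```shell
# Raise the Lambda memory allocation; Spark's driver JVM generally needs
# far more than 128 MB. The function name here is a placeholder.
aws lambda update-function-configuration \
  --function-name spark-on-lambda \
  --memory-size 3008
```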

@Mridula-Juluri Thanks for your interest. Could you please try the develop branch? We made some updates to Java versioning and ran some tests.

Also, please check the wiki page for the latest information on local testing before moving to AWS Lambda. The spark script folder contains additional samples, and you can use additional frameworks like Apache Hudi, Iceberg, and Delta Lake.

Tested the latest version and the error is resolved.