audienceproject/spark-dynamodb

How to import the module in AWS Glue PySpark

Macklon opened this issue · 0 comments

pyspark --packages com.audienceproject:spark-dynamodb_2.11:1.0.3

import com.audienceproject.spark.dynamodb

dynamoDf = spark.read.option("tableName", "SomeTableName").format("dynamodb").load()

dynamoDf.show()

With the connector above, I was able to read and display the data on my local machine.

AWS GLUE
Since AWS Glue uses Spark 2.4, I downloaded spark-dynamodb_2.11-1.0.3.jar, uploaded it to S3, and set its S3 URI in both the Python library path and the dependent JARs path of the job.
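
For anyone setting the job up programmatically rather than in the console: the "Dependent jars path" field corresponds to the Glue special job parameter `--extra-jars`. Below is a minimal sketch of building that argument, for example to pass as `DefaultArguments` when creating the job. The bucket and key are hypothetical placeholders, and `glue_default_arguments` is a helper name made up for illustration. Whether the Python library path is also required for this connector is not established here.

```python
# Hypothetical S3 location of the uploaded connector JAR (placeholder, not from the issue).
CONNECTOR_JAR_URI = "s3://my-bucket/jars/spark-dynamodb_2.11-1.0.3.jar"


def glue_default_arguments(*jar_uris):
    """Build Glue job arguments that add the given JARs to the job.

    `--extra-jars` is the Glue special parameter behind the console's
    "Dependent jars path" field; multiple URIs are comma-separated.
    """
    return {"--extra-jars": ",".join(jar_uris)}


args = glue_default_arguments(CONNECTOR_JAR_URI)
print(args)
```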

I also had to define the schema explicitly, because without it the read started failing with an error saying that a string cannot be cast to UTF8String.
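
A minimal sketch of what defining the schema can look like, assuming the connector honors a schema supplied through `DataFrameReader.schema`. The column names and types are hypothetical and must be replaced with the attributes of the real table. `ddl_schema` and `read_dynamo_table` are illustrative helper names, not part of the connector.

```python
# Hypothetical column names and types; replace with the real table's attributes.
COLUMNS = [("id", "STRING"), ("name", "STRING"), ("age", "INT")]


def ddl_schema(columns):
    """Turn (name, spark_sql_type) pairs into a DDL-formatted schema string."""
    return ", ".join("{} {}".format(name, sql_type) for name, sql_type in columns)


def read_dynamo_table(spark, table_name, columns):
    """Read a DynamoDB table with an explicit schema instead of an inferred one.

    `spark` is an existing SparkSession (in a Glue job, typically
    glueContext.spark_session).
    """
    return (
        spark.read
        .schema(ddl_schema(columns))  # DataFrameReader.schema accepts a DDL string
        .option("tableName", table_name)
        .format("dynamodb")
        .load()
    )


print(ddl_schema(COLUMNS))  # id STRING, name STRING, age INT
```

Usage would then be `dynamoDf = read_dynamo_table(spark, "SomeTableName", COLUMNS)` followed by `dynamoDf.show()`.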