How to import the module in AWS Glue PySpark
Macklon opened this issue · 0 comments
Macklon commented
pyspark --packages com.audienceproject:spark-dynamodb_2.11:1.0.3

# No Python-side import is needed; the connector is resolved via format("dynamodb")
dynamoDf = spark.read.option("tableName", "SomeTableName").format("dynamodb").load()
dynamoDf.show()
Using the above, I was able to read the table and display its data on my local machine.
AWS Glue
Since AWS Glue uses Spark 2.4, I downloaded spark-dynamodb_2.11-1.0.3.jar, uploaded it to S3, and referenced the S3 URI in the job's Python library path and dependent JARs path.
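One way to attach the jar to a Glue job is the `--extra-jars` special job parameter, passed in the job's default arguments. The sketch below is an assumption about how this could be wired up (the job name, role, and S3 paths are placeholders, not the reporter's actual values):

```python
# Hypothetical Glue job update payload attaching the connector jar.
# All names and S3 paths below are placeholders.
job_update = {
    "Role": "MyGlueServiceRole",
    "Command": {
        "Name": "glueetl",
        "ScriptLocation": "s3://my-bucket/scripts/dynamodb_job.py",
    },
    "DefaultArguments": {
        # Glue's special parameter for putting extra jars on the Spark classpath
        "--extra-jars": "s3://my-bucket/jars/spark-dynamodb_2.11-1.0.3.jar",
    },
}

# Applying it would look like (requires AWS credentials, so left commented out):
# import boto3
# boto3.client("glue").update_job(JobName="my-dynamodb-job", JobUpdate=job_update)
```

Compared with only listing the jar in the console's "Dependent jars path" field, setting `--extra-jars` in the default arguments keeps the dependency versioned alongside the job definition.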
I had to define the schema explicitly, because otherwise the read failed with an error saying a value could not be cast to UTF8String.