audienceproject/spark-dynamodb

AWS token can expire during job

Hi! I ran a fairly lengthy job doing a full scan on a large table (the job lasted 4 hours, but I think the first failure happened after about one hour) and got this:

com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The security token included in the request is expired (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ExpiredTokenException; Request ID: OID2MIEVCRSJLJ671O7R50I44VVV4KQNSO5AEMVJF66Q9ASUAAJG)
Stack trace
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:4805)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:4772)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeScan(AmazonDynamoDBClient.java:3031)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.scan(AmazonDynamoDBClient.java:2997)
at com.amazonaws.services.dynamodbv2.document.internal.ScanPage.nextPage(ScanPage.java:96)
at com.amazonaws.services.dynamodbv2.document.internal.PageIterator.next(PageIterator.java:47)
at com.amazonaws.services.dynamodbv2.document.internal.PageIterator.next(PageIterator.java:25)
at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:44)
at com.audienceproject.spark.dynamodb.datasource.DynamoReaderFactory$ScanPartitionReader.nextPage(DynamoReaderFactory.scala:87)
at com.audienceproject.spark.dynamodb.datasource.DynamoReaderFactory$ScanPartitionReader.next(DynamoReaderFactory.scala:69)
at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:79)
at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:112)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:757)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:488)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$2(FileFormatWriter.scala:346)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1677)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:355)
... 19 more

What I think is happening is that a session token is generated only on the initial scan call (when the DynamoDB client is acquired) and never renewed, hence the expiration partway through the scan.
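For context, here is a minimal sketch of the pattern I suspect, not the connector's actual code; the credential values are placeholders I made up:

```scala
import com.amazonaws.auth.{AWSStaticCredentialsProvider, BasicSessionCredentials}
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder

object StaticTokenExample {
  // Placeholder values, purely for illustration.
  val accessKey    = "AKIA..."
  val secretKey    = "..."
  val sessionToken = "..."

  // If session credentials are resolved once and wrapped in a static provider
  // like this, the client keeps sending the same token for its whole lifetime,
  // and DynamoDB starts returning ExpiredTokenException once the token lapses.
  val frozenToken = new BasicSessionCredentials(accessKey, secretKey, sessionToken)
  val client = AmazonDynamoDBClientBuilder.standard()
    .withCredentials(new AWSStaticCredentialsProvider(frozenToken))
    .build()
}
```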

A possible solution would be some sort of background task that periodically calls .refresh() on the credentials provider - I assume this is not happening currently. Something along the lines of the sketch below. What do you think?
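Rough sketch of what I had in mind (the helper name and interval are made up by me, this is not an existing API in the project):

```scala
import java.util.concurrent.{Executors, TimeUnit}
import com.amazonaws.auth.AWSCredentialsProvider

object CredentialRefresher {
  // Hypothetical helper, not part of spark-dynamodb: refresh the provider in
  // the background so long-running scan partitions never hold an expired token.
  def scheduleTokenRefresh(provider: AWSCredentialsProvider, intervalMinutes: Long = 15): Unit = {
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    scheduler.scheduleAtFixedRate(
      new Runnable { override def run(): Unit = provider.refresh() },
      intervalMinutes, intervalMinutes, TimeUnit.MINUTES
    )
  }
}
```

Alternatively, one of the AWS SDK's self-refreshing providers (e.g. STSAssumeRoleSessionCredentialsProvider) renews the session token on its own, which might side-step the need for an explicit background task.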