ParseError exceptions don't retry
Closed this issue · 5 comments
sihil commented
We've seen a few occasions where a describeAutoScalingGroups API call fails with a ParseError while the SDK is reading the XML response. We now have the exception (below) that causes this, so we should be able to write better retry handling. See #409 for further details.
We originally thought that an IOException was causing this, but that's not true: it's a javax.xml.stream.XMLStreamException.
magenta.FailException: Unhandled exception in task SuspendAlarmNotifications Suspending Alarm Notifications - group will no longer scale on any configured alarms
at magenta.DeployReporter$.magenta$DeployReporter$$failException(logging.scala:128)
at magenta.DeployReporter$.failException(logging.scala:131)
at magenta.DeployReporter$.withFailureHandling(logging.scala:106)
at magenta.DeployReporter$.magenta$DeployReporter$$sendContext(logging.scala:118)
at magenta.DeployReporter.taskContext(logging.scala:33)
at deployment.actors.TasksRunner$$anonfun$receive$1$$anonfun$applyOrElse$1$$anonfun$apply$2.apply(TasksRunner.scala:36)
at deployment.actors.TasksRunner$$anonfun$receive$1$$anonfun$applyOrElse$1$$anonfun$apply$2.apply(TasksRunner.scala:27)
at scala.collection.immutable.List.foreach(List.scala:381)
at deployment.actors.TasksRunner$$anonfun$receive$1$$anonfun$applyOrElse$1.apply(TasksRunner.scala:27)
at deployment.actors.TasksRunner$$anonfun$receive$1$$anonfun$applyOrElse$1.apply(TasksRunner.scala:25)
at magenta.DeployReporter$$anonfun$magenta$DeployReporter$$sendContext$1.apply(logging.scala:119)
at magenta.DeployReporter$$anonfun$magenta$DeployReporter$$sendContext$1.apply(logging.scala:118)
at magenta.DeployReporter$.withFailureHandling(logging.scala:98)
at magenta.DeployReporter$.magenta$DeployReporter$$sendContext(logging.scala:118)
at magenta.DeployReporter.infoContext(logging.scala:38)
at deployment.actors.TasksRunner$$anonfun$receive$1.applyOrElse(TasksRunner.scala:25)
at akka.actor.Actor$class.aroundReceive(Actor.scala:484)
at deployment.actors.TasksRunner.aroundReceive(TasksRunner.scala:12)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
Caused by: com.amazonaws.AmazonClientException: Unable to unmarshall response (ParseError at [row,col]:[2841,5]
Message: Read timed out). Response Code: 200, Response Text: OK
at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:1305)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:908)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:715)
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:466)
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:427)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:376)
at com.amazonaws.services.autoscaling.AmazonAutoScalingClient.doInvoke(AmazonAutoScalingClient.java:3422)
at com.amazonaws.services.autoscaling.AmazonAutoScalingClient.invoke(AmazonAutoScalingClient.java:3392)
at com.amazonaws.services.autoscaling.AmazonAutoScalingClient.describeAutoScalingGroups(AmazonAutoScalingClient.java:1280)
at magenta.tasks.ASG$.listAutoScalingGroups$1(AWS.scala:176)
at magenta.tasks.ASG$.groupForAppAndStage(AWS.scala:184)
at magenta.tasks.ASGTask$class.execute(ASGTasks.scala:150)
at magenta.tasks.SuspendAlarmNotifications.execute(ASGTasks.scala:122)
at deployment.actors.TasksRunner$$anonfun$receive$1$$anonfun$applyOrElse$1$$anonfun$apply$2$$anonfun$apply$4.apply(TasksRunner.scala:37)
at deployment.actors.TasksRunner$$anonfun$receive$1$$anonfun$applyOrElse$1$$anonfun$apply$2$$anonfun$apply$4.apply(TasksRunner.scala:36)
at magenta.DeployReporter$$anonfun$magenta$DeployReporter$$sendContext$1.apply(logging.scala:119)
at magenta.DeployReporter$$anonfun$magenta$DeployReporter$$sendContext$1.apply(logging.scala:118)
at magenta.DeployReporter$.withFailureHandling(logging.scala:98)
at magenta.DeployReporter$.magenta$DeployReporter$$sendContext(logging.scala:118)
at magenta.DeployReporter.taskContext(logging.scala:33)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[2841,5]
Message: Read timed out
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:591)
at com.sun.xml.internal.stream.XMLEventReaderImpl.peek(XMLEventReaderImpl.java:276)
at com.amazonaws.transform.StaxUnmarshallerContext.nextEvent(StaxUnmarshallerContext.java:220)
at com.amazonaws.services.autoscaling.model.transform.AutoScalingGroupStaxUnmarshaller.unmarshall(AutoScalingGroupStaxUnmarshaller.java:46)
at com.amazonaws.services.autoscaling.model.transform.DescribeAutoScalingGroupsResultStaxUnmarshaller.unmarshall(DescribeAutoScalingGroupsResultStaxUnmarshaller.java:56)
at com.amazonaws.services.autoscaling.model.transform.DescribeAutoScalingGroupsResultStaxUnmarshaller.unmarshall(DescribeAutoScalingGroupsResultStaxUnmarshaller.java:33)
at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:101)
at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:43)
at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:1260)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:908)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:715)
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:466)
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:427)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:376)
at com.amazonaws.services.autoscaling.AmazonAutoScalingClient.doInvoke(AmazonAutoScalingClient.java:3422)
at com.amazonaws.services.autoscaling.AmazonAutoScalingClient.invoke(AmazonAutoScalingClient.java:3392)
at com.amazonaws.services.autoscaling.AmazonAutoScalingClient.describeAutoScalingGroups(AmazonAutoScalingClient.java:1280)
at magenta.tasks.ASG$.listAutoScalingGroups$1(AWS.scala:176)
at magenta.tasks.ASG$.groupForAppAndStage(AWS.scala:184)
at magenta.tasks.ASGTask$class.execute(ASGTasks.scala:150)
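As a starting point for the retry handling, here is a minimal sketch. It assumes the failure always surfaces as it does in the trace above, with the XMLStreamException wrapped as a cause of the exception that reaches our code. RetryOnParseError, causedByXmlStreamException, retry, maxAttempts and delayMillis are hypothetical names for illustration, not existing magenta or AWS SDK APIs, and the sketch deliberately avoids any SDK types so it runs on the JDK and Scala standard library alone.

```scala
import javax.xml.stream.XMLStreamException
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

// Hypothetical helper, not part of magenta or the AWS SDK.
object RetryOnParseError {

  // True if an XMLStreamException appears anywhere in the cause chain.
  // The depth limit guards against pathological cyclic cause chains.
  @tailrec
  def causedByXmlStreamException(t: Throwable, depth: Int = 0): Boolean =
    t match {
      case null                  => false
      case _ if depth > 20       => false
      case _: XMLStreamException => true
      case other                 => causedByXmlStreamException(other.getCause, depth + 1)
    }

  // Runs `call` up to `maxAttempts` times, retrying only when the failure
  // was caused by an XMLStreamException. Any other failure is rethrown at once.
  def retry[A](maxAttempts: Int, delayMillis: Long = 0L)(call: => A): A =
    Try(call) match {
      case Success(result) => result
      case Failure(e) if maxAttempts > 1 && causedByXmlStreamException(e) =>
        if (delayMillis > 0) Thread.sleep(delayMillis)
        retry(maxAttempts - 1, delayMillis)(call)
      case Failure(e) => throw e
    }
}
```

The describeAutoScalingGroups call in ASG.listAutoScalingGroups could then be wrapped as retry(3) { ... }. Since that call only reads state, repeating it should be safe. Matching on the cause type rather than on the "ParseError" message text is a design choice here, so that a change in the SDK's wording would not silently disable the retry.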
sihil commented
This is a corresponding issue on the SDK: aws/aws-sdk-java#892
jfsoul commented
It looks as though this may have happened again today https://riffraff.gutools.co.uk/deployment/view/4161403a-afb7-4f0c-b65c-a5f8633be50c