Custom receive timeout for nack with SQS using JMS
enVolt opened this issue ยท 16 comments
The behavior of negative acknowledgment is to change the visibility timeout of a received message to 0. Where the value of NACK_TIMEOUT is not configurable while creating the SQS Factory for JMS.
When a message is received, and the processing fails (Listener method throws an error), the message is immediately received again. In most of the cases, the message can be processed with a certain delay.
Is it possible to configure it to not change the visibility timeout, so it respects the Queue's default Receive Timeout configuration?
PS - I originally asked on Stackoverflow
Anyone?
I am having the same issue. This solution or the ability to use the SQS Default Visibility Timeout would be nice.
Same here, immediate re-delivery on NACK results in message quickly ending up in dead letter
We are having the same problem. Using the default that is set on sqs seems to be better behavior.
This seems to be done here : #72 but the pull request is closed
Same here. New news on this?
Workaround on stackoverflow https://stackoverflow.com/a/55187272/1079659
Thanks for the workaround @cmyker! Downside is it affects all change visibility requests... but it solves my immediate issue. +1 for incorporating an option into the library
See my answer here, I believe this is the most ideal way to keep visibilityTimeout
set on the queue itself if you're using Spring JMS.
This default behavior of the library is pretty strange though. If you were going to set it to 0 for every failed message by default, then why do we have visibilityTimeout
attribute on the queue itself in the first place?
For those who use spring-jms: I have just tested the @sedooe solution in my local environment with an elasticmq, and it worked.
public class CustomJmsListenerContainerFactory extends DefaultJmsListenerContainerFactory {
@Override
protected DefaultMessageListenerContainer createContainerInstance() {
return new CustomMessageListenerContainer();
}
}
public class CustomMessageListenerContainer extends DefaultMessageListenerContainer {
public CustomMessageListenerContainer() {
super();
}
@Override
protected void rollbackOnExceptionIfNecessary(Session session, Throwable ex) throws JMSException {
// do nothing
}
@Override
protected void rollbackIfNecessary(Session session) throws JMSException {
super.rollbackIfNecessary(session);
}
}
And then:
CustomJmsListenerContainerFactory factory = new CustomJmsListenerContainerFactory();
I didn't like it too much, but it solves my problem.
I think spring have much to learn with Akka/Alpakka, when it comes to messaging
I think spring have much to learn with Akka/Alpakka, when it comes to messaging
I'm glad it worked for you but this strange behavior is coming from this amazon-sqs-java-messaging-lib and has nothing to do with Spring. ๐
I think spring have much to learn with Akka/Alpakka, when it comes to messaging
https://doc.akka.io/docs/alpakka/current/sqs.htmlI'm glad it worked for you but this strange behavior is coming from this amazon-sqs-java-messaging-lib and has nothing to do with Spring. ๐
Sorry, my mistake
I was so frustrated that i wasn't thinking straight haha
More than 2 years later still having to use the workaround :)?
This default behavior of the library is pretty strange though. If you were going to set it to 0 for every failed message by default, then why do we have
visibilityTimeout
attribute on the queue itself in the first place?
this is exactly what is perplexing us.
we actually have been relying on the visibilityTimeout to give us a simple backoff strategy, only today have we realised that if there is an error downstream or in handling a particular message, the result of this forcing to no wait is that it will attempt to DOS the downstream service.
at first and second looks this is very wrong :(
I too am saddened that there is no fix for this yet. We process messages which involve calls to external apis which have rate limiting. when messages are retried due to rate limit errors, the immediate retries only exacerbate the problem and increase the chances that the message will quickly exhaust all the retries and fail completely.
i investigated some of the options above. @sedooe's solution of overriding the rollbackOnExceptionIfNecessary()
method is a fairly simple solution, however, when i looked into the the recover()
method which is skipped, i noticed that the unacked messages won't be cleaned out. i'm a little worried that that could cause issues over time.
ultimately, we are forced to go the route of forking the library and directly modifying the negative ack logic for now. would love to see this issue fixed.
here's a somewhat hackey solution that works by creating an ExecutionInterceptor, and it doesn't rely on the queue's message visibility expiring, and you can configure what's essentially a fixed backoff with a value of your choice
SqsClientBuilder clientBuilder = SqsClient.builder()
.overrideConfiguration(ClientOverrideConfiguration.builder()
.addExecutionInterceptor(new ExecutionInterceptor() {
@Override
public SdkRequest modifyRequest(Context.ModifyRequest context,
ExecutionAttributes executionAttributes) {
if (context.request() instanceof ChangeMessageVisibilityBatchRequest request) {
List<ChangeMessageVisibilityBatchRequestEntry> newEntries = request.entries()
.stream()
.map(entry -> entry.toBuilder()
.visibilityTimeout(sqsProps.getFixedBackoffSeconds())
.build())
.toList();
return request.toBuilder().entries(newEntries).build();
}
return context.request();
}
})
.build())
.region(Region.US_WEST_2);
by default the aws jms sdk will set the visibility timeout to 0 when there's an exception processing a message, this allows us to override 0 to some other value to delay the retry