DynamoDbLock.tryLock() has no chance to acquire an abandoned lock
GarryIV opened this issue · 2 comments
In what version(s) of Spring Integration AWS are you seeing this issue?
>= 2.5.2
Describe the bug
DynamoDbLock.tryLock()
has no chance to acquire an abandoned lock no matter how many times you will try. It worked before #205 but with this fix abandoned locks can't re-acquired using DynamoDbLock.tryLock()
To Reproduce
- create a DynamoDB lock table with a lock (in AWS or in local stack)
- create a DynamoDbLockRegistry with the params that match the table/lock you created previously and try to lock it using 'tryLock()'
@Bean
public DynamoDbLockRegistry dynamoDbLockRegistry(AmazonDynamoDB amazonDynamoDb) {
var lockClientOptions = dynamoDBLockClientOptionsBuilder(amazonDynamoDb).build();
return new DynamoDbLockRegistry(new AmazonDynamoDBLockClient(lockClientOptions));
}
@Bean
public CommandLineRunner runner(AmazonDynamoDB amazonDynamoDb, DynamoDbLockRegistry dynamoDbLockRegistry) {
return args -> {
createAbandonedLock(amazonDynamoDb);
logger.info("Trying to acquire the abandoned lock");
var lock = dynamoDbLockRegistry.obtain(PARTITION_KEY);
while (!lock.tryLock()) {
logger.info("Failed to lock");
Thread.sleep(1000);
}
logger.info("Lock is acquired");
lock.unlock();
logger.info("Done");
};
}
lock.tryLock
cycle will never end.
Expected behavior
Lock is acquired if we try more than lock's lease duration time.
Sample
I've created a sample to reproduce it https://github.com/GarryIV/kinesis-lock-bug
The code that reproduces the problem is here
To reproduce it run
docker-compose up -d
./gradlew bootRun
It's expected that "Lock is acquired
is printed after 20+ seconds bit you will see endless Failed to lock
messages.
NB: We faced this while we were trying to use spring-integration-aws:2.5.2 with the latest spring-cloud-stream-binder-kinesis
-org.springframework.cloud:spring-cloud-stream-binder-kinesis:2.2.0
.
The binder is trying to acquire a lock with no chance to success. Only way to start it is to delete lock record manually.
I also included a sample use case for it, pls check the readme.
Thank you for sharing this with us!
Unfortunately I cannot start your application:
- Even if you marked it with a
@Primary
in theLocalStackConfiguration
, the other bean is still created and it fails with this error:
Caused by: com.amazonaws.SdkClientException: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
at com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:462) ~[aws-java-sdk-core-1.12.266.jar:na]
at com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424) ~[aws-java-sdk-core-1.12.266.jar:na]
at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46) ~[aws-java-sdk-core-1.12.266.jar:na]
at com.example.lockbug.DynamoDbLockBugConfiguration.amazonDynamoDB(DynamoDbLockBugConfiguration.java:40) ~[main/:na]
- When I remove that bean in the
DynamoDbLockBugConfiguration
, I fail with this error:
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'amazonS3': Invocation of init method failed; nested exception is com.amazonaws.SdkClientException: Unable to load region information from any provider in the chain
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1804) ~[spring-beans-5.3.24.jar:5.3.24]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:620) ~[spring-beans-5.3.24.jar:5.3.24]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542) ~[spring-beans-5.3.24.jar:5.3.24]
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335) ~[spring-beans-5.3.24.jar:5.3.24]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234) ~[spring-beans-5.3.24.jar:5.3.24]
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333) ~[spring-beans-5.3.24.jar:5.3.24]
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208) ~[spring-beans-5.3.24.jar:5.3.24]
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:936) ~[spring-beans-5.3.24.jar:5.3.24]
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918) ~[spring-context-5.3.24.jar:5.3.24]
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583) ~[spring-context-5.3.24.jar:5.3.24]
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:731) ~[spring-boot-2.7.7.jar:2.7.7]
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:408) ~[spring-boot-2.7.7.jar:2.7.7]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:307) ~[spring-boot-2.7.7.jar:2.7.7]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1303) ~[spring-boot-2.7.7.jar:2.7.7]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1292) ~[spring-boot-2.7.7.jar:2.7.7]
at com.example.lockbug.LockBugApplication.main(LockBugApplication.java:10) ~[main/:na]
Caused by: com.amazonaws.SdkClientException: Unable to load region information from any provider in the chain
at com.amazonaws.regions.AwsRegionProviderChain.getRegion(AwsRegionProviderChain.java:59) ~[aws-java-sdk-core-1.12.266.jar:na]
at io.awspring.cloud.core.region.DefaultAwsRegionProviderChainDelegate.getRegion(DefaultAwsRegionProviderChainDelegate.java:37) ~[spring-cloud-aws-core-2.4.2.jar:2.4.2]
at io.awspring.cloud.core.config.AmazonWebserviceClientFactoryBean.createInstance(AmazonWebserviceClientFactoryBean.java:130) ~[spring-cloud-aws-core-2.4.2.jar:2.4.2]
at io.awspring.cloud.core.config.AmazonWebserviceClientFactoryBean.createInstance(AmazonWebserviceClientFactoryBean.java:52) ~[spring-cloud-aws-core-2.4.2.jar:2.4.2]
at org.springframework.beans.factory.config.AbstractFactoryBean.afterPropertiesSet(AbstractFactoryBean.java:142) ~[spring-beans-5.3.24.jar:5.3.24]
Apparently some other AWS services are trying to be auto-configured and you probably have some AWS SDK configuration locally in the ~/.aws/credentials
.
Anyway I have a code from you so I'll try to reproduce it and play with from JUnit test where I use Testcontainers for LocalStack instead.
Will come back with some finding a bit later...
OK. So, I see (or at least guess) what is going on.
This logic in the AmazonDynamoDBLockClient
has no sense:
if (options.shouldSkipBlockingWait() && existingLock.isPresent() && !existingLock.get().isExpired()) {
That isExpired()
is always true
:
return LockClientUtils.INSTANCE.millisecondTime() - this.lookupTime.get() > this.leaseDuration.get();
where that this.lookupTime
is set exactly as a LockClientUtils.INSTANCE.millisecondTime()
in the getLockFromDynamoDB(getLockOptions)
method call from the beginning of the while (true)
loop in the acquireLock()
.
So, no matter how much we try to wait in between lock.tryLock()
, that isExpired()
is always false
when we do shouldSkipBlockingWait()
.
I will try to find some workaround (at least restore the previous behavior), but it feel like a bug in the DynamoDB Lock Client library by itself.
There is no reasonable lookupTime
to make an item to expire in time for our call when we don't want to wait.
And therefore we end up with an impression that it never expires.