exasol/small-json-files-test-fixture

Read timeout when waiting for Lambda to finish

Closed this issue · 1 comments

Creating 1,000,000 files fails when running in a GitHub action:

2023-04-28 09:56:43.350 [INFO   ] Starting test S3DocumentFilesAdapterIT.testManySmallJsonFiles(TestInfo)... 
2023-04-28 09:56:45.162 [INFO   ] Actual setup TestSetupDescription [hashOfLambdaFunction=d9d6bb2e44c20a2cf1d9c5a8027f970e396c95367e4b4f13da1d0c75bf6ec8a6, numberOfFiles=1000000] does not match requested configuration TestSetupDescription [hashOfLambdaFunction=efac338ae0dd4bf9da2e6715b82f25d1c0c5c82aa83d316c981edae686492a5e, numberOfFiles=1000000] 
2023-04-28 09:56:45.838 [INFO   ] Creating role 'create-json-files-1682675805836-role'... 
2023-04-28 09:56:46.747 [INFO   ] Waiting PT30S for role 'create-json-files-1682675805836-role' being fully created... 
2023-04-28 09:57:17.912 [INFO   ] Waiting PT5S for lambda 'create-json-files-1682675805836' being fully created... 
2023-04-28 09:57:22.913 [INFO   ] Deleting small-json-files test setup from bucket persistent-small-json-files-test-fixture... 
2023-04-28 10:15:13.926 [INFO   ] Delete done in PT17M51.012603S 
2023-04-28 10:15:13.929 [INFO   ] Creating 1,000,000 files using 50 lambda functions in bucket persistent-small-json-files-test-fixture... 
2023-04-28 10:44:49.015 [SEVERE ] Lambda #29 {"bucket":"persistent-small-json-files-test-fixture","offset":580000,"numberOfFiles":20000,"prefix":"test-data-","action":"create"} failed :software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: The connection was closed during the request. The request will usually succeed on a retry, but if it does not: consider disabling any proxies you have configured, enabling debug logging, or performing a TCP dump to identify the root cause. If this is a streaming operation, validate that data is being read or written in a timely manner. Channel Information: ChannelDiagnostics(channel=[id: 0xd6feb7c3, L:/10.1.0.39:44348 - R:lambda.eu-central-1.amazonaws.com/3.121.178.242:443], channelAge=PT4M31.196378S, requestCount=1) 
...
 2023-04-28 10:51:46.420 [SEVERE ] Lambda #49 {"bucket":"persistent-small-json-files-test-fixture","offset":980000,"numberOfFiles":20000,"prefix":"test-data-","action":"create"} failed :software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Read timed out 
2023-04-28 10:51:46.459 [SEVERE ] Lambda #41 {"bucket":"persistent-small-json-files-test-fixture","offset":820000,"numberOfFiles":20000,"prefix":"test-data-","action":"create"} failed :software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Read timed out 
2023-04-28 10:51:46.685 [SEVERE ] Lambda #16 {"bucket":"persistent-small-json-files-test-fixture","offset":320000,"numberOfFiles":20000,"prefix":"test-data-","action":"create"} failed :software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: The connection was closed during the request. The request will usually succeed on a retry, but if it does not: consider disabling any proxies you have configured, enabling debug logging, or performing a TCP dump to identify the root cause. If this is a streaming operation, validate that data is being read or written in a timely manner. Channel Information: ChannelDiagnostics(channel=[id: 0xf12c8c70, L:/10.1.0.39:48326 - R:lambda.eu-central-1.amazonaws.com/3.121.179.234:443], channelAge=PT4M31.603114S, requestCount=1) 
...
2023-04-28 10:51:46.887 [SEVERE ] Lambda #26 {"bucket":"persistent-small-json-files-test-fixture","offset":520000,"numberOfFiles":20000,"prefix":"test-data-","action":"create"} failed :software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: The connection was closed during the request. The request will usually succeed on a retry, but if it does not: consider disabling any proxies you have configured, enabling debug logging, or performing a TCP dump to identify the root cause. If this is a streaming operation, validate that data is being read or written in a timely manner. Channel Information: ChannelDiagnostics(channel=[id: 0x9e7a13e1, L:/10.1.0.39:48402 - R:lambda.eu-central-1.amazonaws.com/3.121.179.234:443], channelAge=PT4M31.628093S, requestCount=1) 
2023-04-28 10:51:47.324 [SEVERE ] Lambda #1 {"bucket":"persistent-small-json-files-test-fixture","offset":20000,"numberOfFiles":20000,"prefix":"test-data-","action":"create"} failed :software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Read timed out 
...
2023-04-28 11:03:15.478 [SEVERE ] Lambda #9 {"bucket":"persistent-small-json-files-test-fixture","offset":180000,"numberOfFiles":20000,"prefix":"test-data-","action":"create"} failed :software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Read timed out 

Error:  com.exasol.adapter.document.files.S3DocumentFilesAdapterIT.testManySmallJsonFiles(TestInfo)  Time elapsed: 3,994.43 s  <<< ERROR!
java.lang.IllegalStateException: One or more lambda functions failed.
	at com.exasol.smalljsonfilesfixture.S3TestSetupLambdaController.waitForLambdasToFinish(S3TestSetupLambdaController.java:279)
	at com.exasol.smalljsonfilesfixture.S3TestSetupLambdaController.runLambdas(S3TestSetupLambdaController.java:248)
	at com.exasol.smalljsonfilesfixture.S3TestSetupLambdaController.createFiles(S3TestSetupLambdaController.java:95)
	at com.exasol.smalljsonfilesfixture.SmallJsonFilesTestSetup.setup(SmallJsonFilesTestSetup.java:45)
	at com.exasol.adapter.document.files.S3DocumentFilesAdapterIT.createTestSetupWithSmallJsonFiles(S3DocumentFilesAdapterIT.java:217)
	at com.exasol.adapter.document.files.S3DocumentFilesAdapterIT.testManySmallJsonFiles(S3DocumentFilesAdapterIT.java:152)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
	at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
	at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:325)
	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:317)
	at io.netty.handler.timeout.ReadTimeoutHandler.readTimedOut(ReadTimeoutHandler.java:98)
	at io.netty.handler.timeout.ReadTimeoutHandler.channelIdle(ReadTimeoutHandler.java:90)
	at io.netty.handler.timeout.IdleStateHandler$ReaderIdleTimeoutTask.run(IdleStateHandler.java:503)
	at io.netty.handler.timeout.IdleStateHandler$AbstractIdleTask.run(IdleStateHandler.java:475)
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	... 1 more
Caused by: io.netty.handler.timeout.ReadTimeoutException

See complete log at https://github.com/exasol/s3-document-files-virtual-schema/actions/runs/4828138085/jobs/8602023931

Workaround

Run the following code locally:

        final int count = 1_000_000;
        final int countPerLambda = 20_000;
        new SmallJsonFilesTestSetup().setup(Map.of("exa:project", "<project>", "exa:owner", "<email>"),
                "<bucket>", ProfileCredentialsProvider.create("<AWS account>"), count,
                countPerLambda);

Possible root cause: log shows, that timeout occurs after after 4m31: channelAge=PT4M31.196378S