StartAsync() hangs when using Windows with wsl2 container
srollinet opened this issue · 8 comments
Hi,
I try to use TestContainers.Container.Database.MsSql on a Windows system with DockerDesktop running with wsl2 engine.
When calling StartAsync
the call blocks then times out. The container is started, but it seems that the application still wait for something that does not happen.
Sometimes I have a timeout, sometimes I have an exception saying that the container is not found
Docker.DotNet.DockerContainerNotFoundException
Docker API responded with status code=NotFound, response={"message":"No such container: d63a51529be2fa36b01e69043ea31909c21f3e2c7a87fa6f836f03808954a4be"}
with docker ps, I see the container
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d63a51529be2 quay.io/testcontainers/ryuk:0.2.3 "/app" 28 seconds ago Up 27 seconds 0.0.0.0:49159->8080/tcp, :::49159->8080/tcp inspiring_poitras
First, I thought that it was an issue with ryuk, so I tried to disable it via REAPER_DISABLED
env variable, but same thing happens with the mssql server container
I also tried to update to the last version of Docker.DotNet
but I still have the same problem
My env is
Windows 10 pro 21H1
Docker desktop 4.0.1
.Net 5.0.40
TestContainers.Container.Database.MsSql 1.5.0
Docker.DotNet 3.125.5
The container configuration is
new ContainerBuilder<MsSqlContainer>()
.ConfigureDockerImageName("mcr.microsoft.com/mssql/server:2019-latest")
.ConfigureDatabaseConfiguration("not-used", "yourStrong(!)Password", "not-used")
.Build()
Edit: the logs say that the container has started
dbug: TestContainers.Container.Abstractions.DockerClient.DockerClientFactory[0]
Testing provider: NpipeDockerClientProvider
dbug: TestContainers.Container.Abstractions.DockerClient.DockerClientFactory[0]
Provider[NpipeDockerClientProvider] found
local npipe: [npipe://./pipe/docker_engine]
dbug: TestContainers.Container.Abstractions.DockerClient.DockerClientFactory[0]
Testing provider: NpipeDockerClientProvider
dbug: TestContainers.Container.Abstractions.DockerClient.DockerClientFactory[0]
Provider[NpipeDockerClientProvider] found
local npipe: [npipe://./pipe/docker_engine]
dbug: TestContainers.Container.Database.MsSql.MsSqlContainer[0]
Adding session labels to generic container: 1b9e7108-34b0-4274-9045-3ec9d00a5620
dbug: TestContainers.Container.Database.MsSql.MsSqlContainer[0]
Starting reaper ...
dbug: TestContainers.Container.Abstractions.Reaper.ResourceReaper[0]
Starting ryuk container ...
dbug: TestContainers.Container.Abstractions.Images.GenericImage[0]
Image already exists, not pulling: quay.io/testcontainers/ryuk:0.2.3
dbug: TestContainers.Container.Abstractions.Reaper.RyukContainer[0]
Creating container for image: quay.io/testcontainers/ryuk:0.2.3
dbug: TestContainers.Container.Abstractions.Reaper.RyukContainer[0]
Starting container with id: d933bcad92376dd67c89675e3c3883df855885de00e31e2244a9eaae750c546c
dbug: TestContainers.Container.Abstractions.Reaper.RyukContainer[0]
Container quay.io/testcontainers/ryuk:0.2.3 startup complete
fail: TestContainers.Container.Abstractions.Reaper.RyukContainer[0]
Unable to start container services: quay.io/testcontainers/ryuk:0.2.3
TestContainers.Container.Abstractions.Exceptions.ContainerLaunchException: Container.Abstractions startup failed
---> Polly.Timeout.TimeoutRejectedException: The delegate executed asynchronously through TimeoutPolicy did not complete within the timeout.
---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates
1 shouldRetryResultPredicates, Func5 onRetryAsync, Int32 pe rmittedRetryCount, IEnumerable
1 sleepDurationsEnumerable, Func4 sleepDurationProvider, Boolean continueOnCapturedContext) at Polly.AsyncPolicy.ExecuteAsync(Func
3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext)
at Polly.Wrap.AsyncPolicyWrapEngine.<>c__DisplayClass4_0.<b__0>d.MoveNext()
--- End of stack trace from previous location ---
at Polly.AsyncPolicy.<>c__DisplayClass40_0.<b__0>d.MoveNext()
--- End of stack trace from previous location ---
at Polly.Timeout.AsyncTimeoutEngine.ImplementationAsync[TResult](Func3 action, Context context, CancellationToken cancellationToken, Func
2 timeoutProvider, TimeoutStrategy timeoutStrategy, Func5 onTimeoutAsync, Boolean continueOnCapturedContext) --- End of inner exception stack trace --- at Polly.Timeout.AsyncTimeoutEngine.ImplementationAsync[TResult](Func
3 action, Context context, CancellationToken cancellationToken, Func2 timeoutProvider, TimeoutStrategy timeoutStrategy, Func
5 onTimeoutAsync, Boolean continueOnCapturedContext)
at Polly.AsyncPolicy.ExecuteAsync(Func3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext) at Polly.Wrap.AsyncPolicyWrapEngine.ImplementationAsync(Func
3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext, IAsyncPolicy outerPolicy, IAsyncPolicy innerPolicy)
at Polly.AsyncPolicy.ExecuteAsync(Func3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext) at Polly.AsyncPolicy.ExecuteAndCaptureAsync(Func
3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext)
--- End of inner exception stack trace ---
at TestContainers.Container.Abstractions.WaitStrategies.ExposedPortsWaitStrategy.WaitUntil(IContainer container)
at TestContainers.Container.Abstractions.AbstractContainer.StartServices(CancellationToken ct)
Unhandled exception. Docker.DotNet.DockerContainerNotFoundException: Docker API responded with status code=NotFound, response={"message":"No such container: d933bcad92376dd67c89675e3c3883df855885de00e31e2244a9eaae750c546c"}at Docker.DotNet.ContainerOperations.<>c.<.cctor>b__33_0(HttpStatusCode statusCode, String responseBody)
(String[] args)
at Docker.DotNet.DockerClient.HandleIfErrorResponseAsync(HttpStatusCode statusCode, HttpResponseMessage response, IEnumerable1 handlers) at Docker.DotNet.DockerClient.MakeRequestForStreamAsync(IEnumerable
1 errorHandlers, HttpMethod method, String path, IQueryString queryString, IRequestContent body, IDictionary`2 headers, TimeSpan timeout, CancellationToken token)
at TestContainers.Container.Abstractions.AbstractContainer.PrintContainerLogs(CancellationToken ct)
at TestContainers.Container.Abstractions.AbstractContainer.StartServices(CancellationToken ct)
at TestContainers.Container.Abstractions.AbstractContainer.StartAsync(CancellationToken ct)
at TestContainers.Container.Abstractions.Reaper.ResourceReaper.StartAsync(IDockerClient dockerClient, ILoggerFactory loggerFactory)
at TestContainers.Container.Abstractions.GenericContainer.ContainerStarting()
at TestContainers.Container.Abstractions.AbstractContainer.StartAsync(CancellationToken ct)
at TestContainer.Program.Main(String[] args) in C:\Users\sro\RiderProjects\Playground\TestContainer\Program.cs:line 20
at TestContainer.Program.
Note: a colleague tried with more or less the same env, but it works as expected, so it's probably an issue with my env...
I'll let you know if I find the issue
Ok, the issue seem to be caused by port exclusion. I have a lot of TCP port excluded on my machine and it seem that the mapped port is chosen in this range
netsh interface ipv4 show excludedportrange protocol=tcp
Protocol tcp Port Exclusion Ranges
Start Port End Port
---------- --------
5700 5700
5985 5985
8884 8884
47001 47001
49152 49251
49252 49351
49553 49652
49755 49854
49855 49954
50000 50059 *
50060 50159
50160 50259
50260 50359
50360 50459
50772 50871
50872 50971
50972 51071
51072 51171
51172 51271
51272 51371
51372 51471
51555 51654
51750 51849 *
51850 51949
51950 52049
52050 52149
53286 53385
53386 53485
53486 53585
53586 53685
53686 53785
53786 53885
53886 53985
53986 54085
64142 64241
64242 64341
64342 64441
64442 64541
64542 64641
64642 64741
64742 64841
64957 65056
65057 65156
65157 65256
65257 65356
65357 65456
When tried, the chosen mapped port was 49173, which is in an excluded range.
The strange thing is that if I try to map this port when starting a docker image with the CLI, I have an exception and the container cannot start. So I don't know how docker.dotnet manage to start it
What is happening is that your MsSql container dies after being started. You suspicion is probably correct to say that maybe the exposed port is blocked.
What you can do is this:
- when the unit test is starting the container,
- type
docker ps -a
- this will list all containers, including the MsSql container that was started, but already died.
docker logs <your dead mssql container id or name>
to inspect it's logs
from the logs, you might find why the container is dead
The problem is that the container does not die in fact. For example for ryuk it starts, The log is something like 'Waiting for connexion on port 8080'
docker ps
reports that the port 8080 is mapped to port xxx, which is in an excluded range, but in fact the port is not open, at least not in Windows (probably it is open in the linux VM that hosts docker)
And finally, the test container timeouts because the mapped port is not reachable.
I haven't found how the random(?) mapped port is chosen. Is it done by testcontainers or by docker.dotnet?
Oh you mean TestContainers timeout when waiting to connect to ryku!
Ok i got it.
The random port is not selected by TestContainer or Docker.Dotnet.
It is selected by the docker engine.
The docker engine maps 8080 to a random port in the higher port range (> 32000). Now, I'm not sure how the docker engine does it, but my guess is to map it to port 0 and let the OS pick a port in the high port ranges.
Ok, so lot of people have the same problem docker/for-win#3171 (comment) / docker/for-win#11584
and nothing related to testcontainers.
Do you know if it is possible to define a port explicitly, to workaround this issue?
Unfortunately, there isn't currently a way to map it. You can, however, disable the reaper for now
Yes, but I have the same problem with the SQL container, it also takes an excluded port and TestContainers timeouts when trying to connect. But I've seen that the port binding is configurable for the DB, so everything is ok :)