palantir/docker-compose-rule

isListeningNow fails for docker in docker container


I'm running a health check against a Docker container that itself runs inside a Docker container. Everything works as expected except the health check. The docker-compose.yml file is executed and I can see the running containers. A test from inside the base container that accesses the heartbeat of first-container also works:

curl first-container:8080/heartbeat

returns "true" - as expected.

    @ClassRule
    public static final DockerComposeRule dockerRule = DockerComposeRule.builder()
            .saveLogsTo("mylogs")
            .file("docker-compose.yml")
            .pullOnStartup(true)
            .removeConflictingContainersOnStartup(true)
            .shutdownStrategy(ShutdownStrategy.KILL_DOWN)
            .waitingForService("first-container",
                    HealthChecks.toRespond2xxOverHttp(8080,
                            (port) -> port.inFormat("http://$HOST:$EXTERNAL_PORT/heartbeat")))
            .waitingForService("second-container",
                    HealthChecks.toRespond2xxOverHttp(8080,
                            (port) -> port.inFormat("http://$HOST:$EXTERNAL_PORT/hartbeat")))
            .nativeServiceHealthCheckTimeout(new Duration(300000))
            .build();
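For context, the URL that the health check requests is built from the format string, so it targets the host and published port that the rule resolves. That address is not necessarily the `first-container:8080` address used in the curl test above. The following is a minimal sketch of that substitution, assuming plain placeholder replacement. The class `UrlFormatSketch`, its `inFormat` helper, and the host and port values are hypothetical and are not taken from the library.

```java
final class UrlFormatSketch {

    // Hypothetical stand-in for placeholder expansion: replaces each
    // placeholder literally with the value supplied by the caller.
    static String inFormat(String format, String host, int externalPort, int internalPort) {
        return format
                .replace("$HOST", host)
                .replace("$EXTERNAL_PORT", String.valueOf(externalPort))
                .replace("$INTERNAL_PORT", String.valueOf(internalPort));
    }

    public static void main(String[] args) {
        // Assumed example values: a loopback host and an arbitrary published port.
        String url = inFormat("http://$HOST:$EXTERNAL_PORT/heartbeat", "127.0.0.1", 32768, 8080);
        System.out.println(url); // prints http://127.0.0.1:32768/heartbeat
    }
}
```

If the resolved host and published port are not reachable from inside the base container, a check against this URL can fail even though the service answers on its service name and internal port.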

I get the following stack trace:


java.lang.IllegalStateException: The cluster failed to pass a startup check: 8080 is not listening
	at com.palantir.docker.compose.connection.waiting.ClusterWait.waitUntilReady(ClusterWait.java:50)
	at com.palantir.docker.compose.DockerComposeRule.lambda$before$0(DockerComposeRule.java:155)
	at java.lang.Iterable.forEach(Iterable.java:75)
	at com.palantir.docker.compose.DockerComposeRule.before(DockerComposeRule.java:155)
	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:46)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
	at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
	at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
	at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
	at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
	at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
	at com.sun.proxy.$Proxy3.processTestClass(Unknown Source)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
	at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:147)
	at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:129)
	at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
	at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
	at org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:46)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

After a closer look at the source, I think the problem is here:

    public SuccessOrFailure portIsListeningOnHttp(int internalPort, Function<DockerPort, String> urlFunction, boolean andCheckStatus) {
        try {
            DockerPort port = port(internalPort);
            if (!port.isListeningNow()) {
                return SuccessOrFailure.failure(internalPort + " is not listening");
            }
            if (!port.isHttpResponding(urlFunction, andCheckStatus)) {
                return SuccessOrFailure.failure(internalPort + " does not have a http response from " + urlFunction.apply(port));
            }
            return SuccessOrFailure.success();
        } catch (Exception e) {
            return SuccessOrFailure.fromException(e);
        }
    }

    public boolean isListeningNow() {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(ip, getExternalPort()), 500);
            log.trace("External Port '{}' on ip '{}' was open", getExternalPort(), ip);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

The socket connection does not work. I tried this myself with a small test application, and the connection to the IP/port set in the rule works as expected. Maybe it would help to add some more log output, even in case something failed!
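A standalone probe along these lines can help narrow the problem down. This is a minimal sketch, not the library's code: it repeats the same `Socket.connect` call with a timeout, but keeps the exception details instead of swallowing them. The class name `PortProbe`, its `Result` type, and the default host and port are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortProbe {

    /** Outcome of one connection attempt; reason is null when the port is open. */
    static final class Result {
        final boolean listening;
        final String reason;

        Result(boolean listening, String reason) {
            this.listening = listening;
            this.reason = reason;
        }
    }

    // Attempts a TCP connection and reports why it failed, if it did.
    static Result probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return new Result(true, null);
        } catch (IOException e) {
            // Keep the exception type and message so the failure can be diagnosed.
            return new Result(false, e.getClass().getSimpleName() + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass the host and port the rule actually resolves.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8080;
        Result result = probe(host, port, 500);
        System.out.println(host + ":" + port + " listening=" + result.listening
                + (result.reason == null ? "" : " (" + result.reason + ")"));
    }
}
```

Running it from inside the base container against both the resolved host and published port and the service name and internal port should show which of the two addresses is actually reachable.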

Does anyone have any suggestions on how to solve this 'waitingForService' problem? Thank you!

Instead of using the health checks defined in code, I would suggest implementing your health checks in docker-compose, which has supported them since file format version 2.1. For your example, you could use the following directive in your docker-compose.yml file:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost/heartbeat"]
  interval: 1s
  timeout: 5s
  retries: 12

Encountered this in palantir/atlasdb#3461: manually hitting the endpoints seemed to work, but we were getting failures (and yes, we use a docker-in-docker setup).