testcontainers/testcontainers-python

Bug: Docker package in python works, but testcontainers crashes when awaiting ryuk in Drone CI

RomainMendez opened this issue · 6 comments

Describe the bug
In my Drone CI pipeline, I can't get testcontainers to start any container. I've tried a variety of tweaks to the pipeline; here is the current version:

kind: pipeline
type: docker
name: default

steps:
  - name: test
    image: docker:dind
    volumes:
    - name: dockersock
      path: /var/run
    commands:
    - sleep 5 # give docker enough time to start
    - docker ps -a
    - docker info
    - ls -la /run

  - name: tests
    image: python:3.11.7-bookworm
    commands:
      - pip install -r requirements.txt
      - echo $PWD
      - whoami
      - ls -la /run
      - pytest -v --cov=. --cov-report term --cov-fail-under=10 ./tests/
    volumes:
      - name: dockersock
        path: /var/run

# Specify docker:dind as a service
services:
- name: docker
  image: docker:dind
  privileged: true
  volumes:
  - name: dockersock
    path: /var/run

volumes:
- name: dockersock
  temp: {}

In this pipeline, the last step can run the Docker hello-world container through the docker Python package, but it cannot start a container through testcontainers.
I log the hello-world output at "ERROR" level only so that I can spot it more easily.

To Reproduce
To reproduce, run the above pipeline with the following test in the Python step:

from testcontainers.core.container import DockerContainer
import docker
import logging

def test_basic_():
    # Initialize the Docker client
    client = docker.from_env()

    # Pull the hello-world image (if not already present)
    client.images.pull("hello-world")

    # Run the hello-world container
    container = client.containers.run("hello-world", detach=True)

    # Wait for the container to finish execution
    container.wait()

    # Print the logs from the container
    logs = container.logs()
    logging.error(logs.decode("utf-8"))

    # Clean up by removing the container
    container.remove()
        
    radicale = DockerContainer("tomsquest/docker-radicale:latest")
    radicale.with_exposed_ports(5232)
    radicale.with_name("radicale")
    #radicale.with_kwargs(remove=True)
    
    radicale.start()

Runtime environment
I'm running a Drone CI pipeline, using Docker-in-Docker as a service.
The pipeline's error logs are below:

tests/isolated_test.py::test_basic_
-------------------------------- live log call ---------------------------------
2024-05-28 13:15:28 ERROR
Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

2024-05-28 13:15:28 WARNING DOCKER_AUTH_CONFIG is experimental, see testcontainers/testcontainers-python#566
2024-05-28 13:15:28 INFO Pulling image testcontainers/ryuk:0.7.0
2024-05-28 13:15:32 INFO Container started: 00a44be379fd
2024-05-28 13:15:32 INFO Waiting for container <Container: 00a44be379fd> with image testcontainers/ryuk:0.7.0 to be ready ...
FAILED                                                                   [ 50%]
[...]

=================================== FAILURES ===================================
_________________________________ test_basic_ __________________________________

    def test_basic_():
        # Initialize the Docker client
        client = docker.from_env()

        # Pull the hello-world image (if not already present)
        client.images.pull("hello-world")

        # Run the hello-world container
        container = client.containers.run("hello-world", detach=True)

        # Wait for the container to finish execution
        container.wait()

        # Print the logs from the container
        logs = container.logs()
        logging.error(logs.decode("utf-8"))

        # Clean up by removing the container
        container.remove()

        radicale = DockerContainer("tomsquest/docker-radicale:latest")
        radicale.with_exposed_ports(5232)
        radicale.with_name("radicale")
        #radicale.with_kwargs(remove=True)

>       radicale.start()
[...]

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.11/site-packages/testcontainers/core/container.py:87: in start
    Reaper.get_instance()
/usr/local/lib/python3.11/site-packages/testcontainers/core/container.py:193: in get_instance
    Reaper._instance = Reaper._create_instance()
/usr/local/lib/python3.11/site-packages/testcontainers/core/container.py:247: in _create_instance
    raise last_connection_exception
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

cls = <class 'testcontainers.core.container.Reaper'>

    @classmethod
    def _create_instance(cls) -> "Reaper":
        logger.debug(f"Creating new Reaper for session: {SESSION_ID}")

        Reaper._container = (
            DockerContainer(c.ryuk_image)
            .with_name(f"testcontainers-ryuk-{SESSION_ID}")
            .with_exposed_ports(8080)
            .with_volume_mapping(c.ryuk_docker_socket, "/var/run/docker.sock", "rw")
            .with_kwargs(privileged=c.ryuk_privileged, auto_remove=True)
            .with_env("RYUK_RECONNECTION_TIMEOUT", c.ryuk_reconnection_timeout)
            .start()
        )
        wait_for_logs(Reaper._container, r".* Started!")

        container_host = Reaper._container.get_container_host_ip()
        container_port = int(Reaper._container.get_exposed_port(8080))

        last_connection_exception: Optional[Exception] = None
        for _ in range(50):
            try:
                Reaper._socket = socket()
>               Reaper._socket.connect((container_host, container_port))
E               ConnectionRefusedError: [Errno 111] Connection refused

/usr/local/lib/python3.11/site-packages/testcontainers/core/container.py:233: ConnectionRefusedError

My guess is that testcontainers and the underlying docker Python package are not targeting the same daemon, which would be odd because docker is a dependency of testcontainers. So I'm not sure why Ryuk is not reachable.

While I have never used Drone myself, I know that Docker-in-Docker setups on it come with some challenges, so we have a dedicated plugin for it (https://github.com/testcontainers/dind-drone-plugin). Could you give it a try?

I did, and it suffers from the same issue. I didn't try the docker client directly with it, but I was not able to launch a testcontainer there either.
It's quite odd, because I can see it using the same Docker daemon. I think the IP/port resolution is somehow wrong in this setting. Is there any way to display the logs without using a patched version of testcontainers?

I've done more digging: I can use the "DockerClient" primitive directly, which means the problem in this particular setup lies in how the IP and port of the Ryuk container are retrieved. I will try to fork this repo and create a patched version that adds logging.
@kiview is there interest in adding more debug logging to the main repo for issues such as Ryuk not starting? If so, I could follow up with a pull request.
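To make the kind of debug logging I have in mind concrete, here is a minimal standalone sketch using only the standard library. It is not the testcontainers code: `connect_with_logging`, the logger name, and the `delay` parameter are hypothetical names for illustration. It only mirrors the retry loop visible in the traceback and logs the host, port, and error of every attempt.

```python
import logging
import socket
import time
from typing import Optional

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("ryuk_debug")


def connect_with_logging(
    host: str, port: int, attempts: int = 50, delay: float = 0.5
) -> socket.socket:
    """Try to open a TCP connection, logging the target and error of each attempt.

    `attempts` follows the range(50) seen in the traceback; `delay` is an
    assumption made for this sketch.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        sock = socket.socket()
        try:
            sock.connect((host, port))
            logger.debug("Connected to %s:%d on attempt %d", host, port, attempt)
            return sock
        except OSError as exc:
            # Close the failed socket so retries do not leak file descriptors.
            sock.close()
            last_error = exc
            logger.debug(
                "Attempt %d/%d to reach %s:%d failed: %s",
                attempt, attempts, host, port, exc,
            )
            time.sleep(delay)
    raise ConnectionError(
        f"Could not reach {host}:{port} after {attempts} attempts"
    ) from last_error
```

With `logging.basicConfig(level=logging.DEBUG)` enabled, output like this would have shown right away which host and port the connection was being attempted on.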

I think I understand the issue now: the Docker-in-Docker service runs in another container, while testcontainers looks for Ryuk on "localhost", so of course the connection fails. I'm still figuring out the rest.

So I think I found the issue:

  • I mounted the docker.sock file into the container
  • The test container therefore assumes that the containers it starts are hosted on "localhost"
  • Ryuk can't be reached because of that
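The points above can be illustrated with a simplified sketch of how a client might derive the host for published container ports from the daemon URL. This is an assumption-laden illustration, not the actual testcontainers implementation; `guess_container_host` is a hypothetical name.

```python
from typing import Optional
from urllib.parse import urlparse


def guess_container_host(docker_host: Optional[str]) -> str:
    """Guess where published container ports are reachable.

    Simplified illustration only: a socket-based daemon URL carries no
    hostname, so the only available guess is "localhost".
    """
    if not docker_host:
        return "localhost"
    url = urlparse(docker_host)
    if url.scheme in ("tcp", "http", "https") and url.hostname:
        # A TCP daemon URL names the machine that actually runs the containers.
        return url.hostname
    # unix:// or npipe:// sockets give no hint about the remote machine.
    return "localhost"


# Mounted socket: the guess is "localhost", which is wrong when the daemon
# lives in a separate docker:dind service container.
print(guess_container_host("unix:///var/run/docker.sock"))
# TCP connection to the service: the guess is the service's hostname.
print(guess_container_host("tcp://docker:2376"))
```

With the shared socket, the daemon itself is reachable, which is why the plain docker client works, but the ports it publishes are bound inside the dind service container rather than in the test step.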

Using a TLS connection to the docker:dind container makes it work!
The current plugin doesn't do this. I'll see if I can get a PR in there to fix it, thank you!
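For anyone landing here, a rough sketch of the environment such a TLS setup might use, written as Python so it can be set from a conftest.py before any Docker client is created. The values are assumptions, not my exact configuration: the hostname "docker" is the service name from the pipeline above, 2376 is the usual docker:dind TLS port, and /certs/client is where docker:dind typically writes client certificates, which would have to be shared with the test step through a volume. `apply_dind_tls_env` is a hypothetical helper name.

```python
import os
from typing import Dict, MutableMapping


def dind_tls_env(
    host: str = "docker",              # assumption: Drone service name
    port: int = 2376,                  # assumption: docker:dind TLS port
    cert_dir: str = "/certs/client",   # assumption: shared client cert directory
) -> Dict[str, str]:
    """Build the environment variables that docker.from_env() reads for TLS."""
    return {
        "DOCKER_HOST": f"tcp://{host}:{port}",
        "DOCKER_TLS_VERIFY": "1",
        "DOCKER_CERT_PATH": cert_dir,
    }


def apply_dind_tls_env(environ: MutableMapping[str, str] = os.environ) -> None:
    """Write the TLS settings into the given environment mapping."""
    environ.update(dind_tls_env())
```

The same three variables could equally be set in the `environment` section of the pipeline step; the point is that the daemon is addressed over TCP by hostname instead of through the mounted socket.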