Bug: Docker package in python works, but testcontainers crashes when awaiting ryuk in Drone CI
RomainMendez opened this issue · 6 comments
Describe the bug
In my Drone CI pipeline, I'm not able to run my testcontainers. I've tried a variety of tweaks to the pipeline; here is the current version:
kind: pipeline
type: docker
name: default
steps:
- name: test
  image: docker:dind
  volumes:
  - name: dockersock
    path: /var/run
  commands:
  - sleep 5 # give docker enough time to start
  - docker ps -a
  - docker info
  - ls -la /run
- name: tests
  image: python:3.11.7-bookworm
  commands:
  - pip install -r requirements.txt
  - echo $PWD
  - whoami
  - ls -la /run
  - pytest -v --cov=. --cov-report term --cov-fail-under=10 ./tests/
  volumes:
  - name: dockersock
    path: /var/run
# Specify docker:dind as a service
services:
- name: docker
  image: docker:dind
  privileged: true
  volumes:
  - name: dockersock
    path: /var/run
volumes:
- name: dockersock
  temp: {}
In this pipeline, the last step can run the Docker hello-world container through the docker Python package, but it cannot start a testcontainers container.
I log the hello-world output at "ERROR" level only so that it is easier to spot in the test output.
To Reproduce
To reproduce, run the above pipeline with the following test in the Python step:
from testcontainers.core.container import DockerContainer
import docker
import logging

def test_basic_():
    # Initialize the Docker client
    client = docker.from_env()
    # Pull the hello-world image (if not already present)
    client.images.pull("hello-world")
    # Run the hello-world container
    container = client.containers.run("hello-world", detach=True)
    # Wait for the container to finish execution
    container.wait()
    # Print the logs from the container
    logs = container.logs()
    logging.error(logs.decode("utf-8"))
    # Clean up by removing the container
    container.remove()
    radicale = DockerContainer("tomsquest/docker-radicale:latest")
    radicale.with_exposed_ports(5232)
    radicale.with_name("radicale")
    #radicale.with_kwargs(remove=True)
    radicale.start()
Runtime environment
I'm running a Drone CI pipeline, using Docker-in-Docker as a service.
Here are the error logs of the pipeline:
tests/isolated_test.py::test_basic_
-------------------------------- live log call ---------------------------------
2024-05-28 13:15:28 ERROR
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
2024-05-28 13:15:28 WARNING DOCKER_AUTH_CONFIG is experimental, see testcontainers/testcontainers-python#566
2024-05-28 13:15:28 INFO Pulling image testcontainers/ryuk:0.7.0
2024-05-28 13:15:32 INFO Container started: 00a44be379fd
2024-05-28 13:15:32 INFO Waiting for container <Container: 00a44be379fd> with image testcontainers/ryuk:0.7.0 to be ready ...
FAILED [ 50%]
[...]
=================================== FAILURES ===================================
_________________________________ test_basic_ __________________________________
    def test_basic_():
        # Initialize the Docker client
        client = docker.from_env()
        # Pull the hello-world image (if not already present)
        client.images.pull("hello-world")
        # Run the hello-world container
        container = client.containers.run("hello-world", detach=True)
        # Wait for the container to finish execution
        container.wait()
        # Print the logs from the container
        logs = container.logs()
        logging.error(logs.decode("utf-8"))
        # Clean up by removing the container
        container.remove()
        radicale = DockerContainer("tomsquest/docker-radicale:latest")
        radicale.with_exposed_ports(5232)
        radicale.with_name("radicale")
        #radicale.with_kwargs(remove=True)
>       radicale.start()
[...]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.11/site-packages/testcontainers/core/container.py:87: in start
Reaper.get_instance()
/usr/local/lib/python3.11/site-packages/testcontainers/core/container.py:193: in get_instance
Reaper._instance = Reaper._create_instance()
/usr/local/lib/python3.11/site-packages/testcontainers/core/container.py:247: in _create_instance
raise last_connection_exception
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cls = <class 'testcontainers.core.container.Reaper'>
    @classmethod
    def _create_instance(cls) -> "Reaper":
        logger.debug(f"Creating new Reaper for session: {SESSION_ID}")
        Reaper._container = (
            DockerContainer(c.ryuk_image)
            .with_name(f"testcontainers-ryuk-{SESSION_ID}")
            .with_exposed_ports(8080)
            .with_volume_mapping(c.ryuk_docker_socket, "/var/run/docker.sock", "rw")
            .with_kwargs(privileged=c.ryuk_privileged, auto_remove=True)
            .with_env("RYUK_RECONNECTION_TIMEOUT", c.ryuk_reconnection_timeout)
            .start()
        )
        wait_for_logs(Reaper._container, r".* Started!")
        container_host = Reaper._container.get_container_host_ip()
        container_port = int(Reaper._container.get_exposed_port(8080))
        last_connection_exception: Optional[Exception] = None
        for _ in range(50):
            try:
                Reaper._socket = socket()
>               Reaper._socket.connect((container_host, container_port))
E               ConnectionRefusedError: [Errno 111] Connection refused
/usr/local/lib/python3.11/site-packages/testcontainers/core/container.py:233: ConnectionRefusedError
I suspect that testcontainers and the underlying docker Python package are not targeting the same Docker daemon (which would be odd, because I know the package is a dependency of testcontainers). So I'm not sure why Ryuk is not coming up.
I have never used Drone myself, but I know it poses some challenges for docker-in-docker setups, so we have a dedicated plugin for it (https://github.com/testcontainers/dind-drone-plugin). Could you give it a try?
I did, and it suffers from the same issue. I didn't try the docker client directly with it, but I was not able to launch a testcontainer there either.
It's quite odd, because I can see it using the same Docker daemon. I think the IP/port resolution is somehow wrong in this setting. Is there any way to display the logs without using a patched version of testcontainers?
I actually have an issue open there.
I've done more digging: I can use the "DockerClient" primitive directly, which means the problem in this particular setup lies in how the IP and port of the Ryuk container are retrieved. I will try to fork this repo and create a patched version that adds logging.
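For anyone who wants to check reachability without patching the library, here is a minimal sketch of a standalone probe that mirrors the connect-and-retry loop visible in the traceback. It uses only the standard library; the function name `probe` and the logger name are hypothetical, and the host and port have to be supplied by hand (for example, the values the Ryuk container reports).

```python
import logging
import socket
import time

logger = logging.getLogger("ryuk_probe")  # hypothetical logger name


def probe(host: str, port: int, attempts: int = 5, delay: float = 0.2) -> bool:
    """Return True as soon as a TCP connection to (host, port) succeeds."""
    for attempt in range(1, attempts + 1):
        try:
            with socket.create_connection((host, port), timeout=2):
                logger.info("connected to %s:%d on attempt %d", host, port, attempt)
                return True
        except OSError as exc:
            # ConnectionRefusedError (Errno 111 in the traceback) is an OSError subclass.
            logger.warning("attempt %d: %s:%d unreachable (%s)", attempt, host, port, exc)
            time.sleep(delay)
    return False
```

Calling `probe("localhost", port)` and then `probe("docker", port)` from the test step would show which hostname the mapped port is actually reachable on; "docker" here is the service name from the pipeline above.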
@kiview is there interest in adding more debug logging to the main repo for issues such as Ryuk not starting? If so, I could follow up with a pull request.
I think I understand the issue at play here: the Docker-in-Docker service is running in another container, while testcontainers looks for Ryuk on "localhost", so of course it fails. I'm still figuring out the details.
So I think I found the issue:
- I mounted the docker.sock file in the test container
- Because of that, testcontainers assumes the containers it starts are reachable on "localhost"
- Ryuk actually runs under the docker:dind service, so it can't be reached at that address
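The reasoning in the list above can be sketched in a few lines. This is a simplified illustration, not the library's actual code: `guess_container_host` is a hypothetical helper showing why a unix-socket daemon URL gives a client no hostname to work with, whereas a TCP URL does.

```python
from urllib.parse import urlparse


def guess_container_host(docker_host: str) -> str:
    """Hypothetical helper: guess where published container ports are reachable."""
    url = urlparse(docker_host)
    if url.scheme in ("tcp", "http", "https") and url.hostname:
        # Remote daemon: published ports live on the daemon's host.
        return url.hostname
    # unix:// or npipe:// socket: the URL names no host,
    # so the only available fallback is localhost.
    return "localhost"


print(guess_container_host("unix:///var/run/docker.sock"))  # localhost
print(guess_container_host("tcp://docker:2376"))            # docker
```

With the mounted socket, the fallback points at the test container itself, where nothing listens on Ryuk's mapped port.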
Using a TLS connection to the docker:dind container makes it work!
The current plugin doesn't do this. I'll see if I can get a PR there to fix it. Thank you!
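The exact configuration was not posted in this thread, so the following is only a sketch of what a TLS setup could look like from the Python side. It assumes the service name "docker" from the pipeline above, port 2376, and client certificates under /certs/client (the defaults of the docker:dind image, shared with the test step through a volume); `dind_tls_env` is a hypothetical helper.

```python
def dind_tls_env(host: str = "docker", port: int = 2376,
                 cert_dir: str = "/certs/client") -> dict[str, str]:
    """Build the environment variables that point a Docker client at a TLS daemon.

    All defaults are assumptions about the dind service, not values from the thread.
    """
    return {
        "DOCKER_HOST": f"tcp://{host}:{port}",
        "DOCKER_TLS_VERIFY": "1",
        "DOCKER_CERT_PATH": cert_dir,
    }


for key, value in dind_tls_env().items():
    print(f"{key}={value}")
```

These are the standard variables that `docker.from_env()` reads, so setting them in the test step's environment (instead of mounting /var/run) gives the client a daemon URL that carries the real hostname of the dind service.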