quay/mirror-registry

TASK [mirror_appliance : Wait for pg_trgm to be installed]

Closed this issue · 9 comments

fatal: [root@xxx]: FAILED! => {"attempts": 20, "changed": true, "cmd": ["podman", "exec", "-it", "quay-postgres", "/bin/bash", "-c", "echo 'CREATE EXTENSION IF NOT EXISTS pg_trgm' | psql -d quay -U postgres"], "delta": "0:00:00.078205", "end": "2022-10-31 23:48:40.746969", "msg": "non-zero return code", "rc": 125, "start": "2022-10-31 23:48:40.668764", "stderr": "Error: no container with name or ID \"quay-postgres\" found: no such container", "stderr_lines": ["Error: no container with name or ID \"quay-postgres\" found: no such container"], "stdout": "", "stdout_lines": []}

The process does not complete
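
For anyone reading the task output: an rc of 125 comes from podman itself, not from psql inside the container. The podman-run and podman-exec man pages reserve 125, 126 and 127 for podman-level failures. A minimal sketch of that mapping follows; `explain_podman_rc` is a hypothetical helper name, not part of podman.

```shell
#!/usr/bin/env bash
# Map a podman run/exec exit status to its documented meaning.
# explain_podman_rc is a hypothetical helper, not a podman command.
explain_podman_rc() {
  case "$1" in
    0)   echo "success" ;;
    125) echo "podman itself failed (for example, no such container)" ;;
    126) echo "the contained command could not be invoked" ;;
    127) echo "the contained command was not found" ;;
    *)   echo "exit status of the contained command" ;;
  esac
}

# Usage:
#   explain_podman_rc 125
```

So the retries here never reached PostgreSQL at all; the quay-postgres container did not exist when the task ran.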

@aladrocMatiner Which version is this happening on? I think we recently patched this on the main branch, and the fix should be available in a new release soon.

Hello. I am having the same issue using:

registry.redhat.io/rhel8/postgresql-10 1-195.1665590956 f7e37b4288ac 8 minutes ago 603 MB

I am using the latest release:

https://github.com/quay/mirror-registry/releases/latest/download/mirror-registry-online.tar.gz

It seems that the root cause is this:

Nov 15 12:29:04 hv10.telco5gran.eng.rdu2.redhat.com systemd[1]: Starting PostgreSQL Podman Container for Quay...
Nov 15 12:29:04 hv10.telco5gran.eng.rdu2.redhat.com systemd[1]: Started PostgreSQL Podman Container for Quay.
Nov 15 12:29:05 hv10.telco5gran.eng.rdu2.redhat.com podman[443008]: time="2022-11-15T12:29:05-05:00" level=error msg="Starting some container dependencies"
Nov 15 12:29:05 hv10.telco5gran.eng.rdu2.redhat.com podman[443008]: time="2022-11-15T12:29:05-05:00" level=error msg="\"runc: runc create failed: unable to start container process: exec: \\\"sleep\\\": executable file not found in $PATH: OCI runtime attempted to invoke a command that was not found\""
Nov 15 12:29:05 hv10.telco5gran.eng.rdu2.redhat.com podman[443008]: Error: error starting some containers: internal libpod error
Nov 15 12:29:05 hv10.telco5gran.eng.rdu2.redhat.com systemd[1]: quay-postgres.service: Main process exited, code=exited, status=126/n/a
Nov 15 12:29:05 hv10.telco5gran.eng.rdu2.redhat.com podman[443461]: daa642ea2bb0809dd8d9e69b1e093581b3139cfd0a5344286691ec4bcd9994a4
Nov 15 12:29:05 hv10.telco5gran.eng.rdu2.redhat.com systemd[1]: quay-postgres.service: Failed with result 'exit-code'.
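
To reduce a journal like the one above to just the failure lines, a small filter can help. This is only a sketch: `unit_errors` is a hypothetical helper, and the patterns are simply the markers visible in this log, not an exhaustive list.

```shell
#!/usr/bin/env bash
# Keep only lines that carry an error marker from podman or systemd.
# Reads log text on stdin, so it works with journalctl or a saved file.
# unit_errors is a hypothetical helper name.
unit_errors() {
  grep -E 'level=error|Error:|Failed with result|Main process exited' || true
}

# Usage (needs a systemd host):
#   journalctl -u quay-postgres.service --no-pager | unit_errors
```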

This is the systemd unit:

# /etc/systemd/system/quay-postgres.service
[Unit]
Description=PostgreSQL Podman Container for Quay
Wants=network.target
After=network-online.target quay-pod.service
Requires=quay-pod.service

[Service]
Type=simple
TimeoutStartSec=5m
ExecStartPre=-/bin/rm -f %t/%n-pid %t/%n-cid
ExecStart=/usr/bin/podman run \
    --name quay-postgres \
    -v /opt/assets/quay-install/pg-data:/var/lib/pgsql/data:Z \
    -e POSTGRESQL_USER=user \
    -e POSTGRESQL_PASSWORD=password \
    -e POSTGRESQL_DATABASE=quay \
    --pod=quay-pod \
    --conmon-pidfile %t/%n-pid \
    --cidfile %t/%n-cid \
    --cgroups=no-conmon \
    --replace \
    registry.redhat.io/rhel8/postgresql-10:1-195.1665590956

ExecStop=/usr/bin/podman stop --ignore --cidfile %t/%n-cid -t 10
ExecStopPost=/usr/bin/podman rm --ignore -f --cidfile %t/%n-cid
PIDFile=%t/%n-pid
KillMode=none
Restart=always
RestartSec=30

[Install]
WantedBy=multi-user.target default.target

Additional info:

runc --version
runc version 1.1.4
spec: 1.0.2-dev
go: go1.18.7
libseccomp: 2.5.2

More findings:

$ ##################
$ # With --pod=quay-pod
$ ##################
$ podman run --name quay-postgres -v /etc/quay-install/pg-data:/var/lib/pgsql/data:Z -e POSTGRESQL_USER=user -e POSTGRESQL_PASSWORD=password -e POSTGRESQL_DATABASE=quay --pod=quay-pod --conmon-pidfile /run/quay-postgres.service-pid --cidfile /run/quay-postgres.service-cid --cgroups=no-conmon --replace registry.redhat.io/rhel8/postgresql-10:1-202.1666660384
84cf356f028fa08d0419f678b23d8edb68de36241c4c4cc7278ed453d720663f
Error: container id file exists. Ensure another container is not using it or delete /run/quay-postgres.service-cid

$ #####################
$ # Without --pod=quay-pod 
$ #####################
$ podman run --name quay-postgres -v /etc/quay-install/pg-data:/var/lib/pgsql/data:Z -e POSTGRESQL_USER=user -e POSTGRESQL_PASSWORD=password -e POSTGRESQL_DATABASE=quay --conmon-pidfile /run/quay-postgres.service-pid --cidfile /run/quay-postgres.service-cid --cgroups=no-conmon --replace registry.redhat.io/rhel8/postgresql-10:1-202.1666660384
waiting for server to start....2022-11-16 10:18:27.041 UTC [22] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-11-16 10:18:27.042 UTC [22] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-11-16 10:18:27.046 UTC [22] LOG:  redirecting log output to logging collector process
2022-11-16 10:18:27.046 UTC [22] HINT:  Future log output will appear in directory "log".
 done
server started
/var/run/postgresql:5432 - accepting connections
=> sourcing /usr/share/container-scripts/postgresql/start/set_passwords.sh ...
ALTER ROLE
waiting for server to shut down.... done
server stopped
Starting server...
2022-11-16 10:18:27.258 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-11-16 10:18:27.258 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2022-11-16 10:18:27.258 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-11-16 10:18:27.258 UTC [1] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-11-16 10:18:27.263 UTC [1] LOG:  redirecting log output to logging collector process
2022-11-16 10:18:27.263 UTC [1] HINT:  Future log output will appear in directory "log".
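
A note on the "container id file exists" error in the first run: under systemd the unit's ExecStartPre line deletes the pid and cid files before each start, so a manual run has to do the same. A minimal sketch, assuming the /run paths used in the commands above; `clean_run_files` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Remove stale pid/cid files left by a previous run, mirroring the unit's
# ExecStartPre line. clean_run_files is a hypothetical helper; the default
# directory and unit name are assumptions taken from the commands above.
clean_run_files() {
  local run_dir="${1:-/run}" unit="${2:-quay-postgres.service}"
  rm -f -- "${run_dir}/${unit}-pid" "${run_dir}/${unit}-cid"
}

# Usage, before re-running podman by hand:
#   clean_run_files
```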

@ccardenosa Thanks for chiming in and for the PR. This is strange. The pause image was updated recently and its entrypoint changed from /pause to /sleep. See this PR. It looks like mirror registry is failing to start because it can't find sleep in the pause image.

Nov 15 12:29:05 hv10.telco5gran.eng.rdu2.redhat.com podman[443008]: time="2022-11-15T12:29:05-05:00" level=error msg="\"runc: runc create failed: unable to start container process: exec: \\\"sleep\\\": executable file not found in $PATH: OCI runtime attempted to invoke a command that was not found\""

Since this is the online version, can you confirm which pause image is being used? It would be helpful to share the entire output of docker inspect. I just pulled the pause image we should be using in the offline build and it looks like it has the correct binaries:

$ docker run --rm -it --entrypoint sh registry.access.redhat.com/ubi8/pause:8.6-21
sh-4.4# date && sleep 1 && date
Wed Nov 16 13:35:01 UTC 2022
Wed Nov 16 13:35:02 UTC 2022

I wonder if this is an issue with podman not updating the pause image to the latest version? Which version of podman are you using and when was the last time it was updated?
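
If the full inspect output is too long to paste, the entrypoint alone is enough to tell the old and new pause images apart. A sketch under stated assumptions: `image_entrypoint` is a hypothetical wrapper, and the `PODMAN` override variable is my own addition so the same function can call docker.

```shell
#!/usr/bin/env bash
# Print the entrypoint and command recorded in a local image's config.
# image_entrypoint is a hypothetical helper; PODMAN is an assumed override
# variable (for example PODMAN=docker) and defaults to podman.
image_entrypoint() {
  "${PODMAN:-podman}" image inspect \
    --format '{{.Config.Entrypoint}} {{.Config.Cmd}}' "$1"
}

# Usage:
#   image_entrypoint registry.access.redhat.com/ubi8/pause:8.6-21
```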

Hello @HammerMeetNail

Here you have the requested info:

podman version
Client:       Podman Engine
Version:      4.2.0
API Version:  4.2.0
Go Version:   go1.18.7
Built:        Wed Oct 26 15:23:47 2022
OS/Arch:      linux/amd64

This is the procedure I've followed:

$ ##########################
$ # Clean the image I had
$ ##########################
$ podman image rm registry.access.redhat.com/ubi8/pause:8.6-21
Untagged: registry.access.redhat.com/ubi8/pause:8.6-21
Deleted: c1fdfbf09a8e2a216412de18ec6321772636c17938c77ae7c2c1eda33c3854a2

$ ##########################
$ # Download it again
$ ##########################
$ podman pull registry.access.redhat.com/ubi8/pause:8.6-21
Trying to pull registry.access.redhat.com/ubi8/pause:8.6-21...
Getting image source signatures
Checking if image destination supports signatures
Copying blob a2a6c9424401 done
Copying blob 1b3417e31a5e done
Copying blob 809fe483e885 done
Copying config 394e88b577 done
Writing manifest to image destination
Storing signatures
394e88b57705f34b38d6b04d21d8c8702a261414c385b79391b41507aeb19c11

$ ##########################
$ # Run your test using podman client instead of docker
$ ##########################
$ podman run --rm -it --entrypoint sh registry.access.redhat.com/ubi8/pause:8.6-21
sh-4.4# date && sleep 1 && date
Wed Nov 16 14:31:07 UTC 2022
Wed Nov 16 14:31:08 UTC 2022
sh-4.4#

Here you have the inspect info:

$ podman inspect registry.access.redhat.com/ubi8/pause:8.6-21 > pause-8.6-21.inspect.json

Note that I had already run the same kind of tests. What I saw is that the problem isn't the container images but something related to the --pod option. In fact, it works fine when you remove that option, as I am showing here.

Finally, I tested it using the offline build:

$ make build-offline-zip

@ccardenosa Thanks. Are there any other pause images on the machine? Looking at what you posted here, it looks like with the --pod=quay-pod option, it's failing but without it everything starts.

I wonder if there's an existing pod that was created with an old pause image, so that passing --pod makes podman reuse that old image.

I just triggered a new release and CI is able to build, install and test both online and offline archives on a fresh VM, so I'm leaning towards something lingering on your existing machine.
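
One way to check for a lingering pause image is to list which image each pod's infra container was created from. A rough sketch, assuming podman's default naming where infra containers end in `-infra`; `infra_images` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# From "name image" pairs on stdin, print the distinct images used by
# pod infra containers. Assumes the default "-infra" name suffix.
# infra_images is a hypothetical helper name.
infra_images() {
  awk '$1 ~ /-infra$/ { print $2 }' | sort -u
}

# Usage:
#   podman ps --all --format '{{.Names}} {{.Image}}' | infra_images
```

Anything other than the expected pause tag in that output would point at a pod left over from an earlier install.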

@HammerMeetNail you are right. My env is the root cause of this issue:

ls -l /usr/bin/*tar
-rw-r--r--. 1 root root 284M Nov  1 09:56 /usr/bin/execution-environment.tar
-rw-r--r--. 1 root root 2.0G Sep 12 17:46 /usr/bin/image-archive.tar

So the installer finds these archives from a previous installation and loads the old images from them:

 Build, Store, and Distribute your Containers

INFO[2022-11-16 10:45:15] Install has begun
DEBU[2022-11-16 10:45:15] Ansible Execution Environment Image: quay.io/quay/mirror-registry-ee:latest
DEBU[2022-11-16 10:45:15] Pause Image: registry.access.redhat.com/ubi8/pause:8.6-21
DEBU[2022-11-16 10:45:15] Quay Image: registry.redhat.io/quay/quay-rhel8:v3.7.9
DEBU[2022-11-16 10:45:15] Redis Image: registry.redhat.io/rhel8/redis-6:1-78.1665590931
DEBU[2022-11-16 10:45:15] Postgres Image: registry.redhat.io/rhel8/postgresql-10:1-195.1665590956
INFO[2022-11-16 10:45:15] Found execution environment at /usr/bin/execution-environment.tar
INFO[2022-11-16 10:45:15] Loading execution environment from execution-environment.tar
DEBU[2022-11-16 10:45:15] Importing execution enviornment with command: /bin/bash -c sudo /usr/bin/podman image import \
                                        --change 'ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' \
                                        --change 'ENV HOME=/home/runner' \
                                        --change 'ENV container=oci' \
                                        --change 'ENTRYPOINT=["entrypoint"]' \
                                        --change 'WORKDIR=/runner' \
                                        --change 'EXPOSE=6379' \
                                        --change 'VOLUME=/runner' \
                                        --change 'CMD ["ansible-runner", "run", "/runner"]' \
                                        - quay.io/quay/mirror-registry-ee:latest < /usr/bin/execution-environment.tar
Getting image source signatures
Copying blob 5aaf24dcde46 skipped: already exists
Copying config 271c6e1fd4 done
Writing manifest to image destination
Storing signatures
sha256:271c6e1fd4908b3f2bee3fb9f9fc81662422a0f98be0aed802c2e07547e415c7
INFO[2022-11-16 10:45:17] Detected an installation to localhost
INFO[2022-11-16 10:45:17] Found SSH key at /root/.ssh/quay_installer
INFO[2022-11-16 10:45:17] Attempting to set SELinux rules on /root/.ssh/quay_installer
INFO[2022-11-16 10:45:17] Found image archive at /usr/bin/image-archive.tar
INFO[2022-11-16 10:45:17] Detected an installation to localhost
INFO[2022-11-16 10:45:17] Unpacking image archive from /usr/bin/image-archive.tar
quay.tar
redis.tar
postgres.tar
pause.tar
INFO[2022-11-16 10:45:18] Loading pause image archive from pause.tar
DEBU[2022-11-16 10:45:18] Importing Pause with command: /bin/bash -c sudo /usr/bin/podman image import \
                                        --change 'ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' \
                                        --change 'ENV container=oci' \
                                        --change 'ENTRYPOINT=["sleep"]' \
                                        --change 'CMD=["infinity"]' \
                                        - registry.access.redhat.com/ubi8/pause:8.6-21 < pause.tar
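
The log above shows the installer finding execution-environment.tar and image-archive.tar in /usr/bin. A quick check for such leftovers before installing could look like this. It is only a sketch: `stale_archives` is a hypothetical helper, and the default directory is just the one reported in this log.

```shell
#!/usr/bin/env bash
# List leftover offline archives in a directory (default: /usr/bin, the
# location reported in the log). stale_archives is a hypothetical helper.
stale_archives() {
  local dir="${1:-/usr/bin}" name
  for name in execution-environment.tar image-archive.tar; do
    if [ -f "${dir}/${name}" ]; then
      echo "${dir}/${name}"
    fi
  done
}

# Usage:
#   stale_archives            # checks /usr/bin
#   stale_archives /some/dir  # checks another directory
```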

Following your advice, I repeated the installation in a fresh environment and it is working.

Thank you very much for your support.

BR,
Carlos.

Wonderful! Glad to help, reach out if you need anything else.