DigitalSlideArchive/digital_slide_archive

Docker installation - dsa_worker container constantly restarting

Closed this issue · 9 comments

Hello all,

I am attempting to follow the instructions to install via docker/ansible and am running into an issue where the dsa_worker container seems to be constantly restarting.

Checking the logs, I found this error repeated many times:
Error: The host user id is in use and does not match the current user id
bash: /opt/logs/worker.log: Permission denied

Any suggestions? Thanks in advance!

What OS are you using? Is your current user a member of the docker group? Are you running as root (you shouldn't be)?

The dsa_worker needs permission to access the socket used by docker, and somehow it isn't getting it. There are some weird subtleties to accessing docker from within docker, so inside the docker container the uid and gid are switched to your uid and gid. If your uid is the same as root's inside the docker container, then perhaps that causes this.
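The three questions above (root, docker group, socket access) can be checked on the host before starting the containers. The following is only a rough sketch: the function name is made up for illustration, and the socket path /var/run/docker.sock is the usual Docker default, which is an assumption about this particular host.

```python
import grp
import os


def preflight_problems(uid, gid, group_ids, socket_path="/var/run/docker.sock"):
    """Return a list of human-readable problems; an empty list means nothing obvious."""
    problems = []
    # uid 0 or gid 0 is what the worker container appears to object to.
    if uid == 0 or gid == 0:
        problems.append("running with uid 0 or gid 0")
    try:
        docker_gid = grp.getgrnam("docker").gr_gid
    except KeyError:
        docker_gid = None
        problems.append("no 'docker' group exists on this host")
    # Count the primary gid as well as the supplementary groups.
    if docker_gid is not None and docker_gid not in set(group_ids) | {gid}:
        problems.append("current user is not in the 'docker' group")
    # Assumption: the docker socket lives at the default path.
    if not os.access(socket_path, os.R_OK | os.W_OK):
        problems.append(f"cannot read and write {socket_path}")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems(os.getuid(), os.getgid(), os.getgroups()):
        print("warning:", problem)
```

Passing the ids in as arguments rather than reading them inside the function makes it easy to try the check against a different user's values.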

Thanks for the quick response!

I'm using Ubuntu 18.04, my current user is a member of the docker group, and I'm not running as root.

So I tried the docker-compose example, and that worked no problem. My problem seems to be only with the installation as described here: https://github.com/DigitalSlideArchive/digital_slide_archive/tree/master/ansible

I'm not sure what was causing the problem with the other approach, but this is resolved on my end.

I'm also having the same issue and I cannot get the docker-compose to work either (with docker-compose I get FileNotOpen: Failed to open "/var/log/mongodb/mongodb.log").

This is on a RHEL 7 server.

@btsherid Regarding docker-compose, by default logs are written to the devops/dsa/logs directory. Is it writable? Did you set CURRENT_UID? If you ran it once without doing so, the logs would be created by root (the root user inside the docker container), and then subsequent runs might fail due to permissions.
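A quick way to answer both questions is a small host-side check like the one below. This is a sketch under assumptions: the helper names are hypothetical, and the "uid:gid" form for CURRENT_UID is assumed rather than taken from this thread, so check it against the docker-compose README before relying on it.

```python
import os


def compose_environment(uid, gid):
    # Assumption: CURRENT_UID is expected in "uid:gid" form.
    return {"CURRENT_UID": f"{uid}:{gid}"}


def logs_dir_status(logs_dir, uid):
    """Classify the logs directory as 'missing', 'owned by another user',
    'not writable', or 'ok' from the point of view of the given uid."""
    if not os.path.isdir(logs_dir):
        return "missing"
    # A directory created by an earlier root run shows up here.
    if os.stat(logs_dir).st_uid != uid:
        return "owned by another user"
    # os.access tests the user running this script, so run it as the
    # same user that will run docker-compose.
    if not os.access(logs_dir, os.W_OK):
        return "not writable"
    return "ok"


if __name__ == "__main__":
    print(compose_environment(os.getuid(), os.getgid()))
    # Path taken from the comment above, relative to the repository root.
    print(logs_dir_status("devops/dsa/logs", os.getuid()))
```

A result of "owned by another user" would match the case where a first run without CURRENT_UID left root-owned log files behind.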

For deploy_docker.py, are you seeing the same log messages as @ellenemerson?

Hi,

I appreciate the quick responses.

The docker-compose way is working for me now.

The mongodb issue appears to have been caused by permissions on devops/dsa/logs. I had to change the ownership of the folder from root to polkitd.

The dsa_worker issue for me (trying to run as root) was resolved by adding this to docker-compose.yml:

environment:
- C_FORCE_ROOT=true

It looks like the above is the reason the worker container is continually restarting with the deploy_docker method, but I'm not sure how to make that change with that method.

Here's the logs from using deploy_docker:

/usr/local/lib/python3.7/site-packages/celery/platforms.py:801: RuntimeWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!

Please specify a different user using the --uid option.

User information: uid=0 euid=0 gid=0 egid=0

uid=uid, euid=euid, gid=gid, egid=egid,
/usr/local/lib/python3.7/site-packages/celery/backends/amqp.py:67: CPendingDeprecationWarning:
The AMQP result backend is scheduled for deprecation in version 4.0 and removal in version v5.0. Please use RPC backend or a persistent backend.

alternative='Please use RPC backend or a persistent backend.')

 -------------- celery@be900d7896ff v4.4.2 (cliffs)
--- ***** -----
-- ******* ---- Linux-3.10.0-1062.18.1.el7.x86_64-x86_64-with-debian-10.3 2020-04-29 14:06:53
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: girder_worker:0x7fbd84f7ba90
- ** ---------- .> transport: amqp://guest:**@rabbitmq:5672//
- ** ---------- .> results: amqp://
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> celery exchange=celery(direct) key=celery

[2020-04-29 14:06:53,552: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@rabbitmq:5672//: [Errno 111] Connection refused.
Trying again in 2.00 seconds... (1/100)

[2020-04-29 14:06:55,587: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@rabbitmq:5672//: [Errno 111] Connection refused.
Trying again in 4.00 seconds... (2/100)

I would consider my issue resolved, as this is clearly caused by running as root, and I have gotten around that using the docker-compose method.

Thanks,
Brendan

I don't expect either deploy_docker.py or docker-compose to work properly when run as root or as a user with uid 0 or gid 0 (even if that user is somehow not root). Further, if you run them as root once, you probably have to manually clean up permissions on the storage directories (default is ~/.dsa for deploy_docker.py and devops/dsa for docker-compose).
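To see what needs cleaning up after an accidental root run, something like the following could list the leftovers. It is a minimal sketch: the function name is hypothetical, and it only reports paths and does not change anything.

```python
import os


def paths_not_owned_by(root, uid):
    """Return every path at or below root whose owner is not uid."""
    foreign = []
    if os.lstat(root).st_uid != uid:
        foreign.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        # Check entries from their parent so that subdirectories we
        # cannot descend into are still reported.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != uid:
                foreign.append(path)
    return sorted(foreign)


if __name__ == "__main__":
    # ~/.dsa is the deploy_docker.py default mentioned above.
    storage = os.path.expanduser("~/.dsa")
    if os.path.isdir(storage):
        for path in paths_not_owned_by(storage, os.getuid()):
            print(path)
```

Anything it prints would then need its ownership changed back to your user, for example with chown run through sudo.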

I think this would warrant documenting in ansible/README.rst for oldtimers such as myself who never got the hang of using sudo for selected commands (popularized by Ubuntu) and still rely on using an entire shell running as root for installation tasks... :-)

We are eventually going to just support docker-compose rather than the deploy_docker.py command, as it is more standard. There is now a note about not running deploy_docker.py as root in the README.rst file, so I'm closing this.