cisagov/Malcolm

Error in /etc/supervisord.conf prevents wise service from starting up and arkime container unhealthy

shift-f10 opened this issue · 4 comments

After starting up the stack, inspecting the arkime container (ghcr.io/idaholab/malcolm/arkime:23.04.0) shows the following error when supervisord spawns wise:

spawnerr: unknown error making dispatchers for 'wise': EACCES

Opening a shell in the container and inspecting the config file shows that the wise config is as follows:

[program:wise]
command=/opt/wise_service.sh
startsecs=0
startretries=0
stopasgroup=true
killasgroup=true
directory=%(ENV_ARKIME_DIR)s/wiseService
stdout_logfile=%(ENV_ARKIME_DIR)s/logs/wise.log
redirect_stderr=true

Workaround: copy the supervisord.conf file, change the log file lines to the values below, and bind-mount the modified file in docker-compose.yml. This fixes the issue and arkime boots without problems:

stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
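
For anyone who wants to script that edit, here is a minimal sketch. It assumes you have already copied supervisord.conf out of the container to a local file. The function name patch_wise_logging is made up for illustration and is not part of Malcolm. It only touches the stdout_logfile settings inside the [program:wise] section and leaves the other sections alone.

```shell
#!/bin/sh
# Hypothetical helper (not part of Malcolm): point the wise program's
# stdout log at the container's stdout in a local copy of supervisord.conf.
patch_wise_logging() {
    conf="$1"
    awk '
        # Track whether the current section is [program:wise].
        /^\[/ { in_wise = ($0 == "[program:wise]") }
        # Replace the log file setting and disable log rotation for it.
        in_wise && /^stdout_logfile[ \t]*=/ {
            print "stdout_logfile=/dev/fd/1"
            print "stdout_logfile_maxbytes=0"
            next
        }
        # Drop any pre-existing maxbytes line so it is not duplicated.
        in_wise && /^stdout_logfile_maxbytes[ \t]*=/ { next }
        { print }
    ' "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
}
```

Usage would be something like patch_wise_logging ./supervisord.conf, followed by bind-mounting the patched file over /etc/supervisord.conf for the arkime service in docker-compose.yml.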

Please rebuild the arkime Docker image with the correct log file values so that wise can start and the container does not hang waiting for a connection to 8081.

Interesting, normally this line in the docker-compose file (setting tty to true, accompanied by setting stdin_open to false since it's not really a tty) prevents that exact error that you're talking about.

What are the details of your system (operating system, docker version, etc.)? Prior to release last night and this morning I ran these existing images without incident so I'm interested to know how your setup differs from mine.

Running Ubuntu 20.04 5.4.0-146-generic, Docker version 23.0.2, build 569dd73 and Docker Compose version v2.16.

One other thing to check: is it possible that this directory somehow got created in such a way that the arkime container didn't have permission to write to it? Are all of the files underneath the Malcolm install dir owned by your user, and do your user's UID and GID match the PUID/PGID variables at the top of the docker-compose file? (find . ! -user $(id -u) or something like that should tell you).
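
A rough sketch of both checks as small shell functions, in case it helps. The names list_foreign_files and ids_match are made up for illustration, and the expected IDs are passed in by hand rather than parsed from the docker-compose file.

```shell
#!/bin/sh
# Hypothetical helpers for the ownership check described above.

# Print every path under the given directory (default: current directory)
# that is NOT owned by the current user.
list_foreign_files() {
    find "${1:-.}" ! -user "$(id -u)" -print
}

# Succeed only if the current user's UID and GID equal the expected values,
# e.g. the PUID and PGID you see in your docker-compose file.
ids_match() {
    [ "$(id -u)" = "$1" ] && [ "$(id -g)" = "$2" ]
}
```

For example, run list_foreign_files from the Malcolm install dir and expect no output, then run ids_match 1000 1000 || echo "UID/GID mismatch", substituting whatever PUID/PGID values your compose file actually contains (1000 here is only a placeholder).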

So initially I used root to pull down the tar and untar it. I then ran chown -hR malcolm:malcolm malcolm/ to ensure the user would have all the correct permissions, switched to that user with su malcolm, and proceeded to set it up. The output of the command is as follows:

find . ! -user $(id -u)
./scripts/__pycache__
./scripts/__pycache__/malcolm_common.cpython-38.pyc

I deleted the folder and redid the entire process and, surprisingly enough, the issue no longer occurred. Not sure what happened. I used the same process as before but somehow the outcome is different. Thanks for the quick response. You can close this now.