shibauthorizer and shibresponder cannot be started by supervisord
chankongching opened this issue · 8 comments
Description of Issue/Question
Trying to install shibboleth FastCGI support and after recompiling shibboleth.x86_64 2.6.0-2.2.el6, shibresponder and shibauthorizer exists.
[root@ip-10-0-0-33 shibboleth]# ls -lrt /usr/lib64/shibboleth/
total 912
-rwxr-xr-x 1 root root 41288 Oct 1 18:17 shibresponder
-rwxr-xr-x 1 root root 48856 Oct 1 18:17 shibauthorizer
-rwxr-xr-x 1 root root 240808 Oct 1 18:17 plugins.so
-rwxr-xr-x 1 root root 104112 Oct 1 18:17 plugins-lite.so
-rwxr-xr-x 1 root root 83096 Oct 1 18:17 odbc-store.so
-rwxr-xr-x 1 root root 139472 Oct 1 18:17 mod_shib_22.so
-rwxr-xr-x 1 root root 159304 Oct 1 18:17 adfs.so
-rwxr-xr-x 1 root root 99120 Oct 1 18:17 adfs-lite.so
Setup
./shibboleth-rebuild.sh
rpm -ivh rpmbuild/RPMS/x86_64/shibboleth-2.6.0-2.2.el6.x86_64.rpm
Steps to Reproduce Issue
on supervisord log, it is just telling me the process is exited
2017-10-03 09:35:13,064 INFO spawned: 'shibauthorizer' with pid 8454
2017-10-03 09:35:13,156 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:35:13,156 INFO received SIGCLD indicating a child quit
2017-10-03 09:35:14,158 INFO spawned: 'shibauthorizer' with pid 8457
2017-10-03 09:35:14,238 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:35:14,239 INFO received SIGCLD indicating a child quit
2017-10-03 09:35:16,242 INFO spawned: 'shibauthorizer' with pid 8460
2017-10-03 09:35:16,323 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:35:16,323 INFO received SIGCLD indicating a child quit
2017-10-03 09:35:19,328 INFO spawned: 'shibauthorizer' with pid 8463
2017-10-03 09:35:19,408 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:35:19,408 INFO received SIGCLD indicating a child quit
2017-10-03 09:35:20,409 INFO gave up: shibauthorizer entered FATAL state, too many start retries too quickly
2017-10-03 09:36:59,807 INFO localhost:0 - - [03/Oct/2017:09:36:59 +0000] "POST /RPC2 HTTP/1.0" 200 254
2017-10-03 09:36:59,809 INFO spawned: 'shibauthorizer' with pid 8495
2017-10-03 09:36:59,901 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:36:59,901 INFO received SIGCLD indicating a child quit
2017-10-03 09:37:00,903 INFO spawned: 'shibauthorizer' with pid 8498
2017-10-03 09:37:00,986 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:37:00,987 INFO received SIGCLD indicating a child quit
2017-10-03 09:37:02,990 INFO spawned: 'shibauthorizer' with pid 8501
2017-10-03 09:37:03,074 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:37:03,074 INFO received SIGCLD indicating a child quit
2017-10-03 09:37:06,079 INFO spawned: 'shibauthorizer' with pid 8504
2017-10-03 09:37:06,165 INFO exited: shibauthorizer (exit status 0; not expected)
2017-10-03 09:37:06,165 INFO received SIGCLD indicating a child quit
2017-10-03 09:37:07,167 INFO gave up: shibauthorizer entered FATAL state, too many start retries too quickly
Versions and Systems
(nginx -V
, shibd -v
(and compile options), OS type and version)
nginx version: nginx/1.12.1
built by gcc 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC)
built with OpenSSL 1.0.1e-fips 11 Feb 2013
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie'
shibd -v
shibboleth 2.6.0
cat /etc/redhat-release
CentOS release 6.9 (Final)
here is my supervisord.conf
[unix_http_server]
file=/var/run/supervisord/supervisor.sock ; UNIX socket 文件,supervisorctl 会使用
;chmod=0700 ; socket 文件的 mode,默认是 0700
;chown=nobody:nogroup ; socket 文件的 owner,格式: uid:gid
;[inet_http_server] ; HTTP 服务器,提供 web 管理界面
;port=127.0.0.1:9001 ; Web 管理后台运行的 IP 和端口,如果开放到公网,需要注意安全性
;username=user ; 登录管理后台的用户名
;password=123 ; 登录管理后台的密码
[supervisord]
http_port=/var/run/supervisord/supervisor.sock
logfile=/var/log/supervisor/supervisord.log ; 日志文件,默认是 $CWD/supervisord.log
logfile_maxbytes=50MB ; 日志文件大小,超出会 rotate,默认 50MB
logfile_backups=10 ; 日志文件保留备份数量默认 10
loglevel=info ; 日志级别,默认 info,其它: debug,warn,trace
pidfile=/var/run/supervisord/supervisord.pid ; pid 文件
nodaemon=false ; 是否在前台启动,默认是 false,即以 daemon 的方式启动
minfds=1024 ; 可以打开的文件描述符的最小值,默认 1024
minprocs=200 ; 可以打开的进程数的最小值,默认 200
; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///var/run/supervisord/supervisor.sock ; 通过 UNIX socket 连接 supervisord,路径与 unix_http_server 部分的 file 一致
;serverurl=http://127.0.0.1:9001 ; 通过 HTTP 的方式连接 supervisord
; 包含其他的配置文件
;[include]
;files = /etc/supervisord/conf.d/shib.conf
;files = /etc/supervisord/conf.d/*.conf
[program:shibauthorizer]
command=/usr/lib64/shibboleth/shibauthorizer
socket=unix:///opt/shibboleth/shibauthorizer.sock
socket_owner=root:root
socket_mode=0770
user=root
stdout_logfile=/var/log/supervisor/shibauthorizer.log
stderr_logfile=/var/log/supervisor/shibauthorizer.error.log
[program:shibresponder]
command=/usr/lib64/shibboleth/shibresponder
socket=unix:///opt/shibboleth/shibresponder.sock
socket_owner=root:root
socket_mode=0770
user=root
stdout_logfile=/var/log/supervisor/shibresponder.log
stderr_logfile=/var/log/supervisor/shibresponder.error.log
Your config needs to tell Supervisor you're running FastCGI applications, eg:
[fcgi-program:shibauthorizer]
command=/usr/lib64/shibboleth/shibauthorizer
...
[fcgi-program:shibresponder]
command=/usr/lib64/shibboleth/shibresponder
rather than the [program:]
sections you were using, where Supervisor is trying to run the applications normally. Docs at http://supervisord.org/configuration.html#fcgi-program-x-section-settings.
Also, I'd recommend not running shibauthorizer
/shibresponder
as root and you can reduce the permissions on its socket to 0660
and socket owned by a corresponding user/group. I kept my applications and sockets run as / owned by the shibd
user, to match what the shibd
process does. This way, in case any shared files (eg logs), there won't be any permission issues if the FastCGI apps create/modify any of permissions/ownership on these files.
Also, in future this type of issue/question is best reported at https://github.com/nginx-shib/shibboleth-fastcgi as it's related to the FastCGI applications. Thanks!
i have updated my supervisord.conf to the following, as u mentioned, chaning to fcgi-program but i still cannot make the socket appears. What log should i check to get further?
[unix_http_server]
file=/var/run/supervisord/supervisor.sock ; UNIX socket 文件,supervisorctl 会使用
;chmod=0700 ; socket 文件的 mode,默认是 0700
;chown=nobody:nogroup ; socket 文件的 owner,格式: uid:gid
;[inet_http_server] ; HTTP 服务器,提供 web 管理界面
;port=127.0.0.1:9001 ; Web 管理后台运行的 IP 和端口,如果开放到公网,需要注意安全性
;username=user ; 登录管理后台的用户名
;password=123 ; 登录管理后台的密码
[supervisord]
http_port=/var/run/supervisord/supervisor.sock
logfile=/var/log/supervisor/supervisord.log ; 日志文件,默认是 $CWD/supervisord.log
logfile_maxbytes=50MB ; 日志文件大小,超出会 rotate,默认 50MB
logfile_backups=10 ; 日志文件保留备份数量默认 10
loglevel=info ; 日志级别,默认 info,其它: debug,warn,trace
pidfile=/var/run/supervisord/supervisord.pid ; pid 文件
nodaemon=false ; 是否在前台启动,默认是 false,即以 daemon 的方式启动
minfds=1024 ; 可以打开的文件描述符的最小值,默认 1024
minprocs=200 ; 可以打开的进程数的最小值,默认 200
; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///var/run/supervisord/supervisor.sock ; 通过 UNIX socket 连接 supervisord,路径与 unix_http_server 部分的 file 一致
;serverurl=http://127.0.0.1:9001 ; 通过 HTTP 的方式连接 supervisord
; 包含其他的配置文件
;[include]
;files = /etc/supervisord/conf.d/shib.conf
;files = /etc/supervisord/conf.d/*.conf
[fcgi-program:shibauthorizer]
command=/usr/lib64/shibboleth/shibauthorizer
socket=unix:///opt/shibboleth/shibauthorizer.sock
socket_owner=root:root
socket_mode=0770
user=root
stdout_logfile=/var/log/supervisor/shibauthorizer.log
stderr_logfile=/var/log/supervisor/shibauthorizer.error.log
[fcgi-program:shibresponder]
command=/usr/lib64/shibboleth/shibresponder
socket=unix:///opt/shibboleth/shibresponder.sock
socket_owner=root:root
socket_mode=0770
user=root
stdout_logfile=/var/log/supervisor/shibresponder.log
stderr_logfile=/var/log/supervisor/shibresponder.error.log
Cheers and Thank you very much for your help
What does supervisorctl
report regarding the status of the processes or else what do the supervisord
logs say? Has supervisorctl
's config been updated since the change and if so, are the processes started and running or do they still fail? The logs (eg the locations you have configured for stdout_logfile
and stderr_logfile
) will explain what's happening. Your config looks fine otherwise now (save the caveat above about permissions, but that won't stop the programs running).
You can also test the shibauthorizer
and shibresponder
applications by running them directly. They'll end immediately (exit code 0) but output should look like this for both:
Shibboleth initialization complete. Starting request loop.
Request loop ended.
If any errors or other output occurs, I'd start investigating that too, which could be related to compilation/packaging of Shibboleth itself.
here is my response when i am running them directly
/usr/lib64/shibboleth/shibresponder Shibboleth initialization complete. Starting request loop. Request loop ended. /usr/lib64/shibboleth/shibresponder Shibboleth initialization complete. Starting request loop. Request loop ended.
All other logs are empty
cd /var/log/supervisor/ ls -lrt total 32 -rwxrwxrwx 1 root root 0 Oct 3 07:37 shibauthorizer.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibauthorizer.error.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibresponder.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibresponder.error.log -rwxrwxrwx 1 root root 27095 Oct 5 13:42 supervisord.log
I have flushed my logs and restarted supervisord, but all other logs are always empty
ls -lrt total 4 -rwxrwxrwx 1 root root 0 Oct 3 07:37 shibauthorizer.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibauthorizer.error.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibresponder.log -rwxrwxrwx 1 root root 0 Oct 3 07:38 shibresponder.error.log -rwxrwxrwx 1 root root 541 Oct 7 00:19 supervisord.log cat supervisord.log 2017-10-07 00:19:30,052 CRIT received SIGTERM indicating exit request 2017-10-07 00:19:30,240 CRIT Supervisor running as root (no user in config file) 2017-10-07 00:19:30,254 INFO /var/run/supervisord/supervisor.sock:Medusa (V1.1.1.1) started at Sat Oct 7 00:19:30 2017 Hostname: <unix domain socket> Port:/var/run/supervisord/supervisor.sock 2017-10-07 00:19:30,320 CRIT Running without any HTTP authentication checking 2017-10-07 00:19:30,322 INFO daemonizing the process 2017-10-07 00:19:30,322 INFO supervisord started with pid 21726
And the socket didnt exists
ls -lrt /opt/shibboleth/shibauthorizer.sock ls: cannot access /opt/shibboleth/shibauthorizer.sock: No such file or directory ls -lrt /opt/shibboleth/shibresponder.sock ls: cannot access /opt/shibboleth/shibresponder.sock: No such file or directory
Seems my supervisord doesnt respect the fcgi-program as the program name, that was why i changed to "program"
here is my supervisord version
[root@ip-10-0-0-33 supervisor]# yum list installed | grep super supervisor.noarch 2.1-9.el6 @epel
But the supervisord shld have fcgi-program support since very old version(article from 2010)
ref: http://sachachua.com/blog/2010/06/using-supervisord-for-nginxfastcgiphp/
no clue yet...
Ah, there's your problem. Your Supervisor version is much too old. See http://supervisord.org/configuration.html#fcgi-program-x-section-settings, but the socket and other options need Supervisord v3.0 or above.
Install that version and you'll be good to go. Unfortunately the version in EPEL is ancient so you'll have to build the RPMs yourself. You might consult https://github.com/jcu-eresearch/supervisor-rpm as a starting point (also in need of an update, but it's v3+). Otherwise, if upgrading OS is an option, Supervsiord 3.1.4 is packaged for EL7.
I've also added a note to this effect at b55eedd. Closing as resolved; message back otherwise.
Yes it is running now! i cannot imagine it is resolved so easily, sorry for your time