Exit code 11, aka (silent) segfault, when backend scg is used on a tty or DISPLAY isn't set under X
temporaryrespite opened this issue · 3 comments
This happens because dpy here[1] will be NULL when running e.g. blugon -b scg -r -o, and there's no check and graceful exit.
[1] Line 10 in 44b908e
The exit code for the above command is 11, but when running the scg backend directly on a tty (rather than inside X), the segfault can be seen, e.g.:
$ /lib/blugon/scg 1.0 0.6949030005552019 0.4310480202110507
[ 1627.389262] scg[93373]: segfault at e0 ip 000055e14ea4d0dd sp 00007ffca8ec6e00 error 4 in scg[55e14ea4d000+1000]
[ 1627.390830] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 41 57 41 56 41 55 49 89 f5 41 54 55 89 fd 31 ff 53 48 83 ec 28 e8 a6 ff ff ff 48 89 c3 <48> 63 80 e0 00 00 00 48 89 df 48 c1 e0 07 48 03 83 e8 00 00 00 48
[ 1627.392435] potentially unexpected fatal signal 11.
[ 1627.394046] CPU: 1 PID: 93373 Comm: scg Kdump: loaded Tainted: G I 5.4.10-g7a02c193298e #46
[ 1627.397178] RIP: 0033:0x55e14ea4d0dd
[ 1627.398696] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 41 57 41 56 41 55 49 89 f5 41 54 55 89 fd 31 ff 53 48 83 ec 28 e8 a6 ff ff ff 48 89 c3 <48> 63 80 e0 00 00 00 48 89 df 48 c1 e0 07 48 03 83 e8 00 00 00 48
[ 1627.400266] RSP: 002b:00007ffca8ec6e00 EFLAGS: 00010202
[ 1627.401848] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000090
[ 1627.403418] RDX: 0000000000000080 RSI: 00007ffca8ec6f48 RDI: 00007ff2eb6f5090
[ 1627.404968] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[ 1627.406527] R10: 000055e14ea4c52c R11: 00007ff2eb680840 R12: 000055e14ea4d280
[ 1627.408031] R13: 00007ffca8ec6f48 R14: 0000000000000000 R15: 0000000000000000
[ 1627.409483] FS: 00007ff2eb414d80 GS: 0000000000000000
[ 1627.423791] systemd[1]: Started Process Core Dump (PID 93385/UID 0).
Segmentation fault (core dumped)
[ 1627.938066] systemd-coredump[93393]: Process 93373 (scg) of user 1000 dumped core.
[ 1627.938066]
[ 1627.938066] Stack trace of thread 93373:
[ 1627.938066] #0 0x000055e14ea4d0dd main (scg + 0x10dd)
[ 1627.938066] #1 0x00007ff2eb49e1b6 __libc_start_main (libc.so.6 + 0x271b6)
[ 1627.938066] #2 0x000055e14ea4d2ae _start (scg + 0x12ae)
[ 1627.938066]
[ 1627.948951] systemd[1]: systemd-coredump@2-93385-0.service: Succeeded.
or this can also reproduce it:
$ DISPLAY= /lib/blugon/scg 1.0 0.6949030005552019 0.4310480202110507
Segmentation fault (core dumped)
$ DISPLAY= blugon -b scg -r -o; echo "exit code:$?"
exit code:11
I didn't want to make a PR for this because I don't know how you want this handled (if at all), but the idea is something like this:
/* ... */
#include <stdio.h> /* fprintf */
/* ... */
Display *dpy = XOpenDisplay(NULL);
/* If XOpenDisplay does not succeed, it returns NULL. */
if (NULL == dpy) {
    fprintf(stderr, "X is not running? or cannot open a connection to it. Is DISPLAY env var set?\n");
    return 1; /* exit code 1 */
}
You are right that the pointer dpy should be checked for NULL.
When I wrote it, I just used blugon.py to handle the segfault.
The exit(11) in blugon.py is executed only when DISPLAY is empty, which should coincide with XOpenDisplay(NULL) returning a NULL pointer.
Still, if anyone is interested in running just scg, it will be nicer to avoid the segfault! I will write a PR for this soon and link it here.
Thank you for spotting this :)
"The exit(11) in blugon.py is being executed just when DISPLAY is empty"
Thank you for this info. When I created this issue, I didn't realize this was what was happening. Without looking into it, I assumed that Python was somehow catching the segfault, keeping it from being reported (e.g. by systemd in dmesg), and simply exiting with exit code 11. That assumption was based on my experience with something (I can't remember what; glib2?) that ran tests and prevented segfaults from reaching dmesg, so I knew it was possible.
EDIT: If I had used the verbose arg -V, I would've seen it:
$ DISPLAY= blugon -b scg -r -o -V; echo "exit code:$?"
DISPLAY environment variable not set
exit code:11
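The behaviour shown above (a message under -V, then exit status 11) is consistent with a guard along the following lines. This is only a hypothetical sketch, not blugon's actual code: the function names display_is_set and ensure_display are made up for illustration, and only the message text and the status 11 are taken from the output above.

```python
import os
import sys


def display_is_set(environ=None):
    """Return True if DISPLAY is present and non-empty in the given environment."""
    env = os.environ if environ is None else environ
    return bool(env.get("DISPLAY"))


def ensure_display(environ=None, verbose=False):
    """Exit with status 11 when DISPLAY is unset or empty (hypothetical guard)."""
    if not display_is_set(environ):
        if verbose:
            # Message text copied from the -V output in this thread.
            print("DISPLAY environment variable not set")
        sys.exit(11)
```

Note that a guard like this only catches an empty or missing DISPLAY. A DISPLAY value that is set but points at an unreachable server passes the check, so the backend itself still needs its own NULL check.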
EDIT2: I just wanted to see what happens when DISPLAY is set (wrongly), so that the segfault is still allowed to happen:
$ DISPLAY=1:1 blugon -b scg -r -o -V; echo "exit code:$?"
Calculated RGB Gamma values: 1.0 0.6949030005552019 0.4310480202110507
Provide current minute 282.75
Calling backend scg
glibc64:../sysdeps/posix/getaddrinfo.c:2201/getaddrinfo: scg[67804](full:'/usr/lib/blugon/scg') for user user(1000(eff:user(1000))) 1of2 attempting to resolve (requested)hostname:
1
glibc64:../sysdeps/posix/getaddrinfo.c:2201/getaddrinfo: scg[67804](full:'/usr/lib/blugon/scg') for user user(1000(eff:user(1000))) 2of2 successfully resolved requested hostname('1') which was not transformed('1') as follows:
0.0.0.1 1
[ 1183.775480] rejected:IN= OUT=net0 SRC=192.168.0.14 DST=0.0.0.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=9893 DF PROTO=TCP SPT=60802 DPT=6001 WINDOW=65535 RES=0x00 SYN URGP=0 UID=1000 GID=1000
[ 1183.775586] scg[67804]: segfault at e0 ip 000056441a2430dd sp 00007ffc26141770 error 4 in scg[56441a243000+1000]
[ 1183.775625] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 41 57 41 56 41 55 49 89 f5 41 54 55 89 fd 31 ff 53 48 83 ec 28 e8 a6 ff ff ff 48 89 c3 <48> 63 80 e0 00 00 00 48 89 df 48 c1 e0 07 48 03 83 e8 00 00 00 48
[ 1183.775686] potentially unexpected fatal signal 11.
[ 1183.775708] CPU: 3 PID: 67804 Comm: scg Kdump: loaded Tainted: G I 5.4.14-g0fce94b45b53 #47
[ 1183.775771] RIP: 0033:0x56441a2430dd
[ 1183.775778] glibc64:../sysdeps/posix/getaddrinfo.c:2201/getaddrinfo[67804]: scg[67804](full:'/usr/lib/blugon/scg') for user user(1000(eff:user(1000))) 1of2 attempting to resolve (requested)hostname:
[ 1183.775778] 1
[ 1183.775787] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 41 57 41 56 41 55 49 89 f5 41 54 55 89 fd 31 ff 53 48 83 ec 28 e8 a6 ff ff ff 48 89 c3 <48> 63 80 e0 00 00 00 48 89 df 48 c1 e0 07 48 03 83 e8 00 00 00 48
[ 1183.775790] RSP: 002b:00007ffc26141770 EFLAGS: 00010202
[ 1183.778935] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000056441b83c010
[ 1183.778935] RDX: 00007f1496d41a40 RSI: 0000000000000000 RDI: 00007f1496d419e0
[ 1183.778936] RBP: 0000000000000004 R08: 000056441b83c01a R09: 0000000000000000
[ 1183.778937] R10: 0000000000000000 R11: 00007f1496d41a40 R12: 000056441a243280
[ 1183.778937] R13: 00007ffc261418b8 R14: 0000000000000000 R15: 0000000000000000
[ 1183.778938] FS: 00007f1496b19d80 GS: 0000000000000000
[ 1183.791741] systemd[1]: Started Process Core Dump (PID 67805/UID 0).
[ 1183.913768] glibc64:../sysdeps/posix/getaddrinfo.c:2201/getaddrinfo[67804]: scg[67804](full:'/usr/lib/blugon/scg') for user user(1000(eff:user(1000))) 2of2 successfully resolved requested hostname('1') which was not transformed('1') as follows:
[ 1183.913768] 0.0.0.1 1
Traceback (most recent call last):
File "/usr/bin/blugon", line 558, in <module>
main()
File "/usr/bin/blugon", line 548, in main
while_body(get_minute(), 0)
File "/usr/bin/blugon", line 506, in while_body
[ 1184.302263] systemd-coredump[67806]: Process 67804 (scg) of user 1000 dumped core.
[ 1184.302263]
[ 1184.302263] Stack trace of thread 67804:
[ 1184.302263] #0 0x000056441a2430dd main (scg + 0x10dd)
[ 1184.302263] #1 0x00007f1496ba31b6 __libc_start_main (libc.so.6 + 0x271b6)
[ 1184.302263] #2 0x000056441a2432ae _start (scg + 0x12ae)
[ 1184.302263]
call_backend(BACKEND, red_gamma, green_gamma, blue_gamma)
[ 1184.313755] systemd[1]: systemd-coredump@1-67805-0.service: Succeeded.
File "/usr/bin/blugon", line 449, in call_backend
call_scg(r, g, b)
File "/usr/bin/blugon", line 423, in call_scg
check_call([MAKE_INSTALL_PREFIX + '/lib/blugon/scg', str(r), str(g), str(b)])
File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/lib/blugon/scg', '1.0', '0.6949030005552019', '0.4310480202110507']' died with <Signals.SIGSEGV: 11>.
exit code:1
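The traceback above ends in exit code 1 rather than 11 because subprocess.check_call raises CalledProcessError when the child is killed by a signal, and an uncaught Python exception ends the interpreter with status 1. The signal itself is reported as a negative returncode. The sketch below illustrates this on a POSIX system; describe_failure and CRASH are names invented for this example, and the child process simply sends SIGSEGV to itself instead of running scg.

```python
import signal
import subprocess
import sys


def describe_failure(cmd):
    """Run cmd; return None on success, else a short description of the failure."""
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as err:
        if err.returncode < 0:
            # On POSIX, a negative returncode -N means "terminated by signal N".
            return "killed by " + signal.Signals(-err.returncode).name
        return "exited with status " + str(err.returncode)
    return None


# Stand-in for a crashing backend: a child that sends SIGSEGV to itself.
CRASH = [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]

if __name__ == "__main__":
    print(describe_failure(CRASH))
```

Catching the exception in the caller, as sketched here, would let a wrapper report the crash in one line instead of a full traceback, though whether that is desirable is a design choice for the project.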
This is all as expected :)
Once the PR is merged, the segfaults should be avoided.
Thank you for investigating the issue 👍