1.2.1 FTBFS on mips (test failures)
onlyjob opened this issue · 5 comments
1.2.1 introduced regression(s) causing some tests to fail on all mips architectures:
https://buildd.debian.org/status/package.php?p=dumb-init&suite=sid
https://buildd.debian.org/status/fetch.php?pkg=dumb-init&arch=mips&ver=1.2.1-1&stamp=1516150508&raw=0
https://buildd.debian.org/status/fetch.php?pkg=dumb-init&arch=mips64el&ver=1.2.1-1&stamp=1516150192&raw=0
https://buildd.debian.org/status/fetch.php?pkg=dumb-init&arch=mipsel&ver=1.2.1-1&stamp=1516150253&raw=0
Thanks for reporting this! My guess is that this is a test flake (or similar), since there are barely any code changes between 1.2.0 and 1.2.1 (just a one-line bugfix), but it's hard to say for sure. Unfortunately, I'm not really sure how to test this without access to a mips machine (it looks like QEMU might work?).
Is it possible to rerun the build on mips* to check if it's failing consistently or just flaky?
I don't have access to any mips hardware... I think the tests fail consistently, since all mips architectures are affected (not just one or two)...
1.2.2 is even worse, it fails on armel and armhf as well: https://buildd.debian.org/status/package.php?p=dumb-init&suite=sid
That's really odd... I was almost certain 1.2.2 would improve things since it includes #174 which was causing some test flakiness for us, especially in some environments like Travis.
Linking in a specific build log: https://buildd.debian.org/status/fetch.php?pkg=dumb-init&arch=armel&ver=1.2.2-1&stamp=1548572571&raw=0 (not sure if these logs are kept forever, so I've also backed it up here).
The captured stderr for the failed test is kind of interesting:
[dumb-init] Running in debug mode.
[dumb-init] Unable to detach from controlling tty (errno=25 Inappropriate ioctl for device).
[dumb-init] Child spawned with PID 9146.
[dumb-init] Unable to attach to controlling tty (errno=25 Inappropriate ioctl for device).
[dumb-init] setsid complete.
[dumb-init] Received signal 17.
[dumb-init] A child with PID 9146 exited with exit status 0.
[dumb-init] Forwarded signal 15 to children.
[dumb-init] Child exited with status 0. Goodbye.
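For reading the log above, the signal numbers can be mapped to names with Python's standard `signal` module. This is only a reading aid, and `signal_name` is a hypothetical helper, not something from the dumb-init test suite. Signal numbering is platform-specific, so the lookup should be run on the same kind of system that produced the log; on Linux for ARM and x86, 15 is SIGTERM and 17 is SIGCHLD.

```python
import signal


def signal_name(num):
    """Hypothetical helper: map a signal number to its name on this platform."""
    try:
        return signal.Signals(num).name
    except ValueError:
        # The number is not a known signal on the platform running this code.
        return 'unknown({})'.format(num)


# The two signal numbers that appear in the captured stderr.
# Note: the result for 17 depends on the platform this runs on.
names = {num: signal_name(num) for num in (15, 17)}
for num, name in sorted(names.items()):
    print(num, name)
```

Read that way, the log would say dumb-init saw its child exit (SIGCHLD, exit status 0) before the SIGTERM was forwarded.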
Both of the tracebacks look like this:
# read a line from print_signals, figure out its pid
line = proc.stdout.readline()
match = re.match(b'ready \(pid: ([0-9]+)\)\n', line)
> assert match, line
E AssertionError:
E assert None
Weird that there's an empty line that gets read, I wonder where that comes from...
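One possible reading of the empty line, offered as a guess rather than a diagnosis: `readline()` on a pipe returns `b''` only at EOF, so an empty `line` would mean the child closed stdout (or exited) before printing its `ready` line. A minimal sketch of how a test could tell those cases apart; `read_ready_pid` and `spawn` are hypothetical helpers, and the child commands are stand-ins, not the real print_signals script.

```python
import re
import subprocess
import sys

READY_RE = re.compile(rb'ready \(pid: ([0-9]+)\)\n')


def read_ready_pid(proc):
    """Hypothetical helper: read the 'ready' line and return the reported pid."""
    line = proc.stdout.readline()
    if line == b'':
        # readline() returns b'' only at EOF: the child closed stdout
        # (or exited) without ever printing a line.
        raise AssertionError(
            'EOF before ready line (returncode={})'.format(proc.poll())
        )
    match = READY_RE.match(line)
    assert match, line
    return int(match.group(1))


def spawn(code):
    """Run a tiny stand-in child with its stdout connected to a pipe."""
    return subprocess.Popen([sys.executable, '-c', code], stdout=subprocess.PIPE)


# A child that announces itself, as the test expects.
ok = spawn(
    "import os, sys; "
    "sys.stdout.buffer.write(b'ready (pid: %d)\\n' % os.getpid())"
)
ok_pid = read_ready_pid(ok)
ok.wait()

# A child that exits without printing anything: readline() hits EOF.
silent = spawn("pass")
try:
    read_ready_pid(silent)
    eof_seen = False
except AssertionError as exc:
    eof_seen = 'EOF' in str(exc)
silent.wait()
```

With a check like this, the failure message would at least distinguish "child went away" from "child printed something unexpected".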
Ultimately, I don't know how we're going to fix this without being able to reproduce it locally; there's just not enough information in the log. Maybe I can try to rig something up with QEMU so that we can test on these architectures.
It looks like dumb-init is building everywhere now:
https://buildd.debian.org/status/package.php?p=dumb-init&suite=unstable
I suspect this patch helped: https://sources.debian.org/src/dumb-init/1.2.2-1.1/debian/patches/increase-test-sleep-time.patch/
(We also applied that here in 0708f27 so that patch can be dropped on the Debian side after the next release.)
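As a general alternative to a longer fixed sleep (this is not what the patch does, and its contents are not reproduced here), a test can poll for a condition until a deadline, which tends to be less sensitive to slow build machines. A minimal sketch; `wait_until` is a hypothetical helper name and the timeout values are arbitrary.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Hypothetical helper: poll predicate() until it is true or time runs out.

    Returns True if the predicate became true, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage: a condition that becomes true after roughly 0.1 seconds...
start = time.monotonic()
became_true = wait_until(lambda: time.monotonic() - start > 0.1, timeout=2.0)

# ...and one that never does, so the helper gives up at the deadline.
never_true = wait_until(lambda: False, timeout=0.2)
```

A fast machine returns as soon as the condition holds, while a slow one gets the full timeout before the test is declared failed.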
Closing this out as I think this is fixed, let me know if I'm misunderstanding!