dun/munge

Tests sometimes fail on a diskless NFS-mounted node

Closed · 4 comments

aprsa commented

I'm running make check on a diskless node that PXE-boots from NFS, and I'm running into intermittent test failures:

rm: cannot remove '/home/andrej/system/munge-0.5.14/t/trash-directory.0110-munged-origin-addr/log-23162/.nfs00000000052c05610000021c': Device or resource busy
FATAL: Cannot prepare test area
ERROR: 0110-munged-origin-addr.t - missing test plan
ERROR: 0110-munged-origin-addr.t - exited with status 1

I've had 8 tests fail with the same output. The failures seem perfectly harmless, and I'm guessing it's a race condition between the NFS lock file and rm. I figured I'd still report it in the hope that it helps.

dun commented

Thanks for reporting this. I suspect you're right about it being a race condition between NFS lock files and rm. I've worked around these types of errors in the past by moving the test suite's output files to /tmp.

SHARNESS_TRASH_DIRECTORY is defined in t/sharness.sh and becomes the HOME directory from which tests are run. You can't redefine it, but you can specify root, which is prepended to it. Assuming /tmp is a local, non-NFS filesystem, try running something like make check root=/tmp/munge-$$ and see if that helps.
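
A minimal sketch of that workaround, for anyone who wants to check the "/tmp is local" assumption before running the suite. For context, the .nfs... name in the error above is the placeholder an NFS client leaves behind when a file is unlinked while a process still has it open, which is why rm can report it as busy. The names test_root, is_nfs, and run_check are hypothetical helpers for illustration and are not part of MUNGE or sharness. The filesystem check assumes GNU coreutils stat; on other systems it simply reports "not NFS".

```shell
# Per-run scratch root; $$ expands to the current shell's PID.
# Assumption: /tmp is a local filesystem. Change the prefix if it is not.
test_root="/tmp/munge-$$"

# Hypothetical helper: succeed if the given path is on an NFS mount.
# Relies on GNU `stat -f -c %T`, which prints the filesystem type name.
# If stat fails (missing path, non-GNU stat), the path is treated as not NFS.
is_nfs() {
    case "$(stat -f -c %T "$1" 2>/dev/null)" in
        nfs*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Hypothetical wrapper: run the test suite with its trash directories
# created under $test_root instead of inside the NFS-mounted source tree.
run_check() {
    make check root="$test_root"
}

if is_nfs /tmp; then
    echo "warning: /tmp looks NFS-mounted; choose another local directory" >&2
fi
```

From the top of the source tree, call run_check after sourcing or pasting the snippet. The wrapper only passes root through to make, so the equivalent one-liner is the make check root=/tmp/munge-$$ command suggested above.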

I've been meaning to write a wiki page documenting the test suite and the variables that influence its behavior. I'll move that up on my list.

aprsa commented

Nice!

============================================================================
Testsuite summary for MUNGE 0.5.14
============================================================================
# TOTAL: 350
# PASS:  321
# SKIP:  25
# XFAIL: 4
# FAIL:  0
# XPASS: 0
# ERROR: 0
============================================================================

Thanks! :)

dun commented

Yay! I'll make sure to document this in the wiki.

dun commented

Now documented in the wiki.