nix-rust/nix

Make the tests that failed to acquire locks panic explicitly

SteveLauC opened this issue · 4 comments

We have some process-wide Mutex locks in test.rs to ensure tests that need to modify global states will be executed sequentially:

nix/test/test.rs

Lines 59 to 74 in a7db481

/// Any test that creates child processes must grab this mutex, regardless
/// of what it does with those children.
pub static FORK_MTX: std::sync::Mutex<()> = std::sync::Mutex::new(());
/// Any test that changes the process's current working directory must grab
/// the RwLock exclusively. Any process that cares about the current
/// working directory must grab it shared.
pub static CWD_LOCK: RwLock<()> = RwLock::new(());
/// Any test that changes the process's supplementary groups must grab this
/// mutex
pub static GROUPS_MTX: Mutex<()> = Mutex::new(());
/// Any tests that loads or unloads kernel modules must grab this mutex
pub static KMOD_MTX: Mutex<()> = Mutex::new(());
/// Any test that calls ptsname(3) must grab this mutex.
pub static PTSNAME_MTX: Mutex<()> = Mutex::new(());
/// Any test that alters signal handling must grab this mutex.
pub static SIGNAL_MTX: Mutex<()> = Mutex::new(());

This should work well if the corresponding test succeeds, but once a test holding the lock failed, the thread running that test panicked, the lock got poisoned, then other tests (threads) cannot acquire the lock, and since we don't check if the lock acquisition is successful in our tests, under such scenario, those global locks won't work at all.

let _m = crate::FORK_MTX.lock();

Given that those global locks do not work under the above case, we can get some unexpected results from our test, e.g., see these 2 failures here, they want to wait for the child process created by themselves, but got the children of others:

---- test_pty::test_forkpty stdout ----
thread 'test_pty::test_forkpty' panicked at 'assertion failed: `(left == right)`
  left: `Signaled(Pid(4405), SIGTERM, false)`,
 right: `Signaled(Pid(4628), SIGTERM, false)`', test/test_pty.rs:273:13

---- test_unistd::test_wait stdout ----
thread 'test_unistd::test_wait' panicked at 'assertion failed: `(left == right)`
  left: `Ok(Signaled(Pid(4628), SIGTERM, false))`,
 right: `Ok(Exited(Pid(4733), 0))`', test/test_unistd.rs:112:13

In my mind, I am thinking about calling .unwrap() on lock() so that the tests that cannot acquire the lock will panic explicitly, which would make things clearer, though I cannot say it is generally a better approach. cc @asomers

Well, I just realized that we are using std::sync::Mutex ONLY for FORK_MTX, so the problem described in this issue only applies to it.

Another update, found that we switched to parking_lot just for the feature of no poison in #1599, and in #2107, we reverted that change for FORK_MTX, is this a mistake or it was done on purpose?

Another update, found that we switched to parking_lot just for the feature of no poison in #1599, and in #2107, we reverted that change for FORK_MTX, is this a mistake or it was done on purpose?

No, I think it was accidental to switch from parking_lot back to std in #2107 .

Ok, I will change it back tomorrow