oxidecomputer/propolis

PHD: raw buffer backend doesn't properly consume characters after matching

Closed this issue · 0 comments

The raw buffer's satisfy_or_set_wait function doesn't do what its doc comment says:

/// Attempts to satisfy the wait described by `waiter` or, if the wait
/// cannot yet be satisfied, stores it to be checked again later.
///
/// A wait is satisfied if the `wait_buffer` contains the string in the
/// supplied waiter. When this happens, all of the characters preceding the
/// match are sent to the output channel in the supplied `waiter`, the
/// matching characters are removed, and the remainder of the wait buffer
/// is preserved.
///
/// If the buffer contains multiple matches, the *last* match is used to
/// satisfy the wait.
///
/// # Panics
///
/// Panics if a wait is already set (irrespective of whether the new wait
/// actually needs to be stored).
fn satisfy_or_set_wait(&mut self, waiter: OutputWaiter) {
assert!(self.waiter.is_none());
trace!(
contents = self.wait_buffer,
target = waiter.wanted,
"checking wait on raw serial buffer"
);
if let Some(idx) = self.wait_buffer.rfind(&waiter.wanted) {
let out = self.wait_buffer[..idx].to_owned();
// Because incoming bytes from Propolis may be processed on a
// separate task than the task that registered the wait, this
// can race such that the wait is satisfied just as the waiter
// times out and closes its half of the channel. There's nothing
// to be done about this, so just ignore any errors here.
let _ = waiter.preceding_tx.send(out);
} else {
self.waiter = Some(waiter);
}
}
}

This copies the buffer contents that preceded the match string so they can be returned to the caller, but doesn't actually clear the wait buffer.

This turns out not to be a problem today because our existing test cases rely on run_shell_command to send commands to the guest, and the default shell command sequence explicitly clears the back buffer of its own volition (specifically to keep the newlines it sends from appearing in command output). But new test cases that can't use run_shell_command for some reason (e.g. a guest reboot test that's not guaranteed to get another shell prompt after inputting the reboot command) are likely to run afoul of this.