Question: Handlers, volatile and visibility
bturner opened this issue · 5 comments
Context:
In the README for this project, there are a couple handler examples that both follow the same pattern:
- Assign a local (to the handler)
process
field with theNuProces
passed toonStart
- Use the
process
inonStdout
When using NuProcessBuilder.start
, it's guaranteed (as far as I can tell) that onStart
and onStdout
will be called on different threads. That implies it may be necessary to mark the process
field in the handler volatile
to ensure the value assigned in onStart
is visible on onStdout
. However, none of the examples (or test handlers) do so.
Question:
What sorts of visibility assurances come with handlers? What enforces them? The Process*
types that run the I/O loops for processes utilize various volatile
fields, as well as some java.util.concurrent
types. Are those usages enough to ensure sufficient memory barriers are in place "around" handler callbacks that handlers don't need to use volatile
explicitly?
I'm just trying to get a clear sense of how much the thread mismatch between onPreStart
/onStart
, which are always called on the start()
ing thread, and the other on*
methods, which (aside from onExit
) are always called on a pump thread, needs to be accounted for by handler implementations. onExit
seems to be somewhat unique in that, at least how I read the code, in certain cases it's possible for it to be called on the onPreStart
/onStart
thread (e.g. if starting a process fails) and in others on the pump thread.
On a related note, I've found that the delivery order for onStart
vs. onStdout
/onStderr
/onStdinReady
can race, when using NuProcessBuilder.start
. Because the process is registered with the pump thread before onStart
is called, it's possible for the pump thread to deliver its first callback before onStart
is called. One wouldn't expect that to happen, given the "distance", in terms of operations, between the pump thread registration and a callback, but I've conclusively seen it happen any number of times in testing. I was able to reproduce it pretty readily on a 2-core Linux VM.
This means, visibility aside, if anything in onStd*
depends on work done in onStart
(like getting access to the NuProcess
to call writeStdin
, for example), you can get random failures due to the onStd*
callback happening before onStart
(which is exactly how I found the problem).
(Note that onStart
racing isn't possible when using the new NuProcessBuilder.run
, because onStart
is always called before the thread enters the I/O processing loop.)
If you need the NuProcess
instance reference before any possible call to onStdout/onStderr/onStdinReady
, you need to use onPreStart
.
The JavaDoc for NuProcessHandler
is explicit in both cases.
onStart
Note that this method is called at some point after the process is spawned. It is possible for other methods (even {@link #onExit(int)}) to be called first. If you need a guarantee that no other methods will be called first, use {@link #onPreStart(NuProcess)} instead.
onPreStart
Unlike the {@link #onStart(NuProcess)} method, this method is invoked before the process is spawned, and is guaranteed to be invoked before any other methods are called.
I'll readily admit I overlooked the documentation. Apologies. It seems like the README
should perhaps be changed to set the nuProcess
field in onPreStart
instead of onStart
, though. While the example code may be rendered "safe" by the specific behaviors of cat
, developers are going to see that as an example of how to write a handler and end up with an implementation that can race, unless they also read the Javadocs to know to avoid it.
That said, it feels unnecessary that it work that way. onPreStart
and onStart
are not semantically the same, since the former is always called--even if the process fails to start--and there are tasks that you only want to do once that logically make sense to only do when you know the process has actually started. Recording the NuProcess
instance was only a single, specific example of something that a handler could opt to do in onStart
that, with the current handling, can cause problems. It's by no means the only such operation. Handlers essentially can't set any state in onStart
if that state is required in an onStd*
method, unless they manage their own synchronization.
For the moment, we've done exactly that, and implemented our own handling that eliminates this race, to ensure developers writing process handlers in Bitbucket Server observe reliable delivery where onStart
is always called before any I/O operations.
Setting that aside, though, my bigger questions remain and are not addressed in the Javadocs: What sort of visibility and "happens-before" guarantees do NuProcessHandler
s have? Since onPreStart
and onStart
are run on the starting thread and the rest are run on the pump thread, do fields need to be volatile
, or otherwise impose their own visibility/"happens-before" rules, or does the way NuProcess is implemented elsewhere ensure them already? And, if this is known, can it be included in the documentation? The README
shows a non-volatile
private NuProcess nuProcess;
being initialized in onStart
and used in onStdout
, which implies the field doesn't need to be volatile
for that assignment to be visible across threads.
Thanks for your help!
I agree re: the README, the example should probably use onPreStart
. onPreStart
was added perhaps a year after the first release.
I would say do not rely on any memory fences that may or may not exist in the library, as it is always possible they may change. I would recommend using something like AtomicReference
if access to the instance is required in the handler callback methods. AtomicReference
may prove more performant that volatile
, but testing would be required to know for sure.
I agree with the statement:
Handlers essentially can't set any state in onStart if that state is required in an onStd* method, unless they manage their own synchronization.
But perhaps, unlike you, I do not see this as a problem. All other synchronization is up to the user.