fill_buffer doesn't match doc - never returns NotEnoughData so can't re-fill arbitrary
Ten0 opened this issue · 5 comments
The fill_buffer documentation specifies:
Lines 392 to 393 in e0a5bc7
However the actual implementation is:
Lines 406 to 414 in e0a5bc7
This silently pads the buffer with zeroes when the data runs short, and never returns an error.
This in turn implies that code such as:
let line: SomeStructThatDerivesArbitrary = match unstructured.arbitrary() {
Ok(ok) => ok,
Err(arbitrary::Error::NotEnoughData) => {
std::mem::drop(unstructured);
random_data.clear();
let mut trng = rand::thread_rng();
random_data.extend(
(0..1000)
.map(|_| trng.sample::<u8, _>(rand::distributions::Standard)),
);
unstructured = arbitrary::Unstructured::new(&random_data);
unstructured.arbitrary()?
}
Err(e) => return Err(e.into()),
};
actually always generates the same SomeStructThatDerivesArbitrary once the initial arbitrary::Unstructured is exhausted, because the Err(NotEnoughData) arm is never reached and the re-fill never happens. (Which in my case is right at the start.)
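As an illustration of this failure mode, here is a minimal sketch (assuming the zero-filling fill_buffer behavior described above): an exhausted Unstructured keeps yielding the same value instead of reporting NotEnoughData.

```rust
use arbitrary::Unstructured;

fn main() -> arbitrary::Result<()> {
    // Already-empty input: the documented behavior would be Err(NotEnoughData).
    let data: &[u8] = &[];
    let mut u = Unstructured::new(data);
    let a: u64 = u.arbitrary()?; // observed today: Ok(0), the buffer was zero-filled
    let b: u64 = u.arbitrary()?; // Ok(0) again - the same value, forever
    assert_eq!(a, b);
    Ok(())
}
```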
fill_buffer should match the documentation and return NotEnoughData when the remaining data is too small, instead of silently zero-filling; otherwise there is a real risk that users explore a much smaller space of arbitrary values than they expected.
I note this was initially the case, but was changed in 1d2f653.
This commit makes the assumption that fuzzers will reduce their inputs by reducing the number of input bytes, and couldn't perform reduction by turning some input bytes into zeroes - which I think would probably be the correct thing for them to do when using arbitrary.
From that commit:
Next breaking release that we make, I think we can remove the fallibility from the Arbitrary trait and all the Unstructured methods. The only tricky one is Unstructured::get_bytes, but I think we can either loosen its contract so that it doesn't always return a slice of length size, or remove the method completely in favor of Unstructured::fill_buffer.
So I think @fitzgen intends to get rid of the fallibility here? Though I feel like this particular flavor of fallibility does not hurt mutations.
I must have forgotten to update the docs when we changed this. This is an intentional change and that commit's message describes the motivation for this trade off:
We were previously unsure whether it was better to return an error and exit the fuzz target early, or to append "fake" data (like all zeros) when an Arbitrary implementation requested more data than the fuzzer gave us. We originally chose to exit early, and this commit is the first step in reversing that decision.

We didn't have any data to motivate either choice when we originally made the decision to exit early. What we had was an intuition that we shouldn't repeatedly waste time on test cases that are very similar to each other because they are all padded with the same fake data at the end. If we exited early, our thinking went, we would yield our time back to the fuzzer, letting it more efficiently explore the state space.

In practice, it has worked out okay, but not great. What we've been missing out on are chance mutations that cause us to explore new code paths. A random mutation made by the fuzzer that would otherwise take us to a new code path happens to have an invalid UTF-8 code point or not quite enough data, and then we return an error and exit early. If we avoided returning errors, then that random mutation would have led us to new code paths. A corollary is that test case reduction is more efficient as well, since the reduced input bytes are also more likely to succeed in generating a valid instance of the Arbitrary type.

In summary, exiting early yields time back to the fuzzer, giving it more chances to try new mutations, while avoiding early exits is more forgiving to random mutations, making them more likely to succeed at finding new code paths. This is a trade off, and we can't have both in the limit. My local, not-super-rigorous experiments are telling me that we made the wrong trade off originally, and that being forgiving with mutations to find new code paths easier is the better choice.
I'll update the docs for fill_buffer.
This is an intentional change and that commit's message describes the motivation for this trade off
Given that I pointed to the commit myself, I know this ^^'
The issue was about getting an answer to:
This commit makes the assumption that fuzzers will reduce their inputs by reducing the number of input bytes, and couldn't perform reduction by turning some input bytes into zeroes - which I think would probably be the correct thing for them to do when using arbitrary.
and
Though I feel like this particular flavor of fallibility does not hurt mutations.
and re-weighing that decision while taking the described use case into account.
So I wouldn't call it "resolved" by just saying it's intentional ^^'
This commit makes the assumption that fuzzers will reduce their inputs by reducing the number of input bytes, and couldn't perform reduction by turning some input bytes into zeroes - which I think would probably be the correct thing for them to do when using arbitrary.
It isn't that we are assuming that fuzzers can't turn input bytes into zeros. Of course they can. We found that in practice, with large and complex Arbitrary implementations like wasm-smith, we get much better test case reduction via cargo fuzz tmin with the "fill with zeroes" approach. That was evidence we used when weighing and choosing the trade off.
Though I feel like this particular flavor of fallibility does not hurt mutations.
The problem is that all integer Arbitrary implementations are based on fill_buffer. So we would have to update that code to use zero if there was an error. And that alone isn't hard, but it creates the cognitive burden of having to remember to do this dance every time we introduce a new call to this method, to make sure we do the right thing. Instead, we can do the right thing once, inside fill_buffer, and never worry about it again.
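For illustration, a rough sketch of the two options being weighed here (not the crate's actual code; the function names are made up): with a fallible fill_buffer every call site has to remember the zero fallback, whereas today's infallible fill_buffer does it once, internally.

```rust
use arbitrary::{Result, Unstructured};

// Option A (hypothetical fallible fill_buffer): every integer impl would have
// to remember the zero fallback at each call site.
fn u32_with_fallible_fill(u: &mut Unstructured<'_>) -> Result<u32> {
    let mut buf = [0u8; 4];
    if u.fill_buffer(&mut buf).is_err() {
        buf = [0u8; 4]; // forget this and behavior diverges between call sites
    }
    Ok(u32::from_le_bytes(buf))
}

// Option B (today's behavior): fill_buffer itself pads with zeroes, so call
// sites stay simple and cannot forget the fallback.
fn u32_with_infallible_fill(u: &mut Unstructured<'_>) -> Result<u32> {
    let mut buf = [0u8; 4];
    u.fill_buffer(&mut buf)?; // currently always Ok
    Ok(u32::from_le_bytes(buf))
}
```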
In general, I think we should remove fill_buffer, or make it an internal implementation detail. External users are better served by Unstructured::bytes, where they can choose what to do with the data rather than having fill_buffer's copy foisted upon them 100% of the time.
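A hedged sketch of what that could look like for an external user (the function is illustrative, and the exact shortfall behavior of Unstructured::bytes is an assumption here): the caller borrows the bytes and decides for itself what a shortfall means, instead of receiving a silently zero-padded copy.

```rust
use arbitrary::{Error, Result, Unstructured};

// Borrow the next four bytes directly and handle a shortfall explicitly,
// rather than having fill_buffer copy and zero-pad behind our back.
fn checksum_of_next_four(u: &mut Unstructured<'_>) -> Result<u8> {
    let slice: &[u8] = u.bytes(4)?;
    if slice.len() < 4 {
        // Assumption: treat a short slice as exhaustion; the caller chooses.
        return Err(Error::NotEnoughData);
    }
    Ok(slice.iter().fold(0u8, |acc, b| acc.wrapping_add(*b)))
}
```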
We found that in practice, with large and complex Arbitrary implementations like wasm-smith, we get much better test case reduction
But for this case, couldn't the fuzzer, or its interface with arbitrary, be parameterized to perform reduction not by lowering the number of input bytes, but by setting them to zero? (In the case where the fuzzer itself doesn't support it, the interface with Arbitrary could enlarge the input handed to arbitrary and pad the rest with zeroes, if that performs better.)
It looks like this would also solve the problem (exact same behavior as now, no less performant), while avoiding silent failures, which notably become an issue in uses more general than fuzzing (e.g., the example above).
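A small sketch of that "pad at the interface" idea (the function name and the 4096-byte budget are purely illustrative, not an existing API): the fuzzer's raw input is extended with trailing zeroes before building the Unstructured, so short inputs behave exactly like today's zero-filling while fill_buffer itself could keep its documented NotEnoughData contract.

```rust
// Hypothetical glue between the fuzzer and `arbitrary`: pad the raw input
// with trailing zeroes up to a fixed budget before constructing Unstructured.
fn padded_input(raw: &[u8], budget: usize) -> Vec<u8> {
    let mut data = raw.to_vec();
    if data.len() < budget {
        data.resize(budget, 0); // "reduction by zeroing" happens at this layer
    }
    data
}

// Usage inside a fuzz target body (sketch):
// let data = padded_input(fuzzer_bytes, 4096);
// let mut u = arbitrary::Unstructured::new(&data);
// let value: MyType = u.arbitrary()?;
```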
I think that without changes like this silent zero-filling, the library can be used for a wider range of purposes than fuzzing - purposes its name suggests it could serve - while moving in this direction makes it unusable for simply generating random inputs, which would be useful in benchmarking tools, etc.