"EOF reached" propagates to "Illegal State Exception"
sohah opened this issue · 4 comments
Hi,
Thanks for building and maintaining such a useful tool! I have two quick points:
- I think there is a slightly undesirable behavior when the
ArbitraryLengthStringGenerator
exceeds theMAX_INPUT_SIZE
, which results in JQF throwingIllegal State Exception
instead ofEOF Exception
.
More precisely, this happens becauseStreamBackedRandom
throws an exception here, yet to be caught inInputStreamGenerator
here. This results in JQF resuming execution only to crash when the state is corrupted here. This behaviour is a bit unexpected, and would be better to just crash on the first exception, as it is indeed the reason for the problem.
I am attaching the code below to allow for replication of the behavior.
- Also, I observed that JQF includes some implementation for
ExecutionIndexingGuidance
, but it is not turned on by default. However, it sounds to be more accurate than Zest's Guidance with just LinearInput, any reason why this is the case? Was there any performance issues using theExecutionIndexingGuidance
when compared with Zest's Guidance?
Thanks!
@RunWith(JQF.class)
public class HelloJQFTest {
@Fuzz
public void fuzzHelloJQF(@From(ObjGenerator.class) List<String> list) throws IllegalArgumentException, IOException {
(new HelloJQF()).handleRequest(list);
}
}
public class ObjGenerator extends Generator<List> {
public ObjGenerator() {
super(List.class);
}
@Override
public ArrayList<String> generate(SourceOfRandomness r, GenerationStatus status) {
ArrayList<String> outputs = new ArrayList<>();
while (r.nextBoolean()) {
outputs.add(new ArbitraryLengthStringGenerator().generate(r, status));
}
return outputs;
}
}
public class HelloJQF {
public void handleRequest(List<String> obj) throws IOException {
for(String str : obj){
System.out.println("executing the generated object");
}
}
}
Hi @sohah. Thanks for the report and detailed repro with points to code snippets from JQF.
- I think the main issue is here is that your generator is creating multiple instances of
ArbitraryLengthStringGenerator
. The generators based onInputStream
s essentially consume every byte of "input" that the Guidance has available until EOF. These generators are expected to be used as a single instance at the top level (i.e., in a@From
annotation in a single-argument@Fuzz
method) for AFL-style fuzzing, because it guarantees that the input files that JQF saves in thefuzz-results/corpus
directory actually represent the ENTIRE input byte-for-byte. These generators cannot be easily composed, e.g. to generate aList<String>
as you have in your example. You are bound to get anIllegalStateException
if you ask for more bytes after reaching EOF. Perhaps the error reporting can be improved, though, and I'm open to suggestions on how to handle this case. ForList<String>
, I would insist using the regularStringGenerator
or some derivation of it (e.g. we have aDictionaryBackedStringGenerator
in the examples module). - Re:
ExecutionIndexingGuidance
: it is an experimental implementation that sort-of works but has not yet provided us with superior results, mainly because of the run-time overhead of tracking execution indexes during fuzzing. The guidance is not super well documented and no research results have been published so far. Use at your own risk, if you are able to make sense of the code.
Thanks, @rohanpadhye for the reply. I have two points to make regarding our discussion on the InputStream-based generators:
-
If I understand the code correctly, the
InputStreamGenerator
, reads lazily, i.e., when aread()
, or areadAll()
is invoked on the stream for as many bytes as needed. Thus, it isn't really filling all theLinearInput
, but rather it does so based on a controlled read. If one invokesreadAll()
, it will indeed fill all the available space in theLinearInput
and EOF will be hit, but if we only do a couple of reads, usingread(byte)
(here), then this behavior is avoidable. I think this is the intention behind theArbitraryLengthStringGenerator
, which is bounded by themaxSize
of 4096, and so if I make theMAX_INPUT_SIZE
really large, or dually, make thatmaxSize
smaller, then in principle, we can compose multipleArbitraryLengthStringGenerator
generators together without hitting the EOF. Am I making any false assumptions here? -
I do not understand why would we want to catch the EOF exception in the
InputStreamGenerator
(here). It seems to me that if we hit that, then it is safer to just not continue, after all we are crashing anyway later on because ofIllegalStateException
. The fix for me would be to remove that catch statement, but perhaps this catch statement needed for some other reason that I am not aware of.
- The code snippet you linked reads all bytes up to the max buffer size. The intention is to consume the rest of the stream in its entirety, with a fixed upper bound to prevent fuzzing from getting slowed down by ridiculously large inputs. Perhaps the 4096 number should be configurable instead of hard-coded, but that is the intention of the bound. No more bytes should be consumed after
maxSize
. TheZestGuidance
has a larger bound forMAX_INPUT_SIZE
because more bytes may be needed for other types of generators which use some random bytes for control decisions, not just the data to generate (as in strings). However, theInputStreamGenerator
is most often used withAFLGuidance
, where the guidance only produces fixed-size inputs (and not extensible streams likeLinearInput
in Zest). - We need a way for the InputStream to signal EOF to its reader, which is done by returning a value of -1 (as per
java.io.InputStream
API). Let's say the fuzzer generates only a string of 10 bytes, which is much smaller than the MAX_INPUT_SIZE or any other bound, then when JQF attempts to get a new random byte from theSourceOfRandomness
it will throw anIllegalStateException
(because the randomness source doesn't know about which generator is being used; all it does is signal no more random bytes are available), and we want to convert this to an EOF from the point of view of the InputStream consumer (since this consumer does not know about the underlying tricks implemented by JQF). Note that all of this is most often applied when fuzzing withAFLGuidance
, since AFL generates only fixed-size input files (which JQF will use to back theStreamBackedRandom
source). You may not hit this condition withZest
, which can extend itsLinearInput
on demand.
@rohanpadhye , thank you for elaborating on the assumptions in building the InputStreamGenerator
and its subclasses, and the need for -1 return value. For now, I am no longer using these generators with ZestGuidance
. However, as far as I can see besides the above misleading message, and though they weren't built for that purpose, they seem to be composing just fine when used with ZestGuidance
, if MAX_INPUT_SIZE
is sufficiently larger than maxSize
.
I appreciate your help, I'm closing this issue now.