cmd/go: simple off-by-one example takes longer than expected to complete

Question

cmd/go: simple off-by-one example takes longer than expected to complete

Closed this issue 3 years ago · 8 comments

What version of Go are you using (`go version`)?

$ gotip version
go version devel go1.17-542e8c74e7 Fri Jun 4 15:59:32 2021 +0000 linux/amd64

What did you do?

I ported a simple example I've used to test go-fuzz in the past:

// +build gofuzz

package fuzz

func Example(data []byte) bool {
	if len(data) == 9 {
		if data[0] == 'G' && data[1] == 'O' && data[2] == 'P' && data[3] == 'H' && data[4] == 'E' && data[5] == 'R' && data[6] == 'S' && data[7] == '!' && data[8] == '!' && data[9] == '!' {
			return true
		}
	}
	return false
}

func Fuzz(data []byte) int {
	Example(data)
	return 0
}

⬇️

// +build gofuzzbeta

package fuzz

import (
	"testing"
)

// Example is a common fuzzing example that demonstrates an off-by-one/out-of-bounds 
// error which causes the program to crash. 
//
// Instead of checking that len(data) == 9 the correct code should be len(data) == 10.
func Example(data []byte) bool {
	if len(data) == 9 {
		if data[0] == 'G' && data[1] == 'O' && data[2] == 'P' && data[3] == 'H' && data[4] == 'E' && data[5] == 'R' && data[6] == 'S' && data[7] == '!' && data[8] == '!' && data[9] == '!' {
			return true
		}
	}
	return false
}

func FuzzOffByOne(f *testing.F) {
	f.Fuzz(func(t *testing.T, input []byte) {
		Example(input)
	})
}

$ gotip test -fuzz=FuzzOffByOne

What did you expect to see?

I've run the example several times before, and would generally expect a fuzzer to find a crash fairly fast, even without a corpus -- somewhere between a few seconds to a few minutes.

Using the same CPU and RAM configuration as the native fuzzer test, I was able to find it within seconds using go-fuzz:

What did you see instead?

With the new native fuzzer, it took ~24 hours to find a crash using 2GB of RAM and 1 CPU (from an n1-standard-2 instance):

Answer 1 · 2021-07-07T23:11:46.000Z

CC @katiehockman

Answer 2 · 2021-07-08T14:35:30.000Z

CC @golang/fuzzing

Answer 3 · 2021-07-09T16:47:06.000Z

This is an interesting comparison since the target expects a very specific input size. It seems plausible that go-fuzz is much more conservative about increasing the size of inputs during mutation (perhaps in relation to the initial size of the input), so when starting from an empty input you're more likely to hit an input which is 9 bytes long, whereas our mutators don't really take this into account, making it much more of a lottery.

Answer 4 · 2021-08-06T11:15:21.000Z

I have a similar example here: https://github.com/stevenjohnstone/go-beta-fuzzer-vs-libfuzzer/blob/main/fuzz.go. In this case, the number of bytes a mutator needs to "guess" is fewer so it completes in a reasonable time.

In that repo, I build a libfuzzer harness equivalent to a native fuzzer. Both use the "libfuzzer" build for instrumentation (inline 8-bit counters and cmp tracing hooks) so they have the same feedback (modulo some noise from supporting code) for executions of the function under test. I disable the cmp tracing functionality during libfuzzer runs to make sure I'm comparing apples with apples.

Initial experiments seem to indicate that libfuzzer is able to find a crasher about 100x faster than the native fuzzer for my very artificial example. I don't think this can be used as judgement on effectiveness without a lot of careful thought but I think it's interesting that we have an alternative implementation available for comparisons.

Answer 5 · 2021-11-15T18:24:49.000Z

Coming back to look at this again, the example case is a great example of something fuzzers are generally not great at without additional instrumentation strategies. In order to hit the out-of-bounds access, the fuzzer first has to generate a very particular string (GOPHERS!!). Without comparison instrumentation, or more advanced REDQUEEN-esque input-to-state comparison mapping, using just coverage instrumentation is going to take a while to hit the failing condition via brute force mutations.

I think we should probably just close this, in favor of #46507 which details more advanced strategies the fuzzer could incorporate in the future.

Answer 6 · 2021-11-15T19:02:00.000Z

@rolandshoemaker I don't think this needs any instrumentation of comparisons to work. If I build the example as

package fuzz

func FuzzLibFuzzer(data []byte) int {
	if len(data) == 9 {
		if data[0] == 'G' && data[1] == 'O' && data[2] == 'P' && data[3] == 'H' && data[4] == 'E' && data[5] == 'R' && data[6] == 'S' && data[7] == '!' && data[8] == '!' && data[9] == '!' {
			panic("found")
		}
	}
	return 0
}

and compile it for use with libfuzzer (with https://github.com/stevenjohnstone/go114-fuzz-build and gotip as the go compiler to make the comparison fair), then the fuzzer finds the crasher in a few seconds even with comparison tracing disabled:

$ ./fuzz.libfuzzer -use_cmp=0
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1335628375
INFO: 71 Extra Counters
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2	INITED ft: 3 corp: 1/1b exec/s: 0 rss: 27Mb
#817	NEW    ft: 5 corp: 2/10b lim: 11 exec/s: 0 rss: 27Mb L: 9/9 MS: 5 ChangeByte-ChangeBit-InsertRepeatedBytes-CrossOver-CrossOver-
#158588	NEW    ft: 6 corp: 3/19b lim: 1570 exec/s: 158588 rss: 27Mb L: 9/9 MS: 1 InsertRepeatedBytes-
#165640	NEW    ft: 7 corp: 4/28b lim: 1640 exec/s: 165640 rss: 27Mb L: 9/9 MS: 2 ChangeBit-ChangeBit-
#262144	pulse  ft: 7 corp: 4/28b lim: 2600 exec/s: 131072 rss: 27Mb
#272207	NEW    ft: 8 corp: 5/37b lim: 2699 exec/s: 136103 rss: 27Mb L: 9/9 MS: 2 ChangeBit-ChangeByte-
#331563	NEW    ft: 9 corp: 6/46b lim: 3282 exec/s: 165781 rss: 27Mb L: 9/9 MS: 1 ChangeBinInt-
#371205	NEW    ft: 10 corp: 7/55b lim: 3667 exec/s: 123735 rss: 27Mb L: 9/9 MS: 2 ChangeBinInt-ChangeBinInt-
#524288	pulse  ft: 10 corp: 7/55b lim: 4096 exec/s: 131072 rss: 27Mb
#564466	NEW    ft: 11 corp: 8/64b lim: 4096 exec/s: 141116 rss: 27Mb L: 9/9 MS: 1 ChangeByte-
#1048576	pulse  ft: 11 corp: 8/64b lim: 4096 exec/s: 131072 rss: 27Mb
#1346633	NEW    ft: 12 corp: 9/73b lim: 4096 exec/s: 122421 rss: 27Mb L: 9/9 MS: 2 ChangeBit-ChangeBit-
#1420584	NEW    ft: 13 corp: 10/82b lim: 4096 exec/s: 129144 rss: 27Mb L: 9/9 MS: 1 CrossOver-
panic: runtime error: index out of range [9] with length 9

goroutine 17 [running, locked to thread]:
github.com/stevenjohnstone/fuzz.FuzzLibFuzzer(...)
	github.com/stevenjohnstone/fuzz/fuzz.go:5
main.LLVMFuzzerTestOneInput(0x2ebba40, 0x9)
	./main.3023831966.go:21 +0x29f
==108767== ERROR: libFuzzer: deadly signal
==108767==WARNING: invalid path to external symbolizer!
==108767==WARNING: Failed to use and restart external symbolizer!
    #0 0x4b0694  (/home/stevie/47090/fuzz.libfuzzer+0x4b0694)
    #1 0x459a28  (/home/stevie/47090/fuzz.libfuzzer+0x459a28)
    #2 0x43da63  (/home/stevie/47090/fuzz.libfuzzer+0x43da63)
    #3 0x7ff093d5651f  (/lib/x86_64-linux-gnu/libc.so.6+0x4651f)
    #4 0x504080  (/home/stevie/47090/fuzz.libfuzzer+0x504080)

NOTE: libFuzzer has rudimentary signal handlers.
      Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
MS: 1 CopyPart-; base unit: 66afe1da9d562cf1c22adbde30a1f2415bb2e01f
0x47,0x4f,0x50,0x48,0x45,0x52,0x53,0x21,0x21,
GOPHERS!!
artifact_prefix='./'; Test unit written to ./crash-af006d47aa10b88943df8c1cf8545aec346e9c09
Base64: R09QSEVSUyEh

I think you hit the nail on the head in #47090 (comment): the fuzzer tries really long inputs. When I instrument with bpftrace and run the golang native fuzzer for 30 seconds, I see

@len: 
[0]                   52 |                                                    |
[1]                 1989 |                                                    |
[2, 4)              1650 |                                                    |
[4, 8)              5891 |                                                    |
[8, 16)            18321 |@                                                   |
[16, 32)           31080 |@@                                                  |
[32, 64)           58341 |@@@@                                                |
[64, 128)         107252 |@@@@@@@                                             |
[128, 256)        187963 |@@@@@@@@@@@@@                                       |
[256, 512)        284687 |@@@@@@@@@@@@@@@@@@@@                                |
[512, 1K)         438991 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                      |
[1K, 2K)          617222 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@         |
[2K, 4K)          738464 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4K, 8K)          695546 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    |
[8K, 16K)         216449 |@@@@@@@@@@@@@@@                                     |
[16K, 32K)         17991 |@                                                   |

Answer 7 · 2021-11-15T20:56:58.000Z

Oh hm, now looking at the mutation code for the ~third time, I'm seeing some somewhat unexpected behavior. With a handful of architectural tweaks I can mostly match libFuzzer performance on this particular target.

I need to stare at this a little bit longer to figure out the correct solution.

Answer 8 · 2021-11-17T17:01:09.000Z

Change https://golang.org/cl/364214 mentions this issue: internal/fuzz: limit number of consecutive mutations

What version of Go are you using (go version)?

What did you do?

What did you expect to see?

What did you see instead?

What version of Go are you using (`go version`)?