select relevant third-party tests to run their tests with garble on CI
mvdan opened this issue · 6 comments
For example, #190 shows us that programs using basic protobuf-generated code panic at init time. We could (and probably should) try to minimize the failure into a test case in garble, but we should also ensure that garble test ./... in a recent version of protobuf succeeds.
I propose this short list of projects as a start:
- https://pkg.go.dev/google.golang.org/protobuf, since protos are very common, and it's a good stress test of generated code and reflection
- https://pkg.go.dev/github.com/sirupsen/logrus, a pretty common and non-trivial logging library
- https://pkg.go.dev/github.com/spf13/cobra, a common CLI tool framework, pulling other common libraries like spf13/pflag
If this works well, we can add more. The selection criteria should roughly be:
- Popularity - estimated, at this time
- Quality - modules whose tests fail at times or take too long aren't useful
- Diversity - a second CLI framework is less likely to introduce more edge cases
- Past bugs - those libraries which have surfaced garble bugs before, like protobuf
I think we could have a nested module for this, like testdata/libraries/go.mod. A libs.go file would import at least one package from each module, to keep go mod tidy happy. We could then use GOPRIVATE='*' garble test all or somesuch, to run obfuscated versions of all tests in direct and indirect dependencies.
Some notes:
- We probably want to use test's -short flag to save time, at least as a start.
- We should check that go test all succeeds first, to reduce confusion if one of the tests is just broken.
- Are we interested in running the tests of indirect dependencies? Probably yes, as test failures in those could break the direct dependencies.
- Ideally, the CI check would run everywhere, including pushes to master and PRs. In practice, we'll see how fast or slow it ends up being. Maybe we end up using a subset of the tests, or -short, for PRs.
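The ordering in these notes could be sketched as a small Go driver. This is only an illustration under assumptions: the `step` type and the `plan` and `run` helpers are hypothetical names, the directory is the nested module location suggested above, and `main` prints the commands rather than executing them.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// step is one command to run inside the nested module (hypothetical type).
type step struct {
	name string   // executable: "go" or "garble"
	args []string // arguments passed to it
}

// plan returns the commands in the order suggested in the notes: the plain
// "go test" run comes first, so a test that is broken on its own is not
// blamed on obfuscation.
func plan(short bool) []step {
	args := []string{"test"}
	if short {
		args = append(args, "-short")
	}
	args = append(args, "all")
	return []step{
		{name: "go", args: args},
		{name: "garble", args: args},
	}
}

// run executes the steps in dir with GOPRIVATE=* and stops at the first failure.
func run(dir string, steps []step) error {
	for _, s := range steps {
		cmd := exec.Command(s.name, s.args...)
		cmd.Dir = dir
		cmd.Env = append(os.Environ(), "GOPRIVATE=*")
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			return fmt.Errorf("%s %v: %w", s.name, s.args, err)
		}
	}
	return nil
}

func main() {
	// Dry run: only print what would be executed.
	for _, s := range plan(true) {
		fmt.Println(s.name, s.args)
	}
}
```

Calling `run("testdata/libraries", plan(true))` would perform the actual checks; dropping `-short` for pushes to master is a one-argument change.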
I forgot to say what is perhaps obvious: garble test std should be at the top of the list, because standard library packages are by far the most commonly used.
We already kind of cover this in goprivate.txt, since we do garble build std, but that's not running any tests.
Here's a complication: by design, some tests will fail when obfuscated:
$ wgo1 garble test -short runtime/debug
allocating objects
starting gc
starting dump
done dump
--- FAIL: TestStack (0.00s)
stack_test.go:63: expected "src/runtime/debug/stack.go" in "\truntime/debug/stack.go:24 +0x9f"
stack_test.go:63: expected "src/runtime/debug/stack_test.go" in "\truntime/debug_test/stack_test.go:16"
stack_test.go:63: expected "src/runtime/debug/stack_test.go" in "\truntime/debug_test/stack_test.go:19"
stack_test.go:63: expected "src/runtime/debug/stack_test.go" in "\truntime/debug_test/stack_test.go:41 +0x36"
stack_test.go:63: expected "src/testing/testing.go" in "\ttesting/testing.go:1194 +0xef"
FAIL
FAIL runtime/debug 0.026s
Unfortunately there isn't an opposite to -run. Such a flag wouldn't really help either, as -run doesn't allow selecting test names from specific packages.
Another option might be to use garble test -json, process the OK/FAIL results, and have a table of known-to-break tests which we expect to fail instead of succeed.
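A minimal sketch of that idea in Go, assuming the output of garble test -json is the same stream of JSON events that go test -json emits (with `Action`, `Package` and `Test` fields). The `check` function and the `"package.Test"` key format are hypothetical; the only table entry is the TestStack failure shown above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"os"
	"sort"
)

// event holds the fields of a "go test -json" event that matter here.
type event struct {
	Action  string
	Package string
	Test    string
}

// check compares per-test results in data against knownFail, a set of
// "package.Test" keys expected to fail when obfuscated (hypothetical format).
// It assumes data contains only JSON events.
func check(data []byte, knownFail map[string]bool) ([]string, error) {
	var problems []string
	pkgFailed := map[string]bool{} // packages that reported "fail"
	testFails := map[string]int{}  // failing tests seen per package
	dec := json.NewDecoder(bytes.NewReader(data))
	for {
		var ev event
		if err := dec.Decode(&ev); err == io.EOF {
			break
		} else if err != nil {
			return nil, err
		}
		if ev.Test == "" { // package-level event
			if ev.Action == "fail" {
				pkgFailed[ev.Package] = true
			}
			continue
		}
		key := ev.Package + "." + ev.Test
		switch ev.Action {
		case "fail":
			testFails[ev.Package]++
			if !knownFail[key] {
				problems = append(problems, "unexpected failure: "+key)
			}
		case "pass":
			if knownFail[key] {
				problems = append(problems, "unexpected pass: "+key)
			}
		}
	}
	// A package that failed without any failing test (for example a build
	// failure or a panic) is always a problem.
	for pkg := range pkgFailed {
		if testFails[pkg] == 0 {
			problems = append(problems, "package failed without a failing test: "+pkg)
		}
	}
	sort.Strings(problems)
	return problems, nil
}

func main() {
	known := map[string]bool{"runtime/debug.TestStack": true}
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	problems, err := check(data, known)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range problems {
		fmt.Println(p)
	}
	if len(problems) > 0 {
		os.Exit(1)
	}
}
```

Reporting unexpected passes as well keeps the table honest: an entry that starts passing should be removed rather than silently ignored. Keying by package and test name also sidesteps the limitation of -run noted above.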
As a first goal, this should check that the third-party packages build. #310 is a good example of when we can't even build due to garble bugs.
Another important point is that, for these relevant third-party codebases, we should try building them with a variety of GOOS/GOARCH targets that CI doesn't directly cover. For example:
- plan9/386 (uncommon OS, plus 32-bit arch)
- js/wasm (fairly different platform overall)
- freebsd/arm64 (relatively common OS and arch, but not covered by CI)
For instance, the first of those three would have caught #417 just by using GOPRIVATE=* garble build std.
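Cross-building for those targets could be sketched as below. The `target` type and the `buildCmd` helper are hypothetical names used for illustration; the three targets are the ones listed above, and `main` only prints the commands instead of running them.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// target is a GOOS/GOARCH pair to cross-build for (hypothetical type).
type target struct{ goos, goarch string }

// targets lists the platforms named above that CI does not cover directly.
var targets = []target{
	{"plan9", "386"},
	{"js", "wasm"},
	{"freebsd", "arm64"},
}

// buildCmd prepares "garble build pkgs..." for one target. The entries
// appended to the environment override any inherited GOOS, GOARCH or
// GOPRIVATE, since os/exec uses the last value of a duplicated key.
func buildCmd(t target, pkgs ...string) *exec.Cmd {
	cmd := exec.Command("garble", append([]string{"build"}, pkgs...)...)
	cmd.Env = append(os.Environ(),
		"GOPRIVATE=*",
		"GOOS="+t.goos,
		"GOARCH="+t.goarch,
	)
	return cmd
}

func main() {
	// Dry run: print each command; calling cmd.Run() would execute it.
	for _, t := range targets {
		cmd := buildCmd(t, "std")
		fmt.Printf("GOOS=%s GOARCH=%s %v\n", t.goos, t.goarch, cmd.Args)
	}
}
```

Only building, not testing, keeps this cheap: no emulator or test runner is needed for the foreign platforms.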