golang/go

proposal: testing: add -json flag for json results

rsc opened this issue · 125 comments

rsc commented
The idea is to make the test results available in JSON form so that other tools like
build servers can read the results easily.

Comment 1:

Many build servers also like the JUnit XML format. Might be worth it to do both. We've
had good success with running go test -v and translating to the JUnit format (it even
has a place to put stdout and stderr).
rsc commented

Comment 2:

Labels changed: added go1.1.

rsc commented

Comment 3:

Let's do this. Plan of attack:
Structs for output:
type Result struct {
    Tests []TestResult
    Benchmarks []BenchmarkResult
    About struct {
        Version string
        OS string
        Arch string
        Hostname string
        StartTime time.Time
        EndTime time.Time
    }
}
type BenchmarkResult += // i.e. add these fields to the existing testing.BenchmarkResult
    Package string
    Name string
type TestResult struct {
    Package string
    Name string
    Pass bool
    Output string
    Duration time.Duration
}
Add flag -test.package=math/rand (import path of package), for use in setting Package in
TestResult and BenchmarkResult.
Add -test.json (bool flag) to generate json instead of standard output.
Generated json should be indented.
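For concreteness, here is a rough sketch of what the -test.json output path could look like; the helper name emitJSON and the trimmed structs are only illustrative, not part of the plan:

package main

import (
    "encoding/json"
    "os"
    "time"
)

// Trimmed copies of the structs above; only what this sketch needs.
type TestResult struct {
    Package  string
    Name     string
    Pass     bool
    Output   string
    Duration time.Duration
}

type Result struct {
    Tests []TestResult
}

// emitJSON is a made-up helper name: with -test.json set, the test
// binary would marshal the collected Result as indented JSON instead
// of printing the usual text output.
func emitJSON(r Result) error {
    data, err := json.MarshalIndent(r, "", "\t")
    if err != nil {
        return err
    }
    data = append(data, '\n')
    _, err = os.Stdout.Write(data)
    return err
}

func main() {
    _ = emitJSON(Result{Tests: []TestResult{
        {Package: "math/rand", Name: "TestInt", Pass: true, Duration: 42 * time.Millisecond},
    }})
}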
rsc commented

Comment 4:

Labels changed: added size-l.

rsc commented

Comment 5:

Labels changed: added suggested.

Comment 6:

Any interest in generalising this slightly to allow a JUnit format to be contributed
later?
E.g.
-test.output=std (default)
-test.output=json
-test.output=junitxml
rsc commented

Comment 7:

I would like to make sure we have the information JUnit needs, but I'm
not interested in generating the XML directly. I'd rather that
separate tools took care of externally-defined formats.
Russ

Comment 8:

Looking over the junit stuff:
http://pzolee.blogs.balabit.com/2012/11/jenkins-vs-junit-xml-format/
https://svn.jenkins-ci.org/trunk/hudson/dtkit/dtkit-format/dtkit-junit-model/src/main/resources/com/thalesgroup/dtkit/junit/model/xsd/junit-4.xsd
Here are a few notes:
1. JUnit supports separate stdout and stderr, although we've found that this sometimes
makes it more difficult to understand what happened, so Output is probably fine.
2. JUnit distinguishes between errors and failures. Failures are probably t.* stuff and
errors are panics, but we write a lot of tests that simply panic instead of passing t
around everywhere, so maybe the distinction isn't important.
So I think the model above is probably fine.

Comment 9:

I wrote https://bitbucket.org/tebeka/go2xunit. Will gladly rewrite it to work with JSON
output (especially for parallel testing).

Comment 10:

Russ, what is the purpose of -test.package? Shouldn't 'go test' pass the import path of
the package under test to package testing? Would {Benchmark,Test}Result.Package be blank
when -test.package isn't set?
rsc commented

Comment 11:

The idea is that 'go test' would specify -test.package on the command line when running
the test.
However, it could also compile it in, by adding Package string fields to
testing.InternalTest and testing.InternalBenchmark. That's probably better than the
command-line option.

Comment 12:

I've got the beginning sketches of an implementation at
https://golang.org/cl/7061048/. I'm wondering, though, what the interaction of
-test.json with cmd/go/test should be.
- If we print out summary info (e.g., ok    archive/tar 0.006s), then the output of 'go
test' is not valid json -- you'd have to pre-process it a bit.
- Ditto for multiple packages, if we just print out the results sequentially for each
package.
But, the flag is called -test.json, so 'go test' doesn't directly handle it, right?
Similarly, what should happen if -test.v and -test.json are both set? I'd suggest that
-test.json disable chattiness because the output is part of the json result anyway.
rsc commented

Comment 13:

In json mode the testing framework should not print about starting and
stopping each test. It cannot clear test.v, though, because tests
might use testing.Verbose() (which looks at test.v) to decide how much
log output to generate, and we do want to generate that output (and
save it into the JSON).
The go command will need to know about the json flag regardless, in
order to turn -json into -test.json. When in that mode, it should
synthesize its own Result containing all the TestResults and
BenchmarkResults from the individual packages. The ability to do this
is the reason that Package appears in each TestResult/BenchmarkResult
instead of in the top-level Result.
Russ
rsc commented

Comment 14:

I spent a while looking into this. I think it's too complex to get into Go 1.1. The hard
part is making sure that every possible case ends up printing json and not text intended
for humans. 
First there is the problem that all test output needs to be caught, even output
generated with fmt.Printf or the builtin println or panic traces, and even if the test
binary exits unexpectedly. The only truly feasible way to do this is to make the go
command interpret the test binary output and generate the JSON from that interpretation. 
I started to try to make that a little easier by adding a -test.outputprefix flag to
specify a prefix string that the test binary will put at the beginning of each line that
it generates, so that the structure of the test output can be seen more readily. That's
in CL 7678044.
But then the next problem is making the go command parse that output and generate JSON.
And the go command must also catch all its own output and convert it to JSON
equivalents, not just for running tests but also for build failures and the like. That's
a bigger task, and at some point it starts to seem like maybe we should make the go
command just have a little bit of framing and have a separate program whose only job is
to invoke the go command (or, perhaps, also a test binary directly) and parse the output
and turn it into JSON.
This is too big a change for what time we have remaining for Go 1.1. Perhaps we can
figure out what the story is by Go 1.2. For now, an external tool that parses the go
command output (and therefore also the test binary output, if using go test -v) seems
like the best choice. And then once that exists we can think about small framing changes
to make the parsing less confusable.
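For reference, a minimal sketch of what such an external converter could look like; it recognizes only the standard result lines and deliberately ignores everything else, which is exactly the hard part described above:

package main

import (
    "bufio"
    "encoding/json"
    "os"
    "regexp"
)

// A sketch of an external converter: pipe `go test -v` output into
// stdin and get one JSON object per recognized result line on stdout.
// It handles only the plain "--- PASS/FAIL/SKIP: Name (1.23s)" lines;
// panics, build failures and package summaries are not handled here.
var resultLine = regexp.MustCompile(`^--- (PASS|FAIL|SKIP): (\S+) \((\d+\.\d+)s\)`)

type event struct {
    Name    string
    Status  string
    Seconds string
}

func main() {
    enc := json.NewEncoder(os.Stdout)
    sc := bufio.NewScanner(os.Stdin)
    for sc.Scan() {
        if m := resultLine.FindStringSubmatch(sc.Text()); m != nil {
            _ = enc.Encode(event{Name: m[2], Status: m[1], Seconds: m[3]})
        }
    }
}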

Labels changed: added go1.2, removed go1.1.

rsc commented

Comment 15:

[The time for maybe has passed.]

Comment 16 by bgarcia@google.com:

What about defining an interface:
type TestReporter interface {
    Report(result Result)
}
(or something more involved that can obtain test results as they finish).
Add a testing.SetReporter(reporter TestReporter) function to set it.
Have a default implementation that outputs text to stdout.
Allow users to pass in alternate implementations that can output alternate formats, as
well as save the output to a file, database, etc.
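A rough sketch of how that could fit together; none of these names exist in the testing package, they are placeholders only:

package testingsketch

import (
    "fmt"
    "time"
)

// Result stands in for whatever value the testing package would hand
// to a reporter; only the fields this sketch needs are shown.
type Result struct {
    Name     string
    Pass     bool
    Duration time.Duration
}

type TestReporter interface {
    Report(result Result)
}

// textReporter is the hypothetical default: it prints roughly what the
// testing package prints today.
type textReporter struct{}

func (textReporter) Report(r Result) {
    status := "PASS"
    if !r.Pass {
        status = "FAIL"
    }
    fmt.Printf("--- %s: %s (%v)\n", status, r.Name, r.Duration)
}

var reporter TestReporter = textReporter{}

// SetReporter lets a test binary swap in a JSON, JUnit, or
// database-backed reporter instead of the default.
func SetReporter(r TestReporter) { reporter = r }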
rsc commented

Comment 17:

Labels changed: added go1.3maybe, removed go1.2.

Comment 18 by smyrman:

What about letting the input parameter for -json be a filename rather than a boolean?
Let the json format contain a boolean Pass as well as a log containing only the
information logged via the T.Log and B.Log functions.
Then let the command output contain what it normally contains (and let -test.v do what
it normally does).
rsc commented

Comment 19:

I still believe an external tool to convert ordinary `go test` output to JSON is the
best bet.

Comment 20:

One problem I see with the "parsing `go test` output" solution is that there are 3rd
party `go test` enhancements, which tend to change the output format.
For example, github.com/stretchr/testify/assert uses \r to overwrite t.Errorf's
"file:line: message" output with its own format and, more importantly, caller
information.
Such uses would break the parser, while they'd still work fine if the json generation
was integrated into `testing`.
To me, it would make sense to store testing output directly in *T's methods. stdout and
stderr could be captured separately. That way, `testing` can provide a core set of
functionality, as it does today, allowing 3rd party enhancements to build on top of
those, without having to reinvent certain parts.
rsc commented

Comment 21:

Given that \r doesn't actually overwrite things except when displayed in
simulations of 1970s terminals, I don't see how a '\r' would break the
parser any more than 'X'.

Comment 22:

Labels changed: removed go1.3maybe.

Comment 23:

The problem with \r is twofold.
1) If the standard format is "file:line: message", and a custom format turns that into
"file:line: \rfile:otherline: message", parsing will give me a message that includes
the \r and contains more information than it should.
2) The entire point behind the \r hack was to change the line number because otherwise
the test library, with its `assert` functions, would show up as the source of the
failing test. This information would be lost to the parser.
rsc commented

Comment 24:

Labels changed: added go1.3maybe.

rsc commented

Comment 25:

Labels changed: added release-none, removed go1.3maybe.

rsc commented

Comment 26:

Labels changed: added repo-main.

Comment 27:

Python's nose (https://nose.readthedocs.org/en/latest/) captures the stdout of the tests
and emits it when the test ends (you can override this with the --nocapture flag). We might
consider this as an option. If the consumer of the output is a machine (like Jenkins),
it's probably a good approach.

Comment 28 by beatgammit:

I agree with #7 and like the interface idea in comment #16. This makes it more-or-less a
dup of 3343.
I'm writing a generalized test result saving/reporting package based on TAP-Y/J as
defined by Ruby's tapout
(https://github.com/rubyworks/tapout/wiki/TAP-Y-J-Specification). The specification
isn't well written, but it does optionally allow for a lot of useful information.
A short list of useful information that I would like to see the testing package support:
- line numbers and stack-traces of failures
- metadata (e.g. TODO means the test failure doesn't indicate a general failure, good
for TDD); not yet supported by testing.T, but would be nice
- skipped tests w/reason
- elapsed time (can be calculated if tests are run in series)
- test coverage
It would be nice to have a formal notion of test suites, test cases and tests, but for
now packages (or perhaps files?) and functions can be used as test cases and tests
respectively. TAP-Y/J (and perhaps JUnit as well) has a notion of embedding test cases
within other test cases, but I don't think this is useful enough to complicate the
testing package.

Comment 29 by smyrman:

Plus 1 for comment #28 /#16.

Comment 30 by gtrevg:

This would definitely be a useful feature.  In addition, having a specified format for
the benchmark data would be good as well.  What I'm trying to put together is a system
where not only do I have historical information from build to build on unit tests &
coverage data, but also information on how the performance of my code has gotten better
or worse over time.  That way, we know ahead of time if some new feature/bug fix impacts
performance and we can go through a new round of capacity planning if necessary.
Thanks

+1 for interfaces for the reporters.

Recommend breaking out the individual-tests interface (in comment 16) from things like coverage (which would report the coverage for a single file). Packages could use init to insert reporters into the testing framework.

https://groups.google.com/forum/#!topic/golang-nuts/nFpL8YO4R9s is a related discussion on golang-nuts ("Jenkins plugin for Go test coverage format").

This feature is important to integrate Go builds into enterprise build pipelines and make it easier to promote Go in corporate environments.

Russ, is there anything that interested community members can help with on this issue (beyond patiently waiting 😄)?

It's an open source project with hundreds of contributors. You can simply write the code and contribute it. See http://golang.org/doc/contribute.html .

For what it's worth, I've been turning go test output into JSON for some time now.

https://github.com/smartystreets/goconvey/tree/master/web/server/parser

My use case was to create a web server that scans for changes in *.go files and runs tests, reporting them to a web client in a JSON payload.

As @rsc has already mentioned, there are lots of things to consider like arbitrary output, build failures, etc... There are issues with my implementation but I thought I'd reference what I've done in case it's helpful.

c9s commented

+1

Currently the test results are streamed, so a tool parsing them is able to process/display them as soon as they are available. Losing streaming would be unfortunate, so I'd advise against dumping one Result JSON on completion.

In the related, mutually exclusive #12826 proposal (-format flag), @minux suggested emitting JSON for each test/benchmark result, and this is likely the only option (assuming -json) because we don't want to lose results if there is a panic/crash.

Assuming preserving streaming is important, here is modified Result struct:

type Result struct {
    Package string
    Start *struct {
        Version string
        OS string
        Arch string
        Hostname string
        Time time.Time
    }
    Test *TestResult
    Benchmark *BenchmarkResult
    End *struct {
        Time time.Time
        Pass bool
    }
}

Exactly one of the Start, Test, Benchmark, and End fields must be present. go test can emit one Result in JSON format for each piece of information it acquires. In the beginning it emits a Result with the Start field filled, then one Result for each test/benchmark with the Test or Benchmark field set, respectively.
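For illustration, a consumer of such a stream could decode events as they arrive; this is only a sketch with trimmed-down structs, and the TestResult fields are assumptions:

package main

import (
    "encoding/json"
    "fmt"
    "os"
    "time"
)

// Trimmed-down versions of the streaming structs above; the TestResult
// fields shown here are assumptions for the sake of the example.
type TestResult struct {
    Name string
    Pass bool
}

type Result struct {
    Package string
    Start   *struct{ Time time.Time }
    Test    *TestResult
    End     *struct {
        Time time.Time
        Pass bool
    }
}

func main() {
    dec := json.NewDecoder(os.Stdin)
    for {
        var r Result
        if err := dec.Decode(&r); err != nil {
            break // io.EOF, or a truncated object after a crash
        }
        switch {
        case r.Start != nil:
            fmt.Println("package started:", r.Package)
        case r.Test != nil:
            fmt.Println("test finished:", r.Test.Name, "pass:", r.Test.Pass)
        case r.End != nil:
            fmt.Println("package finished:", r.Package, "pass:", r.End.Pass)
        }
    }
}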

adg commented

Why does the Start struct include Version, OS, Arch, etc?

I'd expect

type Result struct {
    Package   string
    Start     time.Time `json:",omitempty"`
    End       time.Time        `json:",omitempty"`
    Version   string           `json:",omitempty"`
    OS        string           `json:",omitempty"`
    Arch      string           `json:",omitempty"`
    Hostname  string           `json:",omitempty"`
    Test      *TestResult      `json:",omitempty"`
    Benchmark *BenchmarkResult `json:",omitempty"`
    Pass      *bool            `json:",omitempty"`
    // maybe also test log output should be included?
    Log []byte
}

What is Log exactly? Given there is TestResult.Output, I'd expect Result.Log to contain direct writes to os.Stdout in the test binary. Direct writes to os.Stderr would then result in Result objects with the Log field written to the stderr of go test.

If we implement the framing that @rsc suggested, then Result should probably also have a Build string field that would contain anything go build would print. At this time, only -n and -x cause go test to produce extra output. In JSON format it would be printed out as { "Build": "mkdir ..."}.

I'm proposing a raw Build string, in contrast to a more semantic "Cmd string" for -n/-x, because otherwise we would force ourselves into a trap where any future go build output must have a field in Result.

I can write a design doc if nobody minds.

The advantage of putting Version into Start, together with the simple rule "Only one of Start, Test, Benchmark and End may be present", is that it makes clear that Version is specified only when a test binary of a package starts. With Version directly in Result, it is not obvious from reading the struct definition 1) whether Version may be present together with Test, and 2) whether a user should ignore Version if it differs from the previous value.

Also having Start and End fields as structs clearly signifies current stage.

adg commented

Okay, please go ahead and write a doc. Probably good to have it all fleshed out.

minux commented

I think go test -json should either

  1. always stream
  2. never stream
  3. not stream, unless -stream
  4. stream, unless -stream=false

In particular, IMO whether it will stream or not, should be explicit.

First two are easier to implement.

I've just filed a proposal; it does not have -test.stream, and proposes (3). PTAL.

In particular, in the proposal the test binary always streams. I assume your motivation for "stream by default" is to prevent losing results because of a test binary panic.

adg commented

I see no reason not to always stream.


adg commented

The proposal has been submitted: https://golang.org/design/2981-go-test-json

I think you should reconsider the split between streaming and non-streaming variants, and just go with streaming every time. The burden of combining streamed output seems small compared to the overhead of having two distinct json output types.

The proposal will be a lot simpler without the -stream flag.

I agree that it adds complexity.

I considered making the output of non-streaming go test -json a standard format for test result files in Go. For example, such files could be stored for future analysis. I assume that's why Version, Arch, etc. were included by @rsc in the first place. It would be nice if the standard file format were valid JSON.

If go test -json output is ultimately to be processed further by another tool (not for storage), then I don't see a reason to include Version, Arch, etc because their values are obvious at the time of the test run.

On the other hand, the standard test results file format could be defined by a separate tool that aggregates go test -json streamed output.

So, I propose to always stream, simplify the output format, and exclude Version, Arch, OS and Hostname.

benchcmp and benchstat could be examples of programs that support the standard file format.

minux commented

proposal updated:

  • streaming only. Removed -stream flag
  • output format is greatly simplified; flattened in particular
  • Version, Arch, OS and Hostname are removed
  • decreased changes to testing package by making testing.Result internal

OK to implement?

minux commented

Yes, a standard file format should contain coverage, cpuprofile, etc, so I decided to leave this idea for now. It is outside of the scope of this issue.

Ah, you probably mean the go test output about coverage. I forgot about that. Will add.

Done.

Please note that the updated proposal also specifies that examples with output are treated as tests and printed as

{
  "Package": "foo",
  "Test": {
    "Name": "Example1",
    "Status": "PASS",
    "T": 1000
  }
}

Status cannot be "SKIP", and there is never a Log field.

Should I send this proposal to golang-dev, or is this bug being open enough for approval/rejection?

minux commented
  1. What is expected to happen when a test has an os.Exit(99) call, or an unhandled panic (especially from runtime, e.g. "fatal error: all goroutines are asleep - deadlock!")? Till now, it was possible to detect which test exited/panicked in -v mode, because it printed the "header", i.e. === RUN TestPanic.
  2. In the proposal, the description of the -json flag seems to say "stdout is indented JSON objects" (emphasis mine); that looks like a bug, since the rest of the doc says "unindented"?
  3. Not sure about the printed "\n"s; is the output of a foo.test binary monitored, and "\n"s added only if needed? Otherwise, the example output should have empty lines between the JSON objects, no?
rsc commented

I don't see anything in the proposal that helps with the thorniest parts of my "comment 14" from 2013. Specifically, if the test crashes with half-emitted JSON, do you get a useful JSON output from the go command? It is not at all uncommon for tests to crash. This case absolutely must work.

I really think maybe we should encourage people to parse the usual text output instead, like go2xunit does.

@akavel: I addressed your comments in https://go-review.googlesource.com/#/c/16285. Thank you.

@rsc a clean solution could be to use an extra fd exclusively for test binary json output (assuming nothing else in the test binary would write to it). If a test binary crashes in the middle of writing JSON, the go command can safely discard the invalid JSON and report anything printed to the test binary's stderr.

exec.Cmd.ExtraFiles is not supported on Windows ATM. I will see if I can fix that. This proposal may have to be postponed until that is done.
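For illustration, here is a rough sketch of the go command side of the extra-fd idea; the -test.jsonfd flag is made up, and ExtraFiles maps the pipe to fd 3 in the child:

package main

import (
    "bufio"
    "fmt"
    "os"
    "os/exec"
)

func main() {
    r, w, err := os.Pipe()
    if err != nil {
        panic(err)
    }
    // Hypothetical: tell the test binary to write its JSON to fd 3.
    cmd := exec.Command("./foo.test", "-test.jsonfd=3")
    cmd.Stdout = os.Stdout // human-readable output passes through untouched
    cmd.Stderr = os.Stderr
    cmd.ExtraFiles = []*os.File{w} // first entry becomes fd 3 in the child
    if err := cmd.Start(); err != nil {
        panic(err)
    }
    w.Close() // keep only the read end in the parent
    sc := bufio.NewScanner(r)
    for sc.Scan() {
        // Each line is expected to be a JSON object; anything left
        // half-written by a crash simply fails to parse and is dropped.
        fmt.Println("json event:", sc.Text())
    }
    _ = cmd.Wait()
}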

@rsc one additional idea that I had was that maybe there could be an official pkg for parsing go test output in the std lib. Then, with any changes in go test output format, the lib would be updated accordingly, and authors of go2xunit-like code wouldn't have to find the changes on their own.

Is it true that, according to the proposal, log messages will be available to external tools only after the test has finished?

{
    "Name": "TestBar",
    "State": "PASS",
    "T": 1000000,
    "Log": "some test output\n"
}

In that case, streaming isn't really preserved. For our tool (https://github.com/go-lang-plugin-org/go-lang-idea-plugin) it would be better to separate log and test-finished events, so we can show output before a test finishes and attribute output to the right test when tests run in parallel. E.g. like this:

{ "name": "TestFoo", "event": "started" }
{ "name": "TestBar", "event": "started" }
{ "name": "TestFoo", "event": "log", properties: {"message": "some test output"} }
{ "name": "TestBar", "event": "finished", "properties": {"state": "pass", "T": 1000000} }
{ "name": "TestFoo", "event": "finished", "properties": {"state": "pass", "T": 1000000} }

It would be even better to allow clients to implement their own reporters and attach them to go test as shared libraries.

FWIW, @moshez has started using the JSON Text Sequence (RFC 7464) format for representing test output here at Square. It's extremely simple, and would address @rsc's "comment 14" about interrupted output: the standard specifically addresses ignoring corrupted entries.
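For reference, RFC 7464 frames each JSON text with a leading RS byte (0x1E) and a trailing newline, which is what lets a reader resynchronize past a record truncated by a crash. A small sketch of a writer:

package main

import (
    "encoding/json"
    "io"
    "os"
)

// writeJSONSeq writes one RFC 7464 record: RS, the JSON text, then LF.
// A reader that cannot parse a record (e.g. one truncated by a crash)
// can discard it and resync on the next RS byte.
func writeJSONSeq(w io.Writer, v interface{}) error {
    if _, err := w.Write([]byte{0x1e}); err != nil {
        return err
    }
    data, err := json.Marshal(v)
    if err != nil {
        return err
    }
    data = append(data, '\n')
    _, err = w.Write(data)
    return err
}

func main() {
    _ = writeJSONSeq(os.Stdout, map[string]string{"Name": "TestFoo", "State": "PASS"})
}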

adg commented

@nodirt I have been reviewing your proposal and have some issues:

  • Why re-use BenchmarkResult for this purpose? It seems odd to me. I'd rather have just a Result type with the new fields common to tests, benchmarks, and coverage. Specific results for tests, benchmarks, and coverage can be provided by fields that are pointers to structs (Test *TestResult, etc).
  • I think we must encapsulate any test program output in JSON objects so that it may be discarded by any program parsing the JSON objects. The suggestion to use "JSON Sequence" format might be the solution.
adg commented

@nodirt do you have any feedback on my comments? I'd like to see this proposal go ahead.

adg commented

@nodirt ping?

A Result struct containing either *TestResult or *BenchmarkResult, with optional *CoverageResult SGTM.

I assume that by encapsulating any test program output you mean replacing os.Std{out,err} with os.Pipes (too bad they are *os.File variables rather than io.Writer), wrapping the output in JSON objects, and writing those objects to the real stdout/stderr. We cannot associate these objects with any particular test because that wouldn't work when tests run in parallel. This SGTM too.
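A rough sketch of that capture approach; the shape of the wrapping JSON object is only illustrative:

package main

import (
    "bufio"
    "encoding/json"
    "os"
)

// captureStdout swaps os.Stdout for the write end of a pipe and wraps
// everything written to it in JSON objects on the real stdout. Because
// the pipe is shared, output cannot be attributed to individual tests
// when they run in parallel, as noted above.
func captureStdout() (restore func()) {
    realStdout := os.Stdout
    r, w, err := os.Pipe()
    if err != nil {
        panic(err)
    }
    os.Stdout = w
    done := make(chan struct{})
    go func() {
        enc := json.NewEncoder(realStdout)
        sc := bufio.NewScanner(r)
        for sc.Scan() {
            _ = enc.Encode(struct{ Stdout string }{sc.Text() + "\n"})
        }
        close(done)
    }()
    return func() {
        w.Close()
        <-done
        os.Stdout = realStdout
    }
}

func main() {
    restore := captureStdout()
    os.Stdout.WriteString("hello from a test\n")
    restore()
}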

Overall it sounds like the Result struct is going to look like

type Result struct {
    Name  string
    State State
    Procs int    // The value of runtime.GOMAXPROCS for this benchmark run.
    Log   string // The log created by calling (*B).Log and (*B).Logf.

    // these two fields are mutually exclusive
    Test *TestResult
    Benchmark *BenchmarkResult

    CoverageMode string
    TotalStatements  int64
    ActiveStatements int64
    CoveredPackages  []string

    Stdout string
    Stderr string
}

where Stdout or Stderr is non-zero only when something is written to os.Stdout/os.Stderr. If any of them are non-zero, the rest of the fields must be zero.

This struct is not idiomatic at all, rather tailored to a simple JSON output format, so maybe we should give it a more concrete name than generic "Result".

Apologies for the delay.

adg commented

That struct looks good to me, I'd put the json:",omitempty" struct tag on each optional field, too.

This struct is not idiomatic at all, rather tailored to a simple JSON output format, so maybe we should give it a more concrete name than generic "Result".

JSONResult maybe?

Good to see the topic is active again!

@nodirt any chance to preserve streaming? See #2981 (comment) for a description of what exactly is wrong with the current version of the proposal.
The initial issue is about supporting other tools and build servers, and I bet all of them would like to have proper results streaming here.

@nodirt, in #14313, @rsc and I proposed an extended form of the current text benchmark format that, in particular, supports arbitrary metric:value pairs. Since your proposal builds on the existing, rigid testing.BenchmarkResult, we would lose this capability in the JSON output. This has proven invaluable in more sophisticated benchmarks such as x/benchmarks and gcbench, since it lets us leverage our existing benchmark-analysis tools for important metrics beyond the limited set understood by the testing package.

Arguably this is an issue with testing.BenchmarkResult, but since we can't change the existing fields, here are a few ways I can think of to address this:

  1. Introduce a JSONBenchmarkResult to parallel JSONResult that consists simply of the number of iterations and a map from string metric names to float64 values.
  2. Make JSONResult.Benchmark a map[string]float64 with the convention that the "N" entry is the number of iterations.
  3. Take advantage of the parallel between structs and maps in JSON and say that other producers are allowed to add fields to the benchmark result. However, this would mean that consumers have to define their own type paralleling JSONResult in order to swap in a map that can accept arbitrary key: value pairs for the benchmark.
  4. Add a map[string]float64 field to testing.BenchmarkResult for "additional metrics". Since this is adding a field, it would not violate the Go 1 compatibility promise.
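For concreteness, a sketch of what option (4) could look like (see below); the existing fields are those of testing.BenchmarkResult today, and the field name Extra is just a placeholder, not an accepted API:

package testingsketch

import "time"

// BenchmarkResult as it might look with option (4) applied. Only the
// Extra field is new; its name is an assumption, not an accepted API.
type BenchmarkResult struct {
    N         int           // number of iterations
    T         time.Duration // total time taken
    Bytes     int64         // bytes processed in one iteration
    MemAllocs uint64        // number of memory allocations
    MemBytes  uint64        // number of bytes allocated

    Extra map[string]float64 // arbitrary metric name -> value, e.g. "STW-ns/op"
}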

My other question is how your proposal deals with subtests/subbenchmarks. I assume this is simply part of the Name field and that the string follows the same convention as it does in the text output, but wanted to check.

adg commented

@zolotov I believe what is being proposed is a streaming model.

@adg JSONResult and json: ",omitempty" SGTM

@aclements I prefer (4) over the others, but in that case it is probably a separate proposal, since it also probably involves adding Metrics map[string]float64 to testing.B. These two designs can merge without conflicts.

re subtests/subbenchmarks: your assumption is correct

@zolotov

Is it true that, according to the proposal, log messages will be available to external tools only after the test has finished?

no, in your case the output would be

{
    "Name": "TestBar",
    "State": "RUN",
    "Procs": 4
}
{
    "Name": "TestBar",
    "State": "RUN",
    "Log": "some test output\n"
}
{
    "Name": "TestBar",
    "State": "PASS",
    "T": 1000000
}

except there would be JSON Text Sequence separators.

I'll update the document and clarify the assumptions above.

Perhaps STARTED is a better state name than RUN. WDYT?

@nodirt I got it, thank you! Not sure about STARTED or RUN, both are good.

@nodirt,

I prefer (4) over the others, but in that case it is probably a separate proposal, since it also probably involves adding Metrics map[string]float64 to testing.B. These two designs can merge without conflicts.

That's fine.

More generally, how does/should this proposal incorporate extensibility? Adding metrics is one example, which can be accomplished by adding a Metrics field later on. The other example from #14313 is configuration metadata. This is generally shared across many benchmarks (though, I suppose, doesn't strictly have to be), so it's not immediately obvious to me how it would fit in to the proposed structure.

adg commented

@nodirt would you care to update the proposal doc?

@aclements, how about adding "Metadata map[string]string" to BenchmarkResult, where the key follows the same rules as configuration metadata? Slightly modified quote:

key begins with a lower case character (as defined by unicode.IsLower), contains no space characters (as defined by unicode.IsSpace) nor upper case characters (as defined by unicode.IsUpper), [text removed]. Conventionally, multiword keys are written with the words separated by hyphens, as in cpu-speed. [text removed].

I believe it is simpler not to support multiple values per key. If there is a key that needs multiple values, users can agree on a separator rune, while the rest of metadata consumers don't have to deal with multiple values.

With string values we lose type safety for numeric metadata values, but I think that is acceptable, given there may be non-numeric metadata.
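A small sketch of validating a key against the quoted rules:

package main

import (
    "fmt"
    "unicode"
)

// validKey checks the quoted rules: a key begins with a lower case
// character, and contains no space or upper case characters.
func validKey(key string) bool {
    if key == "" {
        return false
    }
    for i, r := range key {
        if i == 0 && !unicode.IsLower(r) {
            return false
        }
        if unicode.IsSpace(r) || unicode.IsUpper(r) {
            return false
        }
    }
    return true
}

func main() {
    fmt.Println(validKey("cpu-speed")) // true
    fmt.Println(validKey("CPU speed")) // false
}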

@adg apologies for the delay (family reasons). I do care, will update the doc.

please ignore my previous comment. I completely misunderstood configuration metadata: I assumed it is specified by benchmark code, not go test itself.

Yes, we should definitely add

type JSONResult struct {
  // Configuration is metadata produced by test/benchmark infrastructure.
  // The interpretation of a key/value pair is up to tooling, but the key/value pair is 
  // considered to describe all benchmark results that follow, until overwritten by 
  // a JSONResult with a non-empty Configuration field.
  //
  // The key begins with a lowercase character (as defined by unicode.IsLower), 
  // contains no space characters (as defined by unicode.IsSpace) 
  // nor upper case characters (as defined by unicode.IsUpper).
  // Conventionally, multiword keys are written with the words separated by hyphens, 
  // as in cpu-speed.
  Configuration map[string]string

  // other fields defined above
}

This differs from the original in that, if a config value needs to be changed, a Configuration with all values needs to be emitted again. Rationale: the original proposal does not specify how to undo a key-value pair (an empty string metadata value and no value may have different meanings).
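Under that rule a consumer simply carries the most recent non-empty Configuration forward; a rough sketch with a trimmed field set:

package main

import (
    "encoding/json"
    "fmt"
    "os"
)

// Trimmed JSONResult: only the fields this sketch needs.
type JSONResult struct {
    Configuration map[string]string
    Benchmark     *struct {
        Name string
    }
}

func main() {
    var config map[string]string // describes all benchmarks that follow
    dec := json.NewDecoder(os.Stdin)
    for {
        var r JSONResult
        if err := dec.Decode(&r); err != nil {
            break // io.EOF or a malformed record
        }
        if len(r.Configuration) > 0 {
            // A non-empty Configuration replaces the previous one wholesale.
            config = r.Configuration
        }
        if r.Benchmark != nil {
            fmt.Println(r.Benchmark.Name, "ran with configuration", config)
        }
    }
}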

re Benchmark Name Configuration: I think it should be extracted to a separate proposal, so it can be "inherited" by both this proposal and #14313. I can do it.

Moreover, I see the testing package recommends using =, not :.

Update: actually, perhaps there is no point in filing a proposal, since the testing package has already established the subtest/subbenchmark name format, that is, = as the key-value separator. Consumers of go test output should try to parse key-value pairs out of benchmark/test names regardless of whether the output is text or JSON. If I owned #14313 I'd remove the "Benchmark Name Configuration" section.

How about implementing the ability to customize test output with a brand-new plugin package? It would be just the same proposal, but the json reporter would be implemented as a plugin.
That way anybody would be able to implement whatever reporter a particular tool/IDE/CI needs.
At JetBrains we successfully use this approach of substituting reporters for JavaScript test frameworks, Ruby, Erlang and many others.

In https://go-review.googlesource.com/#/c/29157/1/design/2981-go-test-json.md@197 @zolotov raised a concern that with the proposed JSON format it is impossible to distinguish a test "C" within "B" within "A" from "B/C" within "A", and similar cases. I was hoping to solve this by filing a proposal to prohibit "/" in subtest/subbenchmark names, but it looks like "/" is supported intentionally, so that would be a breaking change.

I am thinking of adding NestLevel int to JSONResult which, if not zero/empty, means that a test/benchmark belongs to a parent printed earlier. E.g.

// Top-level test TestComposite started
{
    "Package": "github.com/user/repo",
    "Name": "TestComposite",
    "State": "RUN",
    "Procs": 4
}
// Subtest B in TestComposite started
{
    "Package": "github.com/user/repo",
    "NestLevel": 1,
    "Name": "B",
    "State": "RUN",
    "Procs": 4
}
// B passed
{
    "Package": "github.com/user/repo",
    "NestLevel": 1,
    "Name": "B",
    "State": "PASS",
}
// Subtest "B/C" in A started
{
    "Package": "github.com/user/repo",
    "NestLevel": 1,
    "Name": "B/C",
    "State": "RUN",
    "Procs": 4
}
// B/C passed
{
    "Package": "github.com/user/repo",
    "NestLevel": 1,
    "Name": "B/C",
    "State": "PASS",
}

it relies on the fact that a test does not finish until all of its subtests finish.

The alternative is to make Name a list of strings and print the absolute name each time, e.g. ["A", "B"], ["A", "B/C"], but a name array looks odd to me. WDYT? cc @aclements, who raised the question of subtests earlier.

@zolotov re plugins: I like the idea in general, but it is a bit abstract ATM. Are you proposing to limit the plugin to customization of go test's output? A plugin could do more. For example, perhaps in your case a plugin could leave go test output alone, but subscribe to all test events (test started, test finished, etc) and stream them directly to IDEA in an IDEA-specific format via whatever channel is most convenient for IDEA. However, I don't know whether you would want that. I personally don't see much value in plugins that are limited to output customization.

I also wonder what others think.

minux commented

I don't think we need to worry about B/C inside A vs. A/B/C; my understanding is that those are explicitly intended to be indistinguishable. In practice, people should not be using the same name for two different subtests.

@minux yes

@quentinmit that was a bad example. The concern is that it is a regression: the non-JSON format has indentation by which you can tell whether a test is named "A/B" or is "B" inside "A". Let me copy @zolotov's message here:

How do we distinguish a third-level subtest from a second-level subtest with a slash in its name? E.g. TestFoo/Bar/Buzz1 and TestFoo/Bar/Buzz2 in the following example:

func TestFoo(t *testing.T) {
    t.Run("Bar/Buzz1", func(t *testing.T) { })
    t.Run("Bar", func(t *testing.T) {
        t.Run("Buzz2", func(t *testing.T) {})
    })
}

At the moment, `go test` prints them with different indents:
--- PASS: TestFoo (0.00s)
    --- PASS: TestFoo/Bar/Buzz1 (0.00s)
    --- PASS: TestFoo/Bar (0.00s)
        --- PASS: TestFoo/Bar/Buzz2 (0.00s)

My inclination is that we don't need to preserve this indentation at all. https://golang.org/pkg/testing/#hdr-Subtests_and_Sub_benchmarks specifically prescribes the uniqueness of names:

Each subtest and sub-benchmark has a unique name: the combination of the name of the top-level test and the sequence of names passed to Run, separated by slashes, with an optional trailing sequence number for disambiguation.

minux commented

I agree. Preserving the nesting makes parsing harder because the parser has
to maintain more state. We already know the name for each test, so why not
just use the full unambiguous name for the events? I think most consumers
don't necessarily care about the hierarchical nature of subtests, and even
if they do, it's trivial to split the name around / to get the hierarchy
back.

adg commented

I agree with @minux and @quentinmit.

@minux @quentinmit

Hi guys! I'm not talking about identical test names; the testing package deals with those and gives them unique names. As @nodirt said, it's about a regression. The proposal is about providing a machine-readable output format, and since this format cannot convey what the human-readable format conveys, it's a regression.

My inclination is that we don't need to preserve this indentation at all.

@quentinmit Are you proposing to remove nesting from go test standard output too?

@nodirt The NestLevel int approach doesn't look reliable to me, especially in the case of parallel tests, and it is indeed harder to parse. Also, having fully qualified test names is important. A Name array looks odd to me too.

As yet another alternative: add ParentTestName string or ShortName string. Either field can be derived from the other, and each of them solves the regression. What do you think?

@minux

I think most consumers don't necessarily care

I think most consumers do care.

it's trivial to split the name around / to get the hierarchy back.

To get the wrong hierarchy back: it represents neither the test hierarchy from the source code nor the one from go test's standard output.

adg commented

@adg the concrete example is here #2981 (comment).

I repeat the code and output:

func TestFoo(t *testing.T) {
    t.Run("Bar/Buzz1", func(t *testing.T) { })
    t.Run("Bar", func(t *testing.T) {
        t.Run("Buzz2", func(t *testing.T) {})
    })
}

At the moment, `go test` prints them with different indents:
--- PASS: TestFoo (0.00s)
    --- PASS: TestFoo/Bar/Buzz1 (0.00s)
    --- PASS: TestFoo/Bar (0.00s)
        --- PASS: TestFoo/Bar/Buzz2 (0.00s)

The problem is that, since '/' is allowed in test names and the proposed JSON format carries only fully qualified test names, it's impossible to tell that TestFoo/Bar/Buzz1 is a subtest of TestFoo and not of TestFoo/Bar.

adg commented

And why do we care?

Because it is a regression. It means that I won't be able to recreate the proper test hierarchy using the JSON format.

adg commented

Indeed, it is a corner case, so it's understandable if nothing is done about it.

Does anyone actually do that?

In my experience, since it's allowed, there are surely people who do it; and those of them who use IDEA will eventually come to me and demand to see the proper test hierarchy in the IDE. At that point, I'll need at least a link to the discussion of this corner case ;)

@nodirt regarding plugins: yes, subscribing to all test events and generating output from the testing package's entities is exactly what I had in mind.

Here are at least two advantages of this approach:

  1. Reporters are written in Go, so they work with Go-specific types, have the whole test runtime available, and can extract whatever information about the tests they need. So we don't have to change the testing package to provide more information via a JSON format to third-party tools. We don't have to change the Go SDK at all; we can change reporters independently.
  2. Reporters are written in Go, so it's easy to share them. There is no need to implement conversion of JSON testing output to the JUnit XML format in Java, Python and C# just because different tools are written in different languages.

As for 2, I've already run into this. I implemented conversion of go test output into IDEA-specific test-event messages and tried to reuse the converter in TeamCity CI, because TeamCity uses exactly the same test-message format. And I failed, because the existing converter uses a lot of IDEA-specific stuff that cannot be used in the TeamCity runtime, and it's not easy to deal with that right now.