barefootcoder/Data-Random

Improve test coverage?

Opened this issue · 6 comments

The tests for this distribution take a considerable time to run because of the way they are laid out. The test suite runs over 30,000 individual tests to verify the behaviour of a handful of functions, and still has overall coverage below 75%.

I was thinking the tests could be made much simpler (and faster) if the exhaustive testing were done only during automated testing (as on CPAN Testers) or release testing, while normal users ran only the tests that verify the functions behave as documented.

This would reduce the size of the test suite and installation times.
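
As a rough illustration of the gating I have in mind (just a sketch, using the conventional AUTOMATED_TESTING / RELEASE_TESTING environment variables; the exact wording is only an assumption of how it could look):

```perl
# Sketch only: gate the exhaustive checks behind the usual smoker/release
# environment variables so regular installs skip them.
use strict;
use warnings;
use Test::More;

plan skip_all => 'exhaustive tests run only during automated or release testing'
    unless $ENV{AUTOMATED_TESTING} || $ENV{RELEASE_TESTING};

# ... the existing exhaustive loops would stay here, unchanged ...

done_testing();
```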

I'd be happy to send in a pull request.

As for implementation, and in the spirit of keeping up with the times, how would you feel about using Test2::V0 for this?

I looked into this at one point, and the primary reason they're so slow is not the number of tests, but rather that every test (in the sense of verifying a single value) is a separate test (in the sense of generating an ok line in the TAP). I actually had to fix this for one set of tests (I don't recall which one now), because not only was it ridiculously slow, but it also ran out of memory. If a lot of these test blocks were consolidated (that is, if 1,000 or 10,000 validations only generated a single ok), then I think reducing the total number of tests wouldn't be necessary. That is, having 10,000 tests is no biggie if each one takes a millisecond.
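
Something like this sketch is what I mean by consolidating (the date-format check here is only for illustration):

```perl
# Sketch only: run many validations but emit a single ok, collecting any
# failures for the diagnostic. The date-format check is just illustrative.
use strict;
use warnings;
use Test::More;
use Data::Random qw( rand_date );

my @bad;
for ( 1 .. 10_000 ) {
    my $date = rand_date();
    push @bad, ( defined $date ? $date : 'undef' )
        unless defined $date && $date =~ /\A\d{4}-\d{1,2}-\d{1,2}\z/;
}

ok( !@bad, '10,000 rand_date() results all look like dates' )
    or diag( 'first bad result: ' . $bad[0] );

done_testing();
```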

Now, given all that:

As for implementation, and in the spirit of keeping up with the times, how would you feel about using Test2::V0 for this?

Well, I'm betting that a lot of the reason that running 10,000 very fast tests was super-slow and ate a crazy amount of memory was the original Test::Builder implementation. So simply converting over to Test2 might make a big difference. I'm certainly not opposed to giving that a shot.

I guess what I meant to say is that I'm not sure the tests are testing what they should.

A large number of the tests currently there call the same method with the same arguments to see whether any of the results obtained will be invalid. But since the results of the methods are random (or close to it), this approach can at best give an idea of how likely it is that an invalid result will be generated.

Meanwhile, they leave some easily testable code paths entirely untested: there are relatively few tests for improper input, missing arguments, contradictory parameters, etc.

As a regular user, I would probably care more about this aspect of the distribution (i.e. the interface) than about whether the 10,000th iteration of the same method might generate an invalid result. Even if it did, it is unlikely that most users would call a given method 10,000 times in the same process.

What I was suggesting is that we could move some of those more exhaustive tests to run only under release testing or automated testing, and make sure that the tests regular users run when they install the distribution at least cover all of the interface (which could be done much faster and with fewer tests, by seeding rand and checking under known conditions).

Sorry it's taken me so long to get back to this issue.

I think your observations are spot on. I don't even know if it's useful calling the methods so many times if the results are not going to be reproducible ... personally, I hate it when you get a failing test and then can't figure out what happened because it was a random result. So, I would say, if you're still interested in improving this, I would definitely consider whatever pull request you think is going to make this simpler, and more effectively tested.

No problem!

When I wrote that, I already had something sort of put together, but I left it untouched after my last comment.

My idea was basically to follow this approach (a rough sketch follows the list):

  1. Set a seed and record the results it creates

    For example, calling rand_date() with srand(12345) always generates 2018-04-27.

  2. Tweak the parameters to see that they affect that particular result in the desired way

    For example, setting min => '2019-04-27' should move that date to the future.

  3. Rinse and repeat until coverage is ~100%
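
As a minimal sketch of what that could look like with Test2::V0 (the seed and the exact expectations here are only illustrative; real tests would pin down the recorded values):

```perl
# Sketch only: seed the RNG so results are reproducible, then check that
# parameters move the known result in the documented direction.
use strict;
use warnings;
use Test2::V0;
use Data::Random qw( rand_date );

# 1. Same seed, same result, so any failure is reproducible.
srand(12345);
my $first = rand_date();
srand(12345);
is( rand_date(), $first, 'rand_date() is reproducible under a fixed seed' );

# 2. Parameters shift that known result as documented (assumes the result
#    compares correctly as a YYYY-MM-DD string).
srand(12345);
my $later = rand_date( min => '2019-04-27' );
ok( $later ge '2019-04-27', 'min pushes the generated date forward' );

done_testing;
```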

I'll work on it for a bit, try to tidy things up, and show you what I've got.

In the meantime, I've discovered a couple of weird behaviours. I'll create issues for those so you can take a look and see how they are to be solved... or if they should be upgraded to features. 🙂

I don't think I have any objections to your suggested plan of attack on the random testing. Feel free to work that up to a pull request at any point.

Took a while, but that's more or less what I had in mind! Comments more than welcome :)