HypothesisWorks/hypothesis

Hypothesis shrinking doesn't find minimal example

boustrophedon opened this issue · 5 comments

As I mentioned in IRC, I had a dumb idea: Use hypothesis stateful testing to solve the towers of Hanoi problem.

It actually works:

Falsifying example: run_state_machine(factory=HanoiSolverSM, data=data(...))
state = HanoiSolverSM()
state.move(data=data(...))
Draw 1: (0, 1)
state.move(data=data(...))
Draw 2: (0, 2)
state.move(data=data(...))
Draw 3: (1, 2)
state.move(data=data(...))
Draw 4: (0, 1)
state.move(data=data(...))
Draw 5: (2, 0)
state.move(data=data(...))
Draw 6: (0, 2)
state.move(data=data(...))
Draw 7: (2, 0)
state.move(data=data(...))
Draw 8: (2, 1)
state.move(data=data(...))
Draw 9: (0, 1)
state.teardown()

but 9 moves is not minimal (it takes 2^n-1 moves for n rings). You can see that in draws 6 and 7 it just shuffles one ring back and forth.

In other runs, it does find a minimal example.
Full repro here: https://gist.github.com/boustrophedon/a0905a18ef45dea4a5b36b6cead54222


Also, I was wondering if there's a better way to structure this (maybe with the composite decorator?) such that instead of printing the draws it could just print move(0,1) somehow. There's a hook to print something extra on the line when you data.draw, so maybe I could implement a printer parameter for draw that takes a function with the results of the draw as input and prints it out/returns a string?

I've recently (#1790) written up a guide to shrinking - here's the current version. You can skip the internals section at the end, but I think the rest is worth reading. Specifically:

  • Confirm that a minimal solution is really worth the extra work it will take. It may well if you'll learn something, or for a demo!
  • Towers of Hanoi and similar games are all about the dependencies on earlier moves, so it's not surprising that it doesn't shrink well.
    • As a specific tip, you could count a move as invalid if it reverses the previous move; or more generally if it puts the towers back in a previous state. That would exclude many non-minimal movesets by construction.
    • Using a strategy that can make any move, then filtering, may shrink better than only choosing among valid moves. The cost is that generation may be less efficient.

Hope that helps 😄

(closing the issue as I don't think there's any action we need to take in Hypothesis itself)

(closing the issue as I don't think there's any action we need to take in Hypothesis itself)

I've been thinking about the possibility of adding a “filtered sampling” strategy that allows end-users to use shrink-open techniques in a controlled and user-friendly way.

For suitable strategies, this should give better results than current post-filtering or ad-hoc approaches.

However, I can open a dedicated issue or PR once I've had a chance to experiment with this idea.

I actually asked @boustrophedon to file this, as I thought it was a nice example and it's a shame that it doesn't work out of the box. I don't think it's high priority, and the right thing to do on @boustrophedon's end probably is to change the implementation, but I think it'd be worth investigating what's going on here because it might point to something interesting.

I've been thinking about the possibility of adding a “filtered sampling” strategy that allows end-users to use shrink-open techniques in a controlled and user-friendly way.

I like this idea a lot FWIW.

@DRMacIver - I think a general solution to this would be to run something like the block programs over examples at each depth - deleting two adjacent rule calls (XX) would ensure that this found the minimal example.