chipmunk-project/chipmunk

Single test fails on Mac passes on cyclops

Closed this issue · 8 comments

python tests/test_iterative_solver.py \ 
    IterativeSolverTest.test_blue_decrease_4_3_sub_cex_mode

On my Mac it outputs following

Total number of hole bits is 436
Sketch file is  blue_decrease_sub_stateless_alu_4_3_codegen_iteration_1.sk
Iteration #1
Synthesis succeeded with 2 bits, proceeding to verification.
Verification failed. Trying again.
Generating counterexamples of 2 bits.
Generating counterexamples of 3 bits.
Generating counterexamples of 4 bits.
Generating counterexamples of 5 bits.
Generating counterexamples of 6 bits.
Generating counterexamples of 7 bits.
Generating counterexamples of 8 bits.
Generating counterexamples of 9 bits.
Total number of hole bits is 436
Sketch file is  blue_decrease_sub_stateless_alu_4_3_codegen_iteration_2.sk
Iteration #2
FAILURE: Failed synthesis at 2 bits.
F
======================================================================
FAIL: test_blue_decrease_4_3_sub_cex_mode (__main__.IterativeSolverTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/test_iterative_solver.py", line 54, in test_blue_decrease_4_3_sub_cex_mode
    '4', '3', '10']),
AssertionError: 0 != 1

----------------------------------------------------------------------
Ran 1 test in 154.884s

FAILED (failures=1)

On Cyclops

Total number of hole bits is 436
Sketch file is  blue_decrease_sub_stateless_alu_4_3_codegen_iteration_1.sk
Iteration #1
Synthesis succeeded with 2 bits, proceeding to verification.
blue_decrease_sub_stateless_alu_4_3_sol_verify_iteration_1.sk passed syntax check.
SUCCESS: Verification succeeded.
.
----------------------------------------------------------------------
Ran 1 test in 988.720s

OK

This keeps failing on my Mac where passes on cyclops.
As far as a I can tell, on different machines, even with the same slv-seed, sketch would find
a different set of hole values, which would lead to different execution paths for chipmunk.

Mac sol_verify file: https://gist.github.com/taegyunkim/71e34965b672e5bad51f862dd22f780a
cyclops: https://gist.github.com/taegyunkim/5d1522acd87cd298d9f83005a4df367c

Giving --parallel option makes the test to pass on both machines.

It's possible that fixing slv-seed results in different behavior on different machines. This is a good catch, and we should look at it in more detail. Even with different behavior on different machines though, it should not change whether iterative_solver succeeds or not.

At the same time, it's unclear why the --parallel flag would fix the problem; it seems coincidental. Let's not merge #129. Instead, let's investigate this issue in depth. For now, let's run all experiments on a single machine.

It's interesting that salu_config values are the same on both sol_verify sketch files.

There is a bug in how we generate counterexamples. By adding an input_offset, we restrict the all inputs to be [2^n, 2^(n+1)), instead of [0, 2^(n+1)).

Consider following program,
https://gist.github.com/taegyunkim/72485453718a09bf7e79c16d45e00bdc

| StateAndPacket | program(| StateAndPacket | state_and_packet) {
  if (state_and_packet.pkt_0 * 2 == state_and_packet.pkt_1) {
    state_and_packet.state_group_0_state_0 = 1;
  } else {
    state_and_packet.state_group_0_state_0 = 0;
  }
  return state_and_packet;
}

After running synthesis in 2 bit input ranges, we should be able to find a counterexample like this
{'pkt_1': 4, 'pkt_0': 2} {'state_group_0_state_0': 0}

However, iterative_solver keeps finding nothing.

$> iterative_solver example_specs/times_two.sk example_alus/if_else_raw.stateful_alu chipc/templates/stateless_alu.j2 3 3 10
Iteration #1
Total number of hole bits is 192
Sketch file is times_two_if_else_raw_stateless_alu_3_3_codegen_iteration_1.sk
Synthesis succeeded with 2 bits, proceeding to verification.
Verification failed. Trying again.
Generating counterexamples of 2 bits.
Failed to generate counterexamples, z3 returned unsat
Generating counterexamples of 3 bits.
Failed to generate counterexamples, z3 returned unsat
Generating counterexamples of 4 bits.
Failed to generate counterexamples, z3 returned unsat
Generating counterexamples of 5 bits.
times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_1_bits_5.sk passed syntax check.
Failed to generate counterexamples, z3 returned unsat
Generating counterexamples of 6 bits.
times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_1_bits_6.sk passed syntax check.
Failed to generate counterexamples, z3 returned unsat
Generating counterexamples of 7 bits.
times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_1_bits_7.sk passed syntax check.
Failed to generate counterexamples, z3 returned unsat
Generating counterexamples of 8 bits.
times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_1_bits_8.sk passed syntax check.
Failed to generate counterexamples, z3 returned unsat
Generating counterexamples of 9 bits.
times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_1_bits_9.sk passed syntax check.
Failed to generate counterexamples, z3 returned unsat

I have simplified how we generate counterexamples in this branch: https://github.com/taegyunkim/chipmunk/tree/refactor-cex

Basically, instead of splitting the input bit ranges, and having separate sol_verify function, just use generate_counterexample to verify the solution with larger input bits (currently 10 bits). If the solution works, the we are done with compilation. If the solution doesn't work, then we'd get a proper counterexamples.

An example run with my fix goes like this.

$> iterative_solver example_specs/times_two.sk example_alus/if_else_raw.stateful_alu chipc/templates/stateless_alu.j2 3 3 10
Iteration #1
Total number of hole bits is 192
Sketch file is times_two_if_else_raw_stateless_alu_3_3_codegen_iteration_1.sk
Synthesis succeeded with 2 bits, proceeding to verification.
sketch times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_1_bits_10.sk --bnd-inbits=10 --slv-timeout=0.001 --beopt:writeSMT times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_1_bits_10.smt2
Verification failed, use returned counterexamples {'pkt_1': 4, 'pkt_0': 2} {'state_group_0_state_0': 0}
Iteration #2
Total number of hole bits is 192
Sketch file is times_two_if_else_raw_stateless_alu_3_3_codegen_iteration_2.sk
Synthesis succeeded with 2 bits, proceeding to verification.
times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_2_bits_10.sk passed syntax check.
sketch times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_2_bits_10.sk --bnd-inbits=10 --slv-timeout=0.001 --beopt:writeSMT times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_2_bits_10.smt2
Failed to generate counterexamples, z3 returned unsat
Compilation succeeded. Hole value assignments are following:

Using sol_verify to directly generate counterexamples is simpler; we should definitely consider that change. However, I am still not sure that that's the reason for the bug. Two questions.

  1. If we generate one cex in the range [2**(n-1), 2**(n)) for all possible n, which we were trying to do, doesn't that cover the entire range [0, 2**(M)), where M is the largest value of n. Are we missing some parts of the input range?

  2. In the output above of iterative_solver, why is this line printed only after counterexamples that are 5 bits or longer: "times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_1_bits_5.sk passed syntax check."

Using sol_verify to directly generate counterexamples is simpler; we should definitely consider that change. However, I am still not sure that that's the reason for the bug. Two questions.

  1. If we generate one cex in the range [2**(n-1), 2**(n)) for all possible n, which we were trying to do, doesn't that cover the entire range [0, 2**(M)), where M is the largest value of n. Are we missing some parts of the input range?

No, that's not the case. Let's say there are two variables 'a' and 'b'. We generate one cex in the range [2^(n-1), 2^n) for all possible n. However, what if there only is a counterexample such that 'a' is in [2^(n-1), 2^n) and 'b' is in [2^(m-1), 2^m), where m != n. This is what is happening with the given times_two.sk example.

We need to remove input offset passed in chipc/compiler.py#245.

FYI @XiangyuG

  1. In the output above of iterative_solver, why is this line printed only after counterexamples that are 5 bits or longer: "times_two_if_else_raw_stateless_alu_3_3_cexgen_iteration_1_bits_5.sk passed syntax check."

It is a completely separate issue and working as intended.

The 4 bits or shorter sketch files failed to generate counterexamples. Sketch could not solve those corresponding sketch files either.

If you look at chipc/sketch_utils.py#L12, given that the return code was not 0, it only raise an exception when there is a string 'Parse Program Error:'. Those sketch files didn't result in parse program error.

OK. That makes a lot of sense. Thanks for chasing this down. Can you create a PR with the simpler cex generator? I guess we can now remove the counter_example_generator function and use sol_verify for both verification and cex generation?

@taegyunkim: Can you make a PR based on your refactor-cex branch? You might need to rebase because master has changed. Would be good to merge your PR soon so that future experiments include this bug fix.