rems-project/sail

SMT backend: performance issues when composing types and mappings

Opened this issue ยท 10 comments

The following Sail takes a long time to generate SMT with the latest sail2 branch, and generates a 1MB file when it does. Increasing the number of fields in MyStruct seems to cause things to grow exponentially:

default Order dec

$include <vector_dec.sail>
$include <option.sail>

struct MyStruct = {A : bool, B : bool}
enum MyType = {OptA, OptB}

mapping mytype_encdec : MyType <-> bits(1) = {
  OptA <-> 0b0,
  OptB <-> 0b1
}

mapping my_mapping : bits(1) <-> option((MyStruct, MyType)) = {
  0b0 <-> Some(struct {A = false, B = false}, mytype_encdec(0b0)),
  0b1 <-> Some(struct {A = true , B = true},  mytype_encdec(0b1))
}

$property
function prop(s : MyStruct, v : MyType) -> bool =
  my_mapping_backwards_matches(Some(s, v))

Sorry about the weird mix of enums, structs, and options: all of them seem to be required to demonstrate the full problem.

There's an optimisation I've been meaning to implement for a long time which I think will solve this. We need to detect when control flow constructs (if, match, etc) have no side effects and convert them to 'eager' variants, i.e. let x = if b then y else z should become let x = eager_if(b, y, z) when y and z are pure.

The problem is all the nested matches in the mapping logic is essentially compiled under the assumption that every branch might have some side effect, which produces a lot of control flow edges we don't need, and the main thing we want to optimise for when generating SMT is a reduction in control flow choices. Sometimes the simplifier can clean this up but when it fails to do so you just get a giant SMT problem.

Should be improved quite a bit by: #693

$ time sail --smt --smt-auto-solver z3 issue689.sail
Checking counterexample: prop.smt2
Solver found counterexample: ok
  s -> struct {A = false, B = false}
  v -> OptB
Replaying counterexample: ok
        0.15 real         0.12 user         0.03 sys
$ wc prop.smt2
      23     440    5158 prop.smt2

Awesome, thanks so much!

There were some issues with the simplification rules which could affect this kind of property, you might want #694 as well just in case.

Thanks! There still seems to be some unfortunate scaling in runtime: if I change the bits(1) to a bits(3) and gradually add mapping clauses, with four clauses it takes about 4 seconds, and with a fifth clause it takes > 20 minutes (until I killed it). At least it doesn't use as much RAM as it goes now though. Apologies if this is just too complicated for the backend: appreciate that I'm composing quite a lot of complicated features, although I have a suspicion this is the last obstacle to getting all the properties working.

If you post the slower variants I can take a look.

I think part of the problem is that each mapping application is guarded by a 'does this mapping apply' guard, and this means some of the matches aren't obviously complete. When this occurs there is a hidden 'match failure' branch which inhibits a bunch of simplifications from occuring.

default Order dec

$include <vector_dec.sail>
$include <option.sail>

struct MyStruct = {A : bool, B : bool}
enum MyType = {OptA, OptB}

mapping mytype_encdec : MyType <-> bits(1) = {
  OptA <-> 0b0,
  OptB <-> 0b1
}

mapping my_mapping : bits(3) <-> option((MyStruct, MyType)) = {
  0b000 <-> Some(struct {A = false, B = false}, mytype_encdec(0b0)),
  0b001 <-> Some(struct {A = false, B = true }, mytype_encdec(0b0)),
  0b010 <-> Some(struct {A = true , B = false}, mytype_encdec(0b0)),
  0b011 <-> Some(struct {A = true , B = true }, mytype_encdec(0b0)),
  0b100 <-> Some(struct {A = false, B = false}, mytype_encdec(0b1)),
}

$property
function prop(s : MyStruct, v : MyType) -> bool =
  my_mapping_backwards_matches(Some(s, v))

This is the >20 minute one. Commenting out the last clause takes about 4 seconds.

It seems to be spending all its time in Sail, so something is definitely going wrong. With even a single line:

0b000 <-> Some(struct {A = false, B = false}, mytype_encdec(0b0)), 

The mapping gets de-sugared to a 'core' functional representation like:

[bool]let [option((MyStruct, MyType))]head_exp# = [I/option((MyStruct, MyType))]arg# in
$[complete][bool]let [option(bool)]ga#3 = $[complete][option(bool)]match [I/option((MyStruct, MyType))]head_exp# {
  [option((MyStruct, MyType))]Some((struct { A = [bool('ex330#)]p0#, B = [bool('ex331#)]p1# }, [MyType]mapping0#)) if [bool]let [bool]ga#2 =
  [bool]let [bool]ga#1 = [bool]mytype_encdec_forwards_matches([I/MyType]mapping0#) in
    [I/bool]ga#1 && $[overloaded { "name" = "==", "is_infix" = true }][bool]eq_bool([I/bool('ex331#)]p1#, [bool(false)]false)
  in [I/bool]ga#2 && $[overloaded { "name" = "==", "is_infix" = true }][bool]eq_bool([I/bool('ex330#)]p0#, [bool(false)]false) => [option(bool)]let [bitvector(1)]ga#0 = [bitvector(1)]mytype_encdec_forwards([I/MyType]mapping0#) in
  $[complete][option(bool)]match [I/bitvector(1)]ga#0 {
    [bitvector(1)]b__0 if $[overloaded { "name" = "==", "is_infix" = true }][bool]zz7nzJzK1#eq_bits([I/bitvector(1)]b__0, [bitvector(1)][[bit]bitzero]) => [option(bool)]Some([bool(true)]true),
    _ if [bool]true => [option(bool)]None([unit]())
  },
  _ if [bool]true => [option(bool)]None([unit]())
} in
$[complete]$[mapping_match][bool]match [I/option(bool)]ga#3 {
  [option(bool)]Some([bool('ex338#)]result) if [bool]true => [I/bool('ex338#)]result,
  [option(bool)]None(_) if [bool]true => $[incomplete][bool]match [I/option((MyStruct, MyType))]head_exp# {
    _ if [bool]true => [bool(false)]false
  }
}

so with additional clauses the control-flow quickly becomes a rat's nest of complexity. I think the way it uses options to represent the intermediate match results is blocking any simplifications so I might try to work on that.

Part of the issue is we rather pessimistically assume all functions associated with mappings may have an incomplete match. Removing that assumption is currently unsound without a more accurate analysis of the mappings, but when combined with some other small improvements:

$ time sail --smt --smt-auto-solver z3 slow.sail
Checking counterexample: prop.smt2
Solver found counterexample: ok
  s -> struct {A = false, B = true}
  v -> OptB
Replaying counterexample: ok
        0.15 real         0.11 user         0.02 sys

#704 improves the generation to avoid the blowup in the SMT size caused by complex control flow by introducing intermediate variables for commonly duplicated parts of the path conditions. This isn't as dramatic a speedup for this particular problem as if we had accurate information about mapping completeness, but does still reduce the runtime considerably:

$ time sail --smt --smt-auto-solver z3 slow.sail
Checking counterexample: prop.smt2
Solver found counterexample: ok
  s -> struct {A = true, B = false}
  v -> OptB
Replaying counterexample: ok
        2.36 real         2.30 user         0.04 sys