Reasoning about remainders
Opened this issue · 3 comments
As part of exploring CSE in Kontrol, a few issues have come up. Specifically:
- The backend exhibits spurious branching because it's not able to prove a large remainder (11 branches) UNSAT. An initial investigation with @geo2a revealed that the satisfiability of the remainder is not checked with respect to the current path constraint. The corresponding bug report is attached: it exhibits a 9-way branching when only an 8-way branching is expected).
- The remainders are not minimised, in the sense that even though some branches have been cut as infeasible, we still get the negation of their constraints in the remainder (in the example, the branching is initially 11-way and then three branches are discarded).
- The remainder is not propagated back using
rule_predicate
, which leads to a mischaracterisation of the branching bypyk
, and causes an infinite loop in combination with 1).
From what I understand, the remainders are initially of the form #Not ( ... #Or ... )
, which is then transformed into #Not (...) #And #Not (...)
. I believe that there is room for further simplification or minimisation at the #Or
stage, perhaps even Z3 could be asked to minimise this before the transformation to #And
.
With the georgy/booster-remainders branch, Booster can now apply the rules from the bug report, but it still produces a 9-way (instead of 8-way) branch. I.e. the behaviour is on-par with Kore. I have some thoughts on why this happens and also some wants about the rules. I start with the wants and then describe a possible transformation to the rule format that may help us to prune the remainder.
Format of CSE rules
-
Suggested changes to CSE rules
- the rules must preserve definedness and should be marked as such, otherwise Booster will not be able to apply them
- the rules need to have IDs. I do not think it matters much if they are not actually UniqueIDs, but it is very annoying to not have them, as it becomes very hard to disambiguate the rules in the backend logs
-
Experiment: shift information to the side condition
- the rules have function symbols in the LHS, which makes them invalid rewrite rules and more akin to simplifications.
- This actually may be the reason why the remainder condition is over-approximating. Once we've matched with the rule and checking the side condition (or checking the remainder after applying several rules), we have already lost the information. Perhaps if we had the matching information when computing the remainder, it would not over approximate.
- With that in mind, we should consider transforming the rules in the following way:
- bind the expressions in the LHS to a variable
- add an equality (likely
==K
should work) constrains
This transformation of the rule may just work and the rules apply without a remainder.
We can of course extend the matching code in Booster to allow (total) function symbols in LHS of rewrite rules (experiment in this PR), but this does not actually solve anything, as we'd not have enough information to prune the remainder.
- the rules have function symbols in the LHS, which makes them invalid rewrite rules and more akin to simplifications.
CSE rules are specifically designed to have only symbolic variables on the LHS, with the exception of +Bytes
and map concatenation. If you see any other ones, please let me know.
+Bytes
are fine, we have an excellent array of rules that does unification on them, which is deterministic. Concatenation of maps is something to be implemented.
In addition, my experience tells me that we should never be introducing additional constraints into the path condition unless absolutely necessary, and this is not necessary for +Bytes
and should not be necessary for map concatenation.
The remainder condition that we are dealing with is over-approximating because in the path constraints, some symbolic variables are initialised in some of the branches, causing the parameters of function symbols to be computed partially or even the function to be fully evaluated. This issue, I believe, is not related to what's in the structural part.