Add heuristic for ordering formulae during computation

Question

Add heuristic for ordering formulae during computation

ondrej33 opened this issue a year ago · 1 comments

Currently, formulae trees are always evaluated by traversing the tree branches recursively from left to right. Similarly, when given several formulae, they are always evaluated in a given order. This is usually not a problem if we only expect the final result. However, in future, it might be useful to obtain partial results for some (sub-)formulae during the computation. Therefore, it may prove useful to order the sub-formulae with respect to their "complexity" and evaluate simpler sub-formulae first (i.e., "AX p" could be considered simpler than "AF p").

To achieve this, we need to:

devise heuristic(s) to order sub-formulae
implement a preprocessing step to apply the heuristic
modify the main algorithm (and other relevant parts, such as duplicate checking) to respect the ordering

For now, I see two approaches how to do it (there are surely more):

A simple version that preserves the recursive tree evaluation and can be easily implemented. When evaluating binary nodes, there is a decision of which sub-tree to choose first. The heuristic can determine which of the two sub-branchs will be evaluated first. This kind of ordering could be pre-computed or done on the fly.
A more complex version that would need the evaluation algorithm to be rewritten. First, we can order all sub-formulae. We could then traverse the tree from the bottom according to this order, starting from all terminals, and going up in various branches (alternating them). In this case, the tree's nodes could contain just the operator + index + indices of children. A pre-computed dictionary would guide us in which nodes to jump to and evaluate.

In the far future, we could also extend this - we can determine sets of "unrelated" sub-formulae and evaluate them simultaneously.

Answer 1 · 2023-09-25T12:58:03.000Z

This raises many good points, I would just add one more note:

In many cases, we can actually speed-up a computation by restricting it to an already known subset of viable results. For example, take the formula (AX a) & (EF b). Here, we know that the result must satisfy both sub-formulas, yet based on the rationale given above, we know that AX a is probably "easier" to compute than EF b. Hence it makes sense to first verify AX a and then verify EF b only on the resulting states.

However, this adds two new issues:

First, this does not necessarily work in every case: if the result of AX a is a set that has a very poor symbolic representation, it can actually slow down the computation of EF, because the computation already starts with this "poor set".
This breaks formula caching is subtle ways, because now each formula is computed for a given "universe" of states, as opposed to the whole state space.

Some other notes:

This can be also applied to disjunction by verifying the "harder" operand only for states where the "easier" one does not hold.
This is also related to patterns like formal x in (formulaA): (formulaB). That is, we explicitly restrict the validity of a quantifier to a know set.