tulip-control/dd

Recursive function implementation


Hi Johnny et al,

I was looking to implement the following (still pseudo-code) function, which calculates the probability of a "top event" in a fault tree that has been converted to a BDD. This is for reliability research. Pr(.) represents a probability assigned to each variable (level). The probabilities of the nodes are then all calculated recursively, down to the leaves, as below. There's a memoization component, which I can leave for later.

My question is: do you have suggestions on implementing a memoization scheme to make it efficient?

Thanks again, Gui

from dd.autoref import BDD  # tulip-dd pure Python version

def p(v):
    # if at leaf nodes
    if v == bdd.true:
        return Pr[v.level]
    if v == bdd.false:
        return 1 - Pr[v.level]
    # pseudo-code block:
    # else if computed table has entry for v:
    #     return entry from the lookup table
    # else compute and store
    # end pseudo-code block
    # recursive formula:
    result = Pr[v.level] * p(v.high) + (1 - Pr[v.level]) * p(v.low)
    # InsertComputedTable(v, result)  (pseudo-code)
    return result

The algorithm sketch above seems similar to Fig. 10 in http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.6169 .

A Python dictionary mem could be used for memoization, for example:

# Copyright 2020 by California Institute of Technology
# All rights reserved. Licensed under BSD-3.
from dd.autoref import BDD


def compute_probability(u, var_probability, mem):
    # constant node ?
    if u.var is None:
        return 0 if u.negated else 1
    z = rectify(u)
    # memoized ?
    if z in mem:
        r = mem[z]
        r = flip(r, u)
        return r
    # recurse
    p = compute_probability(u.low, var_probability, mem)
    q = compute_probability(u.high, var_probability, mem)
    r = p + var_probability[u.var] * (q - p)
    # memoize
    mem[z] = r
    r = flip(r, u)
    return r


def rectify(u):
    return ~ u if u.negated else u


def flip(r, u):
    if u.negated:
        r = 1 - r
    return r


def test_probability_computation():
    bdd = BDD()
    bdd.declare('x', 'y')
    u = bdd.add_expr(r'x /\ y')
    var_probability = dict(x=0.5, y=0.6)  # of high edges
    mem = dict()  # cache for memoization
    prob = compute_probability(u, var_probability, mem)
    neg_u = bdd.add_expr(r'~ x \/ ~ y')
    assert neg_u == ~ u, (neg_u, u)
    prob_neg = compute_probability(neg_u, var_probability, mem)
    assert prob + prob_neg == 1, (prob, prob_neg)
    print(prob)


if __name__ == '__main__':
    test_probability_computation()

Yes, and also Fig. 7 of Rauzy 1993, https://www.sciencedirect.com/science/article/abs/pii/095183209390060C, which I think precedes it. But notice that my implementation is slightly different when reaching constant nodes (leaves).

    # if at leaf nodes
    if v == bdd.false:
        return Pr[v.level]

    if v == bdd.true:
        return 1 - Pr[v.level]

Could you explain the need to flip and rectify? I noticed my original implementation produced the complement of the probability I was seeking.

I also have to test for scalability with the memory table implementation. Do you think it should be able to handle 100s of variables? Or do I need to use the lower-level implementation to scale?

Each node in a BDD points to two nodes, high and low, corresponding to the values of the variable at that level. Each of these two pointers is a node reference. A node reference can be regular (as called in CUDD) or complemented. This notion of complement is the same as for external user references to nodes.

Only regular nodes are stored in a BDD manager. Nodes for the complement of each Boolean function that corresponds to a regular node need not be stored. They are represented by complemented references to regular nodes. Thus, in a BDD manager that implements complemented edges (i.e., complemented references), complementation takes constant time.

The function rectify is needed to obtain a regular node reference from a complemented node reference. The cache mem stores only regular nodes, because a complemented node reference corresponds to a complement of a Boolean function, whose probability can be computed in constant time from the probability of the corresponding "regular" Boolean function. So function rectify "uncomplements" a node reference, if it is complemented (negated).

The function flip does the opposite. If the node was complemented, then it changes the computed value (here it is a probability) from that corresponding to the regular node (regular Boolean function) to the value that corresponds to the complement node (complement Boolean function). The complement of a Boolean function is TRUE for variable assignments (outcomes) for which the regular Boolean function is FALSE, and vice versa. So the complement Boolean function describes the complement of the event described by the regular Boolean function. Thus, the probability of the complement Boolean function is 1 - r, where r is the probability of the regular Boolean function.
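For illustration, here is a minimal sketch of these notions, using dd.autoref together with the functions rectify and flip defined above (the variable name x is an assumption):

from dd.autoref import BDD

bdd = BDD()
bdd.declare('x')
u = bdd.add_expr('x')
w = ~ u  # a complemented reference to the same stored node
assert w.negated and not u.negated
assert rectify(w) == u  # "uncomplements" the reference
# probability of the complement event, in constant time:
assert flip(0.25, w) == 1 - 0.25
assert flip(0.25, u) == 0.25  # regular reference: value unchanged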

The memory table is mem. BDDs defined over hundreds of variables could be small or large. In applications that I have worked on, BDDs over hundreds of variables were usually large and required dd.cudd to work with them. The code above (#62 (comment)) works also with dd.cudd.

The dictionary mem will keep in memory a Function instance for each node of the BDD whose event probability is being computed. This could increase memory consumption (for thousands of nodes in a single BDD), though it may not be an issue for successfully completing the computation, assuming the machine has sufficient memory. An optimization is possible to save memory, by storing integers instead of Function instances in mem, similarly to how the module dd._copy works:

dd/dd/_copy.py

Lines 56 to 63 in 890d191

# rectify
z = _flip(u, u)
# non-terminal
# memoized ?
k = int(z)
if k in cache:
    r = cache[k]
    return _flip(r, u)
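Following that pattern, a hypothetical int-keyed variant of compute_probability could look as below (a sketch, assuming that int(z) is supported for node references, as the excerpt above suggests):

def compute_probability_int_keys(u, var_probability, mem):
    # constant node ?
    if u.var is None:
        return 0 if u.negated else 1
    z = rectify(u)
    # store an integer key instead of a Function instance
    k = int(z)
    # memoized ?
    if k in mem:
        return flip(mem[k], u)
    # recurse
    p = compute_probability_int_keys(u.low, var_probability, mem)
    q = compute_probability_int_keys(u.high, var_probability, mem)
    r = p + var_probability[u.var] * (q - p)
    # memoize
    mem[k] = r
    return flip(r, u)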

The lower-level Python implementation would not be used in this case, because of working with dd.cudd. The C implementation in CUDD would be used via the Cython module dd.cudd. The C implementation is more efficient than the pure Python implementation.

I tested your algo and it runs over 100 times faster than mine. So far, with 30 variables, it solved the problem in under 0.002 seconds. I'm not sure if that's just the memoization or the main recursive part; mine is recursive too, but I hadn't been able to implement the memoization. The results are the same.
Next I need to scale it to 100 variables and see how it does. Tulip is indeed a great tool for BDD analysis.
Many thanks, Gui

When attempting to use this version with 2000 variables, I eventually ran into a:

RecursionError: maximum recursion depth exceeded in comparison

Is this something dd.cudd could deal with? Or is this inherent to Python, so that I have to go to C/CUDD?

The implementation above (#62 (comment)) is recursive, and Python has a builtin limit on recursion that can be changed, as described below. Any recursive function can be written as a non-recursive function, so in principle this error can be avoided by rewriting the function. Alternatively, the recursion limit can be changed as described below.
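For example, a minimal non-recursive sketch with an explicit stack (a hypothetical rewrite of compute_probability above, reusing rectify and flip, with an assumed helper _value for reading child values):

def compute_probability_iterative(u, var_probability, mem):
    # post-order traversal with an explicit stack, no recursion
    stack = [u]
    while stack:
        w = stack.pop()
        # constant node, or value already memoized ?
        if w.var is None or rectify(w) in mem:
            continue
        pending = [
            v for v in (w.low, w.high)
            if v.var is not None and rectify(v) not in mem]
        if pending:
            # revisit w after its children have been computed
            stack.append(w)
            stack.extend(pending)
            continue
        p = _value(w.low, mem)
        q = _value(w.high, mem)
        mem[rectify(w)] = p + var_probability[w.var] * (q - p)
    return _value(u, mem)


def _value(v, mem):
    # read the value of a node reference from the table mem
    if v.var is None:
        return 0 if v.negated else 1
    return flip(mem[rectify(v)], v)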

A workaround for the RecursionError due to the recursion depth being exceeded could be to increase the recursion limit. The current value of the limit can be found with (documented here):

import sys

old_recursion_limit = sys.getrecursionlimit()
print(old_recursion_limit)

and the recursion limit can be changed with (documented here):

new_recursion_limit = 5000
sys.setrecursionlimit(new_recursion_limit)

where new_recursion_limit is a suitable value larger than old_recursion_limit.

There can be an implementation-imposed limit to recursion in C, as discussed here, and measured here.

There is a limit to how many variables can be defined in CUDD. From the CUDD User's Manual:

On a machine with 32-bit pointers, the maximum number of variables is the largest value that can be stored in an unsigned short integer minus 1. The largest index is reserved for the constant nodes. When 64-bit pointers are used, the maximum number of variables is the largest value that can be stored in an unsigned integer minus 1.
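Concretely, assuming the typical sizes (16-bit unsigned short, 32-bit unsigned int), that is 2^16 - 2 = 65534 variables with 32-bit pointers, and 2^32 - 2 = 4294967294 variables with 64-bit pointers.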

Using a cdef function implemented in Cython seems to be an alternative, as experimented with here.

Please feel free to open another issue.

Hi Ioannis,
Sorry for the scope creep! But can I ask one last question on this thread? If I had two quantities that I was simultaneously computing recursively, could I store the two together in the memo cache? How would I do so, for example, in the pseudo-code below (A, v)?
[image attachment: Capture_Algo_Freq, a screenshot of the pseudo-code]

Thanks,
Gui

If there are two quantities being computed that depend on nodes, the tuple of their values is a function of the node, so it could be stored in the cache.
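For example, a minimal sketch of the change (hypothetical names, following the code above):

    # memoize the tuple of both quantities under one node key:
    mem[z] = (r, v)
    # and on lookup:
    r, v = mem[z]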

Do the flip and rectify procedures apply the same to any other variable depending on the nodes as well?

If there is a function that maps the value of the quantity being computed from complement nodes to "positive" nodes (only these are stored in the implementation, and complemented edges represent the complement nodes), then this function can be used inside flip. In the example above (#62 (comment)) the quantity is a probability r, and this function is 1 - r (the probability of the complement of an event is one minus the probability of that event). The function rectify would be unchanged.

If there is a quantity with no such function (among the quantities being computed), then the flip and rectify functions would not be used. In this case, the memoization cache would have complemented edges (the u) as keys. Also, computation of the quantity's value for a complemented edge (a u with u.negated is True, representing a complement node) would need to take into account the u.negated to distinguish from the case with u.negated is False (a "positive" node). The reason is that the successor nodes u.low and u.high are the same whether u.negated is True or False.
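Using the probability example just to illustrate the mechanics, a hypothetical edge-keyed sketch could look as follows (assuming, as in CUDD, that u.low and u.high refer to the children of the regular node):

def recurse_prob_edge_keyed(u, var_prob, mem):
    # constant node ?
    if u.var is None:
        return 0 if u.negated else 1
    # complemented edges themselves serve as cache keys
    if u in mem:
        return mem[u]
    # recurse (children of the regular node)
    p = recurse_prob_edge_keyed(u.low, var_prob, mem)
    q = recurse_prob_edge_keyed(u.high, var_prob, mem)
    # value for the regular node
    r = p + var_prob[u.var] * (q - p)
    if u.negated:
        # take u.negated into account, instead of calling flip
        r = 1 - r
    mem[u] = r
    return r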

I tried below for the tuple function, but I am getting an error message that I'm trying to subtract a tuple from an integer in the line r = p[0] + prob[u.var] * (q[0] - p[0]). I'm not sure why, since I'm indexing the tuples.

def recurse_prob(u, prob, failr, mem):
    # constant node ?
    if u.var is None:
        return (0,0) if u.negated else (1,0)
    z = rectify(u)
    # memoized ?
    if z in mem: 
        (r,v) = mem[z]
        r = flip(r, v, u)
        return r,v
    # recurse
    p = recurse_prob(u.low, prob, failr, mem)
    q = recurse_prob(u.high, prob, failr, mem)
    
    r = p[0] + prob[u.var] * (q[0] - p[0])
    v = (1-prob[u.var])*failr[u.var]*(q[0]-p[0])+(1-prob[u.var])*q[1]+prob[u.var]*p[1]
    print(r,v)
    # memoize
    mem[z] = (r,v)
    (r,v) = flip(r, v, u)
    return r,v

def rectify(u):
    return ~ u if u.negated else u

def flip(r, v, u):
    if u.negated:
        r = 1 - r
        v = 1 - v
    return r,v

It appears that the lines

        r = flip(r, v, u)
        return r,v

result in returning a tuple (r, v) where r is itself a tuple, because the function flip returns a tuple. So when p is found to be memoized and returned, p[0] is a tuple, hence the error about attempting to subtract a tuple from an integer. Changing the first of these lines to r, v = flip(r, v, u) is expected to avoid this error.
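That is, the memoized branch would hypothetically become:

    if z in mem:
        (r, v) = mem[z]
        r, v = flip(r, v, u)  # unpack the tuple that flip returns
        return r, v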

Works like a charm, as expected. Thanks once more.

For what it's worth, I was able to implement the function without "flip" and "rectify". I also used a different constant node comparison. It seems to work. Does the following make sense to you, @johnyf ?

def recurse_prob(u, var_prob, mem):
    # constant node ?
    if (u == bdd.false):
        return (1,0)
    if (u == bdd.true):
        return (0,0)
    # memoized ?
    if u in mem: 
        r,v = mem[u]
        return r,v
    # recurse
    q = recurse_prob(u.low, var_prob, mem) #false
    p = recurse_prob(u.high, var_prob, mem) # true
    #r = p[0] + var_prob[u.var] * (q[0] - p[0])
    r = (1-var_prob[u.var])*q[0]+var_prob[u.var]*p[0]
    v = (1-var_prob[u.var])*failr[u.var]*(q[0]-p[0])+(1-var_prob[u.var])*q[1]+var_prob[u.var]*p[1]
        
    # memoize
    mem[u] = r,v
    #r = flip(r, u)
    return r,v

Yes, this makes sense. The memoization cache is used in this case to store per edge information, instead of per node (for each stored node there exist at most two edges: one complemented, the other "positive"). As a result, this cache may grow larger in size.

The constant node comparison is more expensive than the comparisons u.var is None and u.negated, because it instantiates an object dd.cudd.Function to create bdd.false and bdd.true (almost) each time recurse_prob is called.
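For example, the same constant cases written with the cheaper attribute tests (a sketch that preserves the values above; whether those values are swapped is discussed next):

    # constant node ? (no Function objects are constructed)
    if u.var is None:
        return (1, 0) if u.negated else (0, 0)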

I think that the constant cases have been swapped compared to above (#62 (comment)), with probability 1 returned for the event FALSE. The previous version would be:

if (u == bdd.false):
    return (0, 0)
if (u == bdd.true):
    return (1, 1)

(Assuming that the terminal case for the second computed quantity is 1 for TRUE, because 0 seems not to agree with how flip is defined above (#62 (comment)). In other words, applying flip to the terminal node would return 1 - 1 = 0 for FALSE.)

Hi @johnyf ,

What I tried above isn't working well. It miscalculates both r and v in certain cases. Now I'm trying to revert to your implementation with flip and rectify (below). However (in some cases, not all), I get an "invalid index to scalar variable" error at r = p[0] + var_prob[u.var] * (q[0] - p[0]), because p becomes a scalar; I'm not sure why... When it works, it seems to produce the correct results.
Can you identify anything wrong below? G

def recurse_prob(u, var_prob, mem):
    # constant node ?
    if u.var is None:
        return (0,0) if u.negated else (1,0)
    z = rectify(u)
    # memoized ?
    if z in mem: 
        (r,v) = mem[z]
        (r,v) = flip(r,v, u)
        return r
    # recurse
    p = recurse_prob(u.low, var_prob, mem)
    q = recurse_prob(u.high, var_prob, mem)
    r = p[0] + var_prob[u.var] * (q[0] - p[0])
    v=(1-var_prob[u.var])*failr[u.var]*(q[0]-p[0])+(1-var_prob[u.var])*p[1]+var_prob[u.var]*q[1]
    # memoize
    mem[z] = (r,v)
    (r,v) = flip(r,v, u)
    print(z.var,(r,v))
    return (r,v)

def rectify(u):
    return ~ u if u.negated else u

def flip(r,v, u):
    if u.negated:
        r = 1 - r
        v = 1 - v
    return (r,v)

Oops, I can answer my own question. There's a return r instead of return r,v.

Thanks!
G