uwplse/pumpkin-pi

Lifting large unpacked constants is very slow

tlringer opened this issue · 2 comments

There should be some optimization for this

I guess this is not really a bug, so much as a consequence that DEVOID's cache doesn't understand unpacked constants, and so when it sees projT1 x in a body, and x is the lifted version of y, but only the unpacked version of x defined by z := projT2 x has been lifted again, it doesn't understand that x has already been lifted and so it does all of that work from scratch.

Consequentially, lift, lift, lift, unpack, unpack, unpack is much faster for large constants than lift, unpack, lift, unpack, lift, unpack, since caching is much better in the first case.

So, e.g.:

Unpack __bst20' as __bst20.
Unpack __bst40' as __bst40.
Lift __bst _bst in __bst20 as _bst20'.
Lift __bst _bst in __bst40 as _bst40'.
Print __bst40.
(* projT2 __bst40' *)
Print __bst40'.
(* existT _ 
    (projT1 __bst20')
    (__Branch (projT1 __bst20') (projT1 __bst20') 2
              (projT2 __bst20') (projT2 __bst20')) *)
Print __bst20.
(* __bst20 = projT2 __bst20' *)

When DEVOID gets to lifting __bst40, DEVOID's cache has no clue that it has already lifted __bst20' when it lifted __bst20. That information is stored in the local cache, but not in the global cache, since we don't want to store every single constant that has been lifted already.

I'm not sure what a good way to improve caching for this is, but here are some bad ways:

  • Globally cache literally every constant (huge size explosion)
  • Have Unpack store the correspondence between constants globally, and use this information (but then lifting after Unpack is faster than lifting after unpacking by hand, which is weird)
  • Add a global "unpacked cache" for right projections that stores the info that __bst20 := projT2 __bst20' at lift time (when it goes to lift __bst20)

To use the unpacked cache regardless of how we construct it, when we go to lift something that references __bst20', the unpacked cache will understand that __bst20 is the unpacked version of __bst20'. Thus, lifting can substitute in existT _ (projT1 __bst20') __bst20 for __bst20'. Then supposed we are lifting __bst40:

 existT _ 
    (projT1 __bst20')
    (__Branch (projT1 __bst20') (projT1 __bst20') 2
                      (projT2 __bst20') (projT2 __bst20'))`

We can substitute __bst20 for projT2 __bst20':

 existT _ 
    (projT1 __bst20')
    (__Branch (projT1 __bst20') (projT1 __bst20') 2 __bst20 __bst20)

And then we can lift the cached versions of those instead.

Regardless, I think this an inconvenience but not too bad, so what I am going to do is add a bit to the README that notes we should unpack after lifting fully, and then plan to address this in the code after.

The large branch has code that is a WIP on this issue, including test cases both ways.