filecoin-project/go-amt-ipld

Collapse sparse AMTs

Stebalien opened this issue · 8 comments

Currently, the height of an AMT is always log(last_index)/log(8). Unfortunately, every lookup requires height lookups. Ideally, we'd collapse degree-1 trees by allowing nodes to have a height greater than 1. The downside is an extra byte to record this height in each node.

rvagg commented

Collapsing is a bit hard when we don't store the index with the elements. We can't know without being able to slice an index into all of its components (3 bits per level for this width=8 impl) which particular entry we're at. When the height gets collapsed, we lose track of which pieces of an index we need to discard.

If we stored index along with the value ([index,value] at each leaf) then we could do a check when we reach a point where the full-depth is short-circuited to see if the index we care about is the index found or not (not the same? then the one you want doesn't exist). This would add an int+array for each terminal element being stored in the AMT, which might not be too much of a cost to pay if it means ditching a bunch of intermediates.

Another option is to inline-compact, maybe a rule where if branch only has a single element (or some other number), then don't link, inline it. So you'd end up with nodes that look like (conceptually): [bmap,[CID, CID, [bmap, [entry]], CID]]. So you retain the nested array structure giving you the depth information that you can use to determine the final index. The algorithm would remain stable, you just skip the traversal and end up paying the costs of inlining the full structure but avoid the cost of a CID (or many, because you could nest a lot of these, [[[bmap,[entry]]]]).

Costs everywhere so it'd depend on the nature of the data being stored whether these are worth it.

rvagg commented

Here's an algorithm which might work:

Any height>0 node that only addresses a single terminal element within its sub-tree gets that element inlined in its parent along with the index of that element.

So you'd end up with intermediate nodes that had both links to sub-trees as well as inlined values, (conceptually) like: [CID, CID, [1820422,<value>],CID]. Where the only indexes we need to store are ones at height>0. And you'd get to deal with perverse cases like a single large indexed element only needing a single node for storage (rather than all the blocks needed to form full height).

So as you navigate down through the structure and you hit one of these things that's an array rather than a CID (so these become a kinded union of link and array), you just check the index against the one you're traversing to and that tells you whether this is the entry you care about or if that entry doesn't exist. The canonical form of the data structure would require compaction whenever a deletion created a situation where there was a leaf with only one element (accounting for that makes this a little awkward because you'd need to perform some traversal to discover the full count, but because all sub-trees are like this the traversal would be minimal). Then when you perform insertions and hit one of these compacted nodes you'd need to push it maximally downward until it's on its own or is at height=0.

So there's some funky testing required to ensure that such an algorithm could remain stable and result in canonical form for any given set of data regardless of how you got there.

I was thinking of something a bit simpler. Given a chain of degree one nodes:

     root
    /    \
a: v   b: o 
           \
         c: o
             \
           d: vvv

Collapse the chain as follows:

     root 
    /    \
a: v   d: vvv (height +2)

By encoding "height +2" into node "d", we can avoid having to actually include nodes "b" and "c".

rvagg commented

But I think that it's differentiating from structures like this that's the problem with the simple compaction:

     root
    /    \
a: v   b: o 
         /
      e: o
         \
       f: vvv

The whole index needs to contribute to locating the item, otherwise you can't be sure whether you addressed the right one.

     root 
    /    \
a: v   d: vvv (height +2)

An index 011000000111 where that initial root->d selection is the first part 011 and the d->v selection is the last part 111 would also be matched by a look-up for 011111111111, so you need a way to consider the missing middle bit, or at least check whether the end you've reached is the one you want. When it's fully spanning you can assert that because you're never discarding pieces of data. With a HAMT you are discarding pieces of the hash because you can reach the end before exhausting the hash, which is why we need to store the key with the value and perform the key<>key check. We'd need a mechanism to do the same here.

What if we just stored skipped bits in the node?

rvagg commented

Yeah, that might be nice and explicit, and potentially more space efficient too? So for loading an entry in an existing AMT it'd be something like: if we load a block and it has an optional "bits" field with a value, then interpret them as skipped bits in the index, otherwise just assume the normal height++ bit increment.

Yeah, although we'd need to specify the number of bits. Alternatively, we could always include a prefix and a "mask".