IPLD Resolver `patch` API
We've discussed a patch API, but we haven't settled on how it gets implemented and/or if it gets implemented at this level of IPLD.
tl;dr the patch API would enable setting values within an IPLD path, bubbling up the changes and returning the new root CID, very much like the object patch API, but for any IPLD Format. That means interface-ipld-format would have to have a local-scope patch function too, one that knows how to traverse objects.
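To make the "bubbling up" concrete, here is a minimal sketch, assuming a toy in-memory store, fake string CIDs, and that every intermediate path segment crosses a link; none of the names below come from interface-ipld-format or the resolver, they only illustrate the shape such a patch could take.

```typescript
// Toy sketch: plain objects as nodes, JSON strings as stand-in CIDs.
type Cid = string
type Node = Record<string, unknown>

const store = new Map<Cid, Node>()

// Stand-in for serializing + multihashing a block; a real resolver would use
// the format's serializer.
function put(node: Node): Cid {
  const cid = 'cid:' + JSON.stringify(node)
  store.set(cid, node)
  return cid
}

// Set `value` at `path` under `rootCid`. Every node along the path gets
// rewritten, so the change bubbles up and a new root CID is returned.
function patch(rootCid: Cid, path: string[], value: unknown): Cid {
  const node: Node = { ...store.get(rootCid)! }
  const [head, ...rest] = path
  node[head] = rest.length === 0 ? value : patch(node[head] as Cid, rest, value)
  return put(node)
}

// Usage: the old root stays untouched, a new root CID comes back.
const leaf = put({ c: 1 })
const root = put({ a: leaf })
const newRoot = patch(root, ['a', 'c'], 2)
```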
Considerations:
- These can be very expensive operations: the longer the path, the more nodes it touches (more network round trips, more disk accesses). Just like MFS, we need to account for flush/no-flush options (see the sketch after this list)
- It may or may not be important to have, at this level, an API to commit/revert (very much like what @wanderer built for Ethereum with https://github.com/ethereumjs/merkle-patricia-tree)
- Note that this API is inevitably going to exist; the question is whether we can make something that serves all the potential use cases, or whether it gets built application by application (just like MFS has its own)
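As a rough illustration of the flush/no-flush consideration above, this is one way options could surface on a patch call. Every name and signature below is made up for illustration; it is not an existing API.

```typescript
// Hypothetical sketch only -- illustrates flush/no-flush, not an existing API.
interface PatchOptions {
  // If false, keep rewritten intermediate nodes in an in-memory cache instead
  // of writing each one to disk/network on every call.
  flush?: boolean
}

interface PatchApi {
  // Set `value` at `path` under `rootCid` and resolve to the new root CID.
  patch(rootCid: string, path: string, value: unknown, opts?: PatchOptions): Promise<string>
  // Write out anything still held in the cache; only needed after calls made
  // with { flush: false }.
  flush(rootCid: string): Promise<void>
}
```

Deferring the writes keeps the intermediate nodes in memory until an explicit flush, which cuts down the round trips and disk accesses mentioned above.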
> It may or may not be important to have, at this level, an API to commit/revert
I'm planning on replacing the merkle-tree with this merkle-trie (module naming help anyone?). In the new merkle-trie there is a simpler interface. All the commit/revert is replaced by a cache that can be copied instead of checkpointed. The copies don't affect each other, so you can use them instead of checkpoints. Then you flush the cache to the store when you reach the finished state.
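A minimal sketch of the copy-instead-of-checkpoint idea, with made-up names (this is not the merkle-trie module's API): copies of the write cache are independent, so dropping a copy plays the role of revert and flushing a copy plays the role of commit.

```typescript
// Sketch only: a write cache whose copies are independent, so copies can
// replace checkpoints.
class WriteCache {
  constructor(private entries = new Map<string, unknown>()) {}

  set(key: string, value: unknown): void {
    this.entries.set(key, value)
  }

  // A copy gets its own map, so writes to the copy never affect the original;
  // throwing a copy away acts like a revert.
  copy(): WriteCache {
    return new WriteCache(new Map(this.entries))
  }

  // Write everything to the backing store once you reach the finished state.
  async flush(store: { put(key: string, value: unknown): Promise<void> }): Promise<void> {
    for (const [key, value] of this.entries) {
      await store.put(key, value)
    }
  }
}
```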
> These can be very expensive operations: the longer the path, the more nodes it touches (more network round trips, more disk accesses)
yep! The way I'm handling this now is by creating a tree of operations to apply to the store. It is implemented here. The batch method works as follows:
- First it looks up the root vertex in the ops tree
- It recursively looks up all the changed sub-vertices of the root node in the ops trie
- Once a leaf op vertex is reached, the operation is applied to the real vertex and hashed
- Once all the sub-vertices return hashes, the current vertex is hashed, and so on
The end result is a tree of promises that resolves when all the work is done. I think it's kinda nice, and I would be willing to port it to the resolver directly if desired.
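For anyone following along, here is a rough sketch of that recursion with toy types; it is not the linked implementation, just an illustration of its shape.

```typescript
// Toy sketch of the batch recursion described above.
interface OpVertex {
  // For a leaf op: the fields to set on the real vertex.
  value?: Record<string, unknown>
  // For an inner op: the changed sub-vertices, keyed by link name.
  children: Map<string, OpVertex>
}

// Placeholder for serializing + multihashing a node.
async function hashNode(node: Record<string, unknown>): Promise<string> {
  return 'hash:' + JSON.stringify(node)
}

// Placeholder for fetching a real node from the store by its link.
function loadNode(link: unknown): Record<string, unknown> {
  return {}
}

async function applyOps(node: Record<string, unknown>, op: OpVertex): Promise<string> {
  // Leaf op vertex: apply the operation to the real vertex.
  const updated: Record<string, unknown> =
    op.children.size === 0 ? { ...node, ...op.value } : { ...node }

  // Inner op vertex: recursively resolve every changed sub-vertex; each child
  // resolves to its new hash, which replaces the old link in this node.
  await Promise.all(
    [...op.children].map(async ([name, childOp]) => {
      updated[name] = await applyOps(loadNode(updated[name]), childOp)
    })
  )

  // Once all sub-vertices have returned hashes, hash the current vertex; the
  // whole thing is a tree of promises that resolves when all the work is done.
  return hashNode(updated)
}
```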
@diasdavid I'm becoming more convinced we can't have a generalized API for this. It might be possible within a single encoding style (like just for cbor ipld, or just for protobuf dags), but doing things generically for all types is going to be really hard. We will have to implement the logic separately for each type (extending the interfaces), and even then it becomes hard to actually make things work exactly the way you want to. For example, does /a/1/c refer to 'c' inside a map that is '1' inside a map that is 'a' in some object, or is the '1' an array index? There's a bunch of weird edge cases...
@whyrusleeping it depends entirely on the underlying data model. Just like there's tree and resolve, you can make put/patch. Some will be well-defined; some will not.
```
# this makes sense
dag patch $cborobj/foo {}
dag patch $cborobj/foo/bar []
dag patch $cborobj/foo/bar/0 a
dag patch $cborobj/foo/bar/1 b
dag patch $cborobj/foo/bar/2 c
dag get $cborobj/foo/bar
# ["a", "b", "c"]

# this is ambiguous in cbor (map or array?) so it's fine to error from the codec
dag patch $cborobj/foo {}
dag patch $cborobj/foo/bar/0 a
# error: foo/bar is undefined (is bar a map or an array?)
```

This is what happens in general programming languages, and is already addressed generically for many formats. Extending it to IPLD is actually easier, because most data structures are constrained (so they can do even less).