Don't store operation contexts in the KV
plietar opened this issue · 1 comments
In #99, we introduced long-running asynchronous operations, which we use to perform DID resolution.
Asynchronous operations are implemented as two transactions, one which triggers the operation in response to a client request, and a second which completes the operation, in response to a request to the callback endpoint.
On the first request, we need to serialize the operation's state (in this case, the COSE claim that is being submitted) and propagate it to the second request one way or another. At present, we achieve this by writing the operation context into the KV, in the operations table, but this has some downsides: it pollutes the ledger with unverified data (even if in a different table), and it requires us to make a historical query to get it back.
We can replace this mechanism by passing the context data to the DID fetch script and having the script hand it back to us in the operation's callback.
Passing context to the fetch script
We currently pass the requested URL and nonce to the fetch script as command-line arguments. This is suitable for small values, but a 1 MB COSE claim (plus the base64 overhead) would exceed the kernel's limit on command-line length. The other option is to pass the context on the fetch script's standard input. This is not yet supported by CCF's host process interface, but CCF can be extended to support it.
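As a rough illustration of the stdin approach (not the actual CCF host process interface, which is C++), the following Python sketch passes a large context to a child process on standard input while keeping the URL and nonce on the command line. The `CHILD` script is a hypothetical stand-in for the fetch script, and `run_fetch` is a name invented for this example; the real script would POST to the callback endpoint rather than print JSON.

```python
import base64
import json
import subprocess
import sys

# Hypothetical stand-in for the fetch script. It takes the URL and nonce
# as arguments, reads the opaque context from stdin, and hands the context
# back unchanged (base64-encoded) in its result.
CHILD = r"""
import base64, json, sys
url, nonce = sys.argv[1], sys.argv[2]
context = sys.stdin.buffer.read()
json.dump(
    {"url": url, "nonce": nonce,
     "context": base64.b64encode(context).decode("ascii")},
    sys.stdout,
)
"""


def run_fetch(url: str, nonce: str, context: bytes) -> dict:
    """Launch the child with small values as arguments and the
    (potentially large) context on standard input."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, url, nonce],
        input=context,          # written to the child's stdin
        capture_output=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Because the context travels over a pipe rather than in the argument vector, its size is not constrained by the kernel's command-line limit.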
Integrity checks
We would likely still need to check the integrity of the context, to make sure the external process has not tampered with it. The easiest way to do this would be to take a hash of the context and store it in the KV. The hash is much smaller than the context itself and can therefore be kept in memory by the existing indexing strategy, allowing immediate access.
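The hash-based check could look roughly like the sketch below. It is only an illustration in Python: `OperationIndex` and its methods are hypothetical names, and a plain dictionary stands in for both the KV write and the in-memory index that would be built from it.

```python
import hashlib
import hmac


class OperationIndex:
    """Hypothetical in-memory map from operation ID to context digest."""

    def __init__(self) -> None:
        self._digests: dict[str, bytes] = {}

    def record(self, operation_id: str, context: bytes) -> bytes:
        # First transaction: only the 32-byte digest is stored,
        # not the context itself.
        digest = hashlib.sha256(context).digest()
        self._digests[operation_id] = digest
        return digest

    def verify(self, operation_id: str, context: bytes) -> bool:
        # Callback transaction: recompute the digest of the context
        # returned by the external process and compare it to the stored one.
        expected = self._digests.get(operation_id)
        if expected is None:
            return False
        actual = hashlib.sha256(context).digest()
        return hmac.compare_digest(expected, actual)
```

A callback whose context fails `verify` would be rejected before the claim is processed any further.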
Another option for integrity would be to attach an HMAC to the context, computed with a secret key that is known only to the ledger. With this, the callback endpoint would not need to look up any state from the previous operation at all. While this is reasonably easy to do at a single-node level, it might need help from CCF to manage secrets across the cluster.
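For comparison, a single-node version of the HMAC option might be sketched as follows. The `seal` and `unseal` helpers are hypothetical names, the tag-then-context layout is an arbitrary choice for this example, and key distribution across the cluster is deliberately left out.

```python
import hashlib
import hmac
import secrets

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256


def seal(key: bytes, context: bytes) -> bytes:
    """Prefix the context with an HMAC-SHA256 tag computed under `key`."""
    tag = hmac.new(key, context, hashlib.sha256).digest()
    return tag + context


def unseal(key: bytes, sealed: bytes) -> bytes:
    """Check the tag and return the context; raise if it was modified."""
    tag, context = sealed[:TAG_LEN], sealed[TAG_LEN:]
    expected = hmac.new(key, context, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("context failed integrity check")
    return context
```

Here the callback needs only the key, not any per-operation state, which is the trade-off described above.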
With #136, we now have the option of passing data on the fetch script's standard input, which allows us to implement this.
We've also decided that using an HMAC to check the context's integrity would be over-engineered, so we will stick to storing a hash in the KV and keeping it in memory using an indexing strategy.