hyperledger-labs/private-data-objects

Address enclave registry cache on restart in the PDO TP

bvavala opened this issue · 3 comments

Currently, when an enclave is registered, the TP creates a verifier (to verify enclave signatures) and caches it in the CCF app main object.
This makes the TP stateful across invocations.

While a stateless TP would be ideal, the cache is not a problem per se, so long as it is managed as intended (i.e., it caches fresh values and, when a value is not available, tries to retrieve or rebuild it from the persistent KVS).
This management needs revising: the TP assumes (for instance, when an enclave is added) that the cache is correctly populated. It checks neither whether a cached value is fresh with respect to what is stored in the KVS, nor whether the value is in the cache at all (it could be missing from the cache while still available in the KVS).
This does not raise issues in an always-up single-node deployment.

The main consequences that could trigger errors are:

  • if that single TP instance goes down, then the cache is lost and not rebuilt
  • in multi-node deployments, some endpoints may have no cache entry for an enclave, while others may hold stale values
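
For illustration only, the intended cache management described above (serve fresh values, rebuild from the persistent KVS on a miss) could be sketched as follows. This is a language-neutral sketch in Python, not the TP's actual C++ code: `Verifier`, `VerifierCache`, and the `kvs_get` callback are hypothetical names, and the KVS is modeled as a plain dictionary mapping an enclave id to its verifying key.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Verifier:
    """Hypothetical stand-in for the signature verifier built from an enclave's key."""
    verifying_key: str


class VerifierCache:
    """Read-through cache: the persistent KVS remains the source of truth."""

    def __init__(self, kvs_get: Callable[[str], Optional[str]]) -> None:
        # Hypothetical KVS lookup: enclave id -> verifying key (None if absent).
        self._kvs_get = kvs_get
        self._cache: Dict[str, Verifier] = {}

    def get(self, enclave_id: str) -> Optional[Verifier]:
        key = self._kvs_get(enclave_id)
        if key is None:
            # Not in the KVS: drop any leftover cache entry.
            self._cache.pop(enclave_id, None)
            return None
        cached = self._cache.get(enclave_id)
        if cached is not None and cached.verifying_key == key:
            # Fresh hit: reuse the (assumed expensive) verifier object.
            return cached
        # Miss (e.g. after a restart, or on another node) or stale entry: rebuild.
        verifier = Verifier(key)
        self._cache[enclave_id] = verifier
        return verifier


# Simulated scenarios, with a dict playing the role of the persistent KVS.
kvs = {"enclave-1": "key-A"}
node = VerifierCache(kvs.get)
first = node.get("enclave-1")        # miss: built from the KVS
second = node.get("enclave-1")       # hit: same object reused

restarted = VerifierCache(kvs.get)   # restart: cache is empty again
rebuilt = restarted.get("enclave-1") # rebuilt from the KVS instead of failing

kvs["enclave-1"] = "key-B"           # KVS value changes behind the cache
refreshed = node.get("enclave-1")    # stale entry detected and replaced
```

The sketch assumes that reading the key from the KVS on every lookup is cheap compared with constructing the verifier. If that does not hold, the freshness check would need a different mechanism.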

Note:

  1. The approach of caching the verifier was originally adopted because creating the signature verifier was an expensive operation in CCF.
    Question: is this still an expensive operation with newer CCF versions? If not, the cache could be removed.
  2. There is code available to either create a verifier each time, or to use a cached one. Such code could be reused.

This description is not at all clear.

What do you mean by "across invocations"? Invocations as in transactions (which is fine because CCF never restarts; the network must be persistent)? Or across multiple nodes in the CCF network? What is cached?

Across transactions on the same CCF app instance. The transaction completes, but the CCF app stays up. In particular, the main object is still valid once the transaction result returns to the user. Within this object, there is a map (see link), which is the cache I'm referring to.

Changed from bug to enhancement, since the requested change does not affect TP behavior in a single-node deployment. Without an appropriate fix, the issue could affect the liveness of a multi-node TP, but it should not affect safety even in a multi-node deployment.