Stream Anchor capabilities when iterating over cluster view
lthibault opened this issue · 1 comments
The current host-anchor implementation introduces a fair bit of complexity that could (in time) be handled by capnp's 3PH.
@aratz-lasa In the short term, the proposed changes have a moderate but direct impact on downstream consumers, so I want to be sure we take the time to discuss this. In particular, I want to make sure we aren't going to undermine any of our existing projects (or that we have good workarounds until 3PH lands).
Problem
Consider the following code.
	it := n.Ls(ctx) // n is a client.Node
	for it.Next() {
		if host := it.Anchor(); hasSomeProperty(host) {
			doFoo(host.Walk("/foo"))
		}
	}
We can ignore the details of how we are selecting hosts, and what we are doing with them. The essential part is that we are performing doFoo on only some of the hosts in the cluster. Since we are not performing operations on every host, we do not need to create an Anchor capability for each host. This creates an opportunity for significant optimization because creating an Anchor capability involves two sub-operations that become costly at scale:
- establishing network connections between peers; and,
- mutating the cap table on both peers (amortized O(1) time, and O(n) space).
Current Solution and Limitations
Rather than stream Anchor capabilities back to the client, we stream records that contain routing information for each host in the global view. We then use this routing information to construct a special Anchor implementation that lazily dials its remote host when its methods are called for the first time. This is a "perfect" optimization, since it avoids both sub-operations unless the anchor capability is actually being used. It neither creates surplus network connections, nor modifies cap-tables needlessly.
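The lazy-dial idea can be sketched roughly as follows. This is an illustration only, not the code in pkg/client: Conn, Dialer, LazyAnchor and CountDials are hypothetical names, Conn stands in for rpc.Conn, and the use of sync.Once (which never retries a failed dial) is an assumption made to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

// Conn stands in for an established connection to a remote host.
// It is hypothetical; real code would wrap an rpc.Conn.
type Conn struct{ addr string }

// Walk pretends to resolve a path on the remote host.
func (c *Conn) Walk(path string) string { return c.addr + path }

// Dialer opens a connection to the host at addr.
type Dialer func(addr string) (*Conn, error)

// LazyAnchor carries only routing information until it is first used.
type LazyAnchor struct {
	addr string
	dial Dialer

	once sync.Once
	conn *Conn
	err  error
}

// connect dials at most once. A failed dial is remembered and never
// retried, which is a simplification made for this sketch.
func (a *LazyAnchor) connect() (*Conn, error) {
	a.once.Do(func() { a.conn, a.err = a.dial(a.addr) })
	return a.conn, a.err
}

// Walk triggers the dial on first call and reuses the connection afterwards.
func (a *LazyAnchor) Walk(path string) (string, error) {
	conn, err := a.connect()
	if err != nil {
		return "", fmt.Errorf("dial %s: %w", a.addr, err)
	}
	return conn.Walk(path), nil
}

// CountDials builds one LazyAnchor per address, walks only the selected
// ones, and reports how many connections were actually opened.
func CountDials(addrs []string, selected func(string) bool) int {
	dials := 0
	dial := func(addr string) (*Conn, error) {
		dials++
		return &Conn{addr: addr}, nil
	}
	for _, addr := range addrs {
		anchor := &LazyAnchor{addr: addr, dial: dial}
		if selected(addr) {
			anchor.Walk("/foo") // first use: dials
			anchor.Walk("/bar") // second use: reuses the connection
		}
	}
	return dials
}

func main() {
	addrs := []string{"host-a", "host-b", "host-c"}
	n := CountDials(addrs, func(addr string) bool { return addr == "host-b" })
	fmt.Println("dials:", n) // only host-b is ever dialed
}
```

Even in this reduced form, the anchor has to own connection state (once, conn, err), which is the state-management burden this issue is concerned with.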
The main drawback is that this "lazy-dial" approach involves extra state management. It requires us to
- track the connection state of each host anchor,
- maintain a reference to the underlying rpc.Conn after it has been established, and manage its lifecycle; and,
- intermix error-handling logic for the networking, session and application layers.
Overall, this approach comes at the cost of extra state-management, as well as a mild blurring of system boundaries. It also has the unfortunate effect of placing this extra complexity in high-level code (pkg/client rather than, say, pkg/vat/ocap).
Solution
If we are willing to tolerate additional load on the cap table, I think it is possible to both
- simplify the host dialing logic; and,
- move state-management to lower-level packages.
To do so, the host servicing the View RPC call need only associate an Anchor capability with each record streamed back to the caller. As before, superfluous network connections are avoided by lazy dialing. Dialing logic is simplified by delegating the construction and lifecycle-management of remote host-anchor capabilities to the cluster.Host type, which
- is simpler in its internals than the higher-level client.Node,
- is not encumbered with the task of abstracting over the Cap'n Proto RPC API,
- represents network vats as first-class objects; and,
- is the ideal layer at which to implement additional Host-level optimizations, for example: caching instances of rpc.Conn.
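As a rough illustration of the last point, a connection cache at the Host layer might look something like this. The Host type here is a hypothetical, stripped-down stand-in for cluster.Host, Conn is a placeholder for rpc.Conn, and ConnTo, Drop and Dials are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// Conn is a placeholder for rpc.Conn.
type Conn struct{ peer string }

// Host is a hypothetical, stripped-down stand-in for cluster.Host.
type Host struct {
	dial func(peer string) (*Conn, error)

	mu    sync.Mutex
	conns map[string]*Conn
	dials int
}

// NewHost returns a Host that opens connections with the given dial function.
func NewHost(dial func(peer string) (*Conn, error)) *Host {
	return &Host{dial: dial, conns: make(map[string]*Conn)}
}

// ConnTo returns the cached connection for peer, dialing on a cache miss.
// The lock is held while dialing, which serializes dials; a production
// version would likely want per-peer synchronization instead.
func (h *Host) ConnTo(peer string) (*Conn, error) {
	h.mu.Lock()
	defer h.mu.Unlock()

	if c, ok := h.conns[peer]; ok {
		return c, nil
	}
	c, err := h.dial(peer)
	if err != nil {
		return nil, err // failures are not cached, so the next call retries
	}
	h.dials++
	h.conns[peer] = c
	return c, nil
}

// Drop evicts a connection from the cache, e.g. after it has been closed.
func (h *Host) Drop(peer string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.conns, peer)
}

// Dials reports how many successful dials have been made so far.
func (h *Host) Dials() int {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.dials
}

func main() {
	h := NewHost(func(peer string) (*Conn, error) { return &Conn{peer: peer}, nil })
	a, _ := h.ConnTo("peer-1")
	b, _ := h.ConnTo("peer-1")
	fmt.Println(a == b, h.Dials()) // same connection, a single dial
}
```

Keeping the cache here means that callers such as client.Node never see connection state at all; they only ever hold capabilities.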
Caveats and Mitigation
Cap-Table Contention
As noted above, the proposed solution adds entries to both the sender's and the receiver's cap tables for each record that is transmitted by a call to Host.View().Iter(). In the worst-case analysis, this exhibits O(n) complexity both in overall memory usage and heap-object count. Note that the cap tables at both ends of an rpc.Conn are affected. This problem is however attenuated by
- the small size of clusters currently in production,
- flow-control properties of our batch-streaming API; and,
- the existence of passive mitigation strategies ranging from sync.Pool to the use of specialized data structures in the rpc.Conn cap table.
To this first point, we can expect the size of the cap table to stabilize on some asymptotic value for large routing tables as unused capabilities from previous batches are released. The exact value of this asymptote is likely a simple function of batch-size and network RTT. An arbitrary upper bound can therefore be enforced through go-capnp's existing flow-control API.
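The stabilizing behavior can be illustrated with a toy model. It is not go-capnp's flow-control API: CapTable and Stream are hypothetical, the window is assumed to be exactly one batch, and the consumer is assumed to release every unused capability from a batch before the next one is admitted.

```go
package main

import "fmt"

// CapTable is a toy model of one side's cap table: it only counts entries.
type CapTable struct{ live, peak int }

func (t *CapTable) add() {
	t.live++
	if t.live > t.peak {
		t.peak = t.live
	}
}

func (t *CapTable) release() { t.live-- }

// Stream delivers n records in batches of batchSize, adding one cap-table
// entry per record. Before each new batch is admitted, entries from the
// previous batch that the consumer did not keep are released. It returns
// the peak table size and the number of entries still live at the end.
func Stream(n, batchSize int, keep func(i int) bool) (peak, live int) {
	if batchSize < 1 {
		batchSize = 1 // guard against a non-terminating loop
	}
	var t CapTable
	var prev []int

	releaseUnused := func() {
		for _, i := range prev {
			if !keep(i) {
				t.release()
			}
		}
		prev = prev[:0]
	}

	for start := 0; start < n; start += batchSize {
		releaseUnused()
		end := start + batchSize
		if end > n {
			end = n
		}
		for i := start; i < end; i++ {
			t.add()
			prev = append(prev, i)
		}
	}
	releaseUnused()
	return t.peak, t.live
}

func main() {
	// A consumer that keeps nothing: the table never exceeds one batch.
	peak, live := Stream(1000, 16, func(int) bool { return false })
	fmt.Println(peak, live)
}
```

In this model the peak is bounded by the batch size plus the number of capabilities the consumer actually retains, regardless of n. Real behavior will also depend on network RTT, as noted above.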
More generally, full table scans are inherently O(n), so it's expected that applications will try to avoid this by filtering the view on the server-side, in a manner analogous to classical DB queries. To this end, our first line of defense is the enriched query API proposed in #36.
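To make the analogy concrete, here is a minimal sketch of server-side filtering. It does not reflect the API proposed in #36: Record, Selector, Where and Iter are hypothetical names, and the metadata-based predicate is an assumption made for illustration.

```go
package main

import "fmt"

// Record is a hypothetical routing record; the real schema may differ.
type Record struct {
	Peer string
	Meta map[string]string
}

// Selector decides whether a record belongs in the result set.
type Selector func(Record) bool

// Where matches records whose metadata field equals value.
func Where(field, value string) Selector {
	return func(r Record) bool { return r.Meta[field] == value }
}

// Iter models the server side of a view query. Only matching records are
// emitted, so only those would need an Anchor capability (and hence a
// cap-table entry). It returns the number of records sent.
func Iter(view []Record, match Selector, emit func(Record)) (sent int) {
	for _, r := range view {
		if match(r) {
			emit(r)
			sent++
		}
	}
	return sent
}

func main() {
	view := []Record{
		{Peer: "host-a", Meta: map[string]string{"zone": "eu"}},
		{Peer: "host-b", Meta: map[string]string{"zone": "us"}},
		{Peer: "host-c", Meta: map[string]string{"zone": "eu"}},
	}
	sent := Iter(view, Where("zone", "us"), func(r Record) {
		fmt.Println("sending", r.Peer)
	})
	fmt.Println("records sent:", sent)
}
```

The cap-table cost then scales with the size of the result set rather than the size of the cluster.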
It should lastly be noted that rpc.Conn is undergoing heavy development, and that opportunities for improving performance (e.g. through reduced lock contention) are almost certain to emerge.
Object Proxying and Third-Party Handoff
An important side-effect of the proposed refactoring is that all calls to anchors obtained via the View capability will be proxied through its host. In practice, this means proxying through the host to which a given client.Node is connected.
This is a perfect target for Cap'n Proto's "Third-Party Handoff" (3PH), which can transparently reduce the network path to a single hop. Level-3 RPC support in go-capnproto is planned, and implementation efforts are estimated to begin in Q1 of 2023.
In the meantime, the main factor to consider is that the proposed solution implies a commitment to 3PH in the medium-term future. The acute need for 3PH will manifest as application-level stability issues due to a single point-of-failure, and to a lesser extent as high latency due to the proxying of RPC calls.
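The single point of failure can be shown with a small model. None of these types exist in the codebase: Anchor, Gateway and the proxied wrapper are hypothetical, and the Up flag merely simulates whether the intermediate host is reachable.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrGatewayDown is returned when the proxying host cannot be reached.
var ErrGatewayDown = errors.New("gateway host unreachable")

// Anchor is a minimal, hypothetical view of the anchor interface.
type Anchor interface {
	Walk(path string) (string, error)
}

// remote models an anchor living on some other host in the cluster.
type remote struct{ name string }

func (r remote) Walk(path string) (string, error) { return r.name + path, nil }

// Gateway models the host that the client is connected to.
type Gateway struct{ Up bool }

// Proxy returns an anchor whose calls are relayed through the gateway.
func (g *Gateway) Proxy(dst Anchor) Anchor { return proxied{via: g, dst: dst} }

type proxied struct {
	via *Gateway
	dst Anchor
}

// Walk takes two hops: client to gateway, then gateway to destination.
// If the gateway is down, the call fails even though dst is healthy.
func (p proxied) Walk(path string) (string, error) {
	if !p.via.Up {
		return "", ErrGatewayDown
	}
	return p.dst.Walk(path)
}

func main() {
	gw := &Gateway{Up: true}
	anchor := gw.Proxy(remote{name: "host-b"})

	fmt.Println(anchor.Walk("/foo"))

	gw.Up = false // the gateway fails; host-b itself is still fine
	fmt.Println(anchor.Walk("/foo"))
}
```

With 3PH, the client would instead end up holding a direct reference to the destination, so losing the gateway would no longer invalidate anchors that have already been handed off.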
Another consideration is security. Presently, anchor.Capability is exported directly at the vat.Network level, in its own stream handler. This means that it can be arbitrarily bootstrapped by any peer capable of dialing the vat's underlying libp2p Host, which in turn makes it trivial to escape confinement to a particular anchor subtree.
Obviously this isn't a problem until we actually implement authentication for client capabilities, but it does influence the design decisions for #16. As noted in that issue, per-capability stream handlers may improve performance by taking full advantage of non-blocking QUIC streams. On the other hand, they almost inevitably increase the size of the auth boundary, since each libp2p protocol endpoint must be guarded individually.[1] For this reason, I am increasingly opposed to this approach.
To recap, the "pros" for streaming anchor capabilities directly from View are now:
- simplification of host-dialing logic
- migration of connection-state management out of high-level API code
- simplification of capability authentication
The "cons" are now:
- possible cap-table contention
- commitment to 3PH
- possible head-of-line (HOL) blocking due to multiplexing of RPC calls on a single QUIC stream
[1] I have become somewhat skeptical of my previous claim that this necessarily increases the ambient authority boundary. Each endpoint could, in principle, be guarded by some kind of capability-based access token, such that each token is derived from a single source of ambient authority. Whether or not users and administrators will actually respect this design principle is another question altogether.