[RFC] Create Autonomous Lattice Controller
autodidaddict opened this issue · 5 comments
This RFC submits for comment the proposal that we create an autonomous lattice controller responsible for managing and reconciling declarative, lattice-wide deployments.
Summary
The lattice control interface is an imperative interface. Manifest files, as they were originally introduced in the pre-0.5/OTP versions of wasmCloud, are also imperative. Manifest file components and lattice control interface commands alike instruct a single host to perform a single action. With this API, we can tell hosts to start and stop actors, start and stop providers, and we can even perform an auction in which we ask the entire lattice for a list of hosts matching a set of constraints. The purpose of this RFC is to propose a layer of autonomous operation above the imperative control interface.
Design Detail
The following are preliminary design details for the autonomous lattice controller, which will hereafter simply be called the lattice controller. The proposed design includes the creation of an application that is deployed into a single lattice. This application will be responsible for monitoring and interrogating the observed state of a lattice and, through the use of imperative APIs and other controls at its disposal, issuing the appropriate commands to reconcile the gap between observed state and desired state.
In the case of the lattice controller, the desired state is declared through a set of deployments. As each new deployment is submitted to the lattice controller, it will validate the deployment declaration and then begin managing that deployment through a control loop that consists of comparing the observed state against the desired state and issuing the appropriate commands to reconcile.
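As a rough illustration of that control loop, here is a minimal Rust sketch of the observe/compare/act cycle. All type and function names (`State`, `Command`, `reconcile`) are hypothetical placeholders for this example, not a proposed API.

```rust
// Hypothetical sketch of the observe/compare/act cycle; none of these types
// or functions are part of the actual lattice controller.
use std::collections::HashMap;

#[derive(Debug, Default)]
struct State {
    // actor reference -> total running instance count across the lattice
    actor_instances: HashMap<String, u32>,
}

// Each command maps onto an imperative lattice control interface call.
#[derive(Debug)]
enum Command {
    StartActor { actor: String, count: u32 },
    StopActor { actor: String, count: u32 },
}

/// Compare desired state against observed state and emit the commands
/// needed to close the gap between them.
fn reconcile(desired: &State, observed: &State) -> Vec<Command> {
    let mut commands = Vec::new();
    for (actor, &want) in &desired.actor_instances {
        let have = observed.actor_instances.get(actor).copied().unwrap_or(0);
        if want > have {
            commands.push(Command::StartActor { actor: actor.clone(), count: want - have });
        } else if want < have {
            commands.push(Command::StopActor { actor: actor.clone(), count: have - want });
        }
    }
    commands
}

fn main() {
    // Desired: 3 instances of a (placeholder) actor; observed: none yet.
    let desired = State { actor_instances: HashMap::from([("ACTOR_REF".to_string(), 3)]) };
    let observed = State::default();
    for cmd in reconcile(&desired, &observed) {
        println!("would issue: {cmd:?}"); // in reality, a control interface call
    }
}
```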
Observing the Lattice
Generating a cohesive view of observed state involves a number of resources, including:
- Lattice Event Stream - events emitted by hosts on the `wasmbus.evt.{prefix}` subject contain a number of events that, when applied to aggregates, can produce state. For example, events like `actor_started` and `actor_stopped` can be applied to an actor aggregate to update the actor's current state.
- Heartbeats - all hosts within a lattice emit heartbeats to the lattice. Each of these heartbeats contains an "inventory" of everything that's running within each host. This information can be used as "snapshot" style data that can be compared against computed state to compensate for things like message loss of individual events.
- Lattice Cache - every time a link definition, OCI reference map, or claims token is published into the durable lattice cache stream, consumers can be notified of these changes and thus compute the state of the distributed cache.
- Interrogations - periodically, the lattice controller can interrogate hosts or the lattice at large in order to confirm or refute observations about the state of managed deployments.
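For illustration, the Rust sketch below shows how events from the `wasmbus.evt.{prefix}` subject might be folded into an actor aggregate, with heartbeats applied as authoritative snapshots. The event and aggregate types here are assumptions for this example, not the actual event schema.

```rust
use std::collections::HashMap;

// Events observed on the lattice event stream (a subset, for illustration).
enum LatticeEvent {
    ActorStarted { host_id: String, actor: String },
    ActorStopped { host_id: String, actor: String },
    // A heartbeat carries a full inventory for one host and is treated as a
    // snapshot that corrects for any lost individual events.
    Heartbeat { host_id: String, actor_counts: HashMap<String, u32> },
}

#[derive(Debug, Default)]
struct ActorAggregate {
    // (host_id, actor) -> instance count
    instances: HashMap<(String, String), u32>,
}

impl ActorAggregate {
    fn apply(&mut self, event: LatticeEvent) {
        match event {
            LatticeEvent::ActorStarted { host_id, actor } => {
                *self.instances.entry((host_id, actor)).or_insert(0) += 1;
            }
            LatticeEvent::ActorStopped { host_id, actor } => {
                if let Some(count) = self.instances.get_mut(&(host_id, actor)) {
                    *count = count.saturating_sub(1);
                }
            }
            LatticeEvent::Heartbeat { host_id, actor_counts } => {
                // The snapshot wins: replace whatever was computed for this host.
                self.instances.retain(|(h, _), _| h != &host_id);
                for (actor, count) in actor_counts {
                    self.instances.insert((host_id.clone(), actor), count);
                }
            }
        }
    }
}

fn main() {
    let mut agg = ActorAggregate::default();
    agg.apply(LatticeEvent::ActorStarted { host_id: "host-a".into(), actor: "ACTOR_REF".into() });
    agg.apply(LatticeEvent::ActorStopped { host_id: "host-a".into(), actor: "ACTOR_REF".into() });
    // A later heartbeat reports two instances, overriding the computed zero.
    agg.apply(LatticeEvent::Heartbeat {
        host_id: "host-a".into(),
        actor_counts: HashMap::from([("ACTOR_REF".to_string(), 2)]),
    });
    println!("{:?}", agg.instances);
}
```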
Deployment Definitions (Desired State)
A deployment definition contains the following key pieces of information:
- A unique name
- A list of desired actors and their instance counts
- A description of the desired deployment spread of those actors
- A list of desired capability providers and their instance counts
- A description of the desired deployment spread of those providers
Deployments explicitly do not contain link definitions. Link definitions are entities created by operations for runtime actor configuration, and that configuration persists regardless of the number of instances of entities like actors and providers.
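As a hypothetical concrete shape for these definitions, the Rust sketch below covers the fields listed above; all names, types, and the example OCI reference are illustrative, not a proposed schema.

```rust
/// Illustrative shape of a deployment definition; not a committed schema.
#[derive(Debug)]
struct DeploymentDefinition {
    /// Unique name; publishing again under the same name creates a new revision.
    name: String,
    actors: Vec<ManagedEntity>,
    providers: Vec<ManagedEntity>,
}

#[derive(Debug)]
struct ManagedEntity {
    /// OCI reference or public key of the actor/provider.
    reference: String,
    /// Total desired instance count across the lattice.
    instances: u32,
    /// How those instances should be spread across hosts; ratios sum to 1.0.
    spread: Vec<SpreadRule>,
}

#[derive(Debug)]
struct SpreadRule {
    ratio: f64,
    /// Host label/value constraints, e.g. ("zone", "zone1").
    requirements: Vec<(String, String)>,
}

fn main() {
    // The HTTP server example from the "Deployment Spreads" section below,
    // using a placeholder registry reference.
    let deployment = DeploymentDefinition {
        name: "my-app".to_string(),
        actors: vec![],
        providers: vec![ManagedEntity {
            reference: "registry.example.com/httpserver:0.1.0".to_string(),
            instances: 3,
            spread: vec![
                SpreadRule { ratio: 0.66, requirements: vec![("zone".into(), "zone1".into())] },
                SpreadRule { ratio: 0.34, requirements: vec![("zone".into(), "zone2".into())] },
            ],
        }],
    };
    println!("{deployment:#?}");
}
```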
Deployment Spreads
A deployment spread is a definition for how a deployment should spread a given entity across the available hosts. Spreads are defined by ratios and a set of label-value pairs that define the constraint for that ratio. For example, if you wanted to ensure that your lattice always had 3 copies of the HTTP Server Provider running, and that 66% of those instances must be running on hosts tagged with `zone1`, and 33% of them must be running on hosts tagged with `zone2` (a weighted failover scenario), you might define your spread as follows:
- 0.66 - `zone` == `zone1`
- 0.34 - `zone` == `zone2`
With such a spread definition, and an instance requirement of 3, the lattice controller would always attempt to ensure that you have 1 instance of the provider running in `zone2` and 2 instances of the provider running in `zone1`. If the lattice controller determined that the available resources in the lattice don't support the desired spread (e.g. there's only one host listed with the `zone:zone1` label), then a fallback policy would be used to either choose a random host (which could also fail if that host is already running that provider) or give up and only partially satisfy the deployment, which would leave the deployment in a failed state.
The various ratios within a spread must always sum to 1. We may come up with more complex ways of defining spreads in the future, but the core definition of a spread is a ratio applied to a set of constraints. Spreads can be applied to the deployment definitions for actors or providers, though actor spreads are more easily satisfied since more than one instance of the same actor can run within a single host.
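The arithmetic implied above (3 instances split 0.66/0.34 becoming 2 and 1) might be computed along the following lines; the floor-then-distribute-remainder rounding shown here is an assumption for illustration, not specified behavior.

```rust
/// Distribute `total` instances across spread ratios, handing any remainder
/// left by rounding to the earliest rules. Illustrative only; the actual
/// rounding policy is not specified by this RFC.
fn spread_counts(total: u32, ratios: &[f64]) -> Vec<u32> {
    let mut counts: Vec<u32> = ratios
        .iter()
        .map(|r| (r * total as f64).floor() as u32)
        .collect();
    let mut remainder = total.saturating_sub(counts.iter().sum::<u32>());
    for count in counts.iter_mut() {
        if remainder == 0 {
            break;
        }
        *count += 1;
        remainder -= 1;
    }
    counts
}

fn main() {
    // 3 instances, 66% in zone1 and 34% in zone2 -> [2, 1]
    assert_eq!(spread_counts(3, &[0.66, 0.34]), vec![2, 1]);
    println!("{:?}", spread_counts(3, &[0.66, 0.34]));
}
```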
Scope of a Deployment
Multiple deployments will co-exist within a single lattice, and the resources used by those deployments can co-exist with "unmanaged" resources. Actors and providers that are started manually will be left alone by the lattice controller (with some potential exceptions, discussed next).
As mentioned, a deployment describes a set of actors and providers. Actors and providers that are deployed by a lattice controller will be tagged as such, and the controller will manage only those resources that are part of a deployment.
Dealing with Overlapping Resources
As a general rule, deployments will never interfere with resources that either belong to different deployments or are unmanaged. However, there are a few exceptions that stem from the fact that capability providers are designed to be reused by many actors and a single host cannot contain multiple instances of the same provider (at least not until we get WASI-based providers). These exceptions are:
- If, in order to satisfy a provider spread, the only option is to reuse an existing provider from another deployment, the controller will tag it for the additional deployment. Note that this does not violate multi-tenancy rules or security.
- If, in order to satisfy a provider spread, the controller wants to reduce the number of running instances, it will first look for claimed providers from other deployments and simply un-tag those before requesting that any providers be terminated.
- The controller will never request the termination of a capability provider unless it is the sole manager of that provider (i.e. the provider is managed and is tagged only with the given deployment)
- If the reason that a provider is solely tagged with a given deployment is that it was previously claimed by that deployment, when instance reduction is required, the provider will be untagged rather than terminated. In other words, a previously unmanaged provider that was claimed to satisfy a deployment will be unclaimed, not terminated, when the deployment is done with it.
Controllers will never claim or reuse unmanaged actors or actors tagged with a different deployment.
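To make these rules concrete, here is a hedged Rust sketch of the scale-down decision: release a claim rather than terminate unless this deployment both started the provider and is its sole manager. The types and field names are hypothetical, not part of the controller's design.

```rust
// Illustrative model of the scale-down rule; types and fields are hypothetical.
struct ProviderInstance {
    /// Deployments currently claiming (tagged on) this provider instance.
    claimed_by: Vec<String>,
    /// The deployment that originally started it, if any (None = started manually).
    started_by: Option<String>,
}

#[derive(Debug)]
enum ScaleDownAction {
    /// Remove this deployment's tag; leave the provider running.
    Untag,
    /// Request termination through the lattice control interface.
    Terminate,
    /// Not claimed by this deployment; not ours to touch.
    LeaveAlone,
}

fn scale_down_action(p: &ProviderInstance, deployment: &str) -> ScaleDownAction {
    if !p.claimed_by.iter().any(|d| d == deployment) {
        return ScaleDownAction::LeaveAlone;
    }
    let sole_manager = p.claimed_by.len() == 1;
    let we_started_it = p.started_by.as_deref() == Some(deployment);
    if sole_manager && we_started_it {
        ScaleDownAction::Terminate
    } else {
        // Other deployments still claim it, or it was claimed from elsewhere:
        // release the claim instead of terminating.
        ScaleDownAction::Untag
    }
}

fn main() {
    // A provider started by deployment "A" and later also claimed by "B":
    // when "B" scales down, it should untag rather than terminate.
    let shared = ProviderInstance {
        claimed_by: vec!["A".to_string(), "B".to_string()],
        started_by: Some("A".to_string()),
    };
    println!("{:?}", scale_down_action(&shared, "B")); // Untag
}
```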
Managing Scale (Large Lattices/Many Deployments)
If the lattice controller is ever deployed into a lattice where the rate of change events is so high that it cannot keep up with computing state changes and issuing the corresponding imperative commands, then one of two solutions is recommended:
- Split the lattice. If the throughput of the event stream from a single lattice is so high that a single lattice controller cannot keep up with it, then the lattice itself might be too large. Examine the communication patterns between components and see if a lattice split makes sense.
- Scale the controller. The lattice controller is designed to be run as a BEAM cluster. In the 99% case, it's essentially deployed as a singleton, or a "cluster of 1", but if lattice volume becomes incredibly high, one solution might be to simply run more clustered instances of the lattice controller. The controller will distribute deployment management components across the cluster and will never issue duplicate instructions to the lattice control interface API as a result of being clustered. In other words, it's safe to scale the lattice controller and it will still manage its work queues in an idempotent fashion.
Interacting with the Lattice Controller
The lattice controller will expose an API through which deployment definitions can be submitted. Deployments are immutable, and as such, every time you publish a new deployment with a given name, that deployment is given a monotonically increasing revision number.
You can also use the lattice controller's API to query the observed state of the lattice, which can be very handy when building third-party tooling or simply performing routine troubleshooting tasks while doing development and testing.
Deployment Rollbacks
The lattice controller API will also support the ability to roll back a deployment, which is done by specifying the revision number to roll back to.
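As a rough illustration of that surface area, a hypothetical client-side trait is sketched below; none of these names, signatures, or types are committed, and the transport is left abstract.

```rust
// Hypothetical client-side view of the lattice controller API; the names,
// signatures, and types below are illustrative only, not a committed interface.
type Revision = u64;

struct DeploymentDefinition; // placeholder for the definition described earlier
struct LatticeState;         // placeholder for the controller's observed state
struct ApiError;             // placeholder error type

trait LatticeControllerClient {
    /// Submit a deployment definition. Deployments are immutable, so
    /// re-publishing under an existing name yields a new, higher revision.
    fn put_deployment(&self, definition: DeploymentDefinition) -> Result<Revision, ApiError>;

    /// Query the controller's observed state of the lattice (useful for
    /// third-party tooling and troubleshooting).
    fn observed_state(&self) -> Result<LatticeState, ApiError>;

    /// Roll a named deployment back to a previously published revision.
    fn rollback(&self, name: &str, revision: Revision) -> Result<(), ApiError>;
}
```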
Drawbacks
The following is a list of identified drawbacks to implementing this RFC as described herein.
Is this just k8s for wasm?
Very early on, wasmCloud drew a line in the sand saying that it would not be responsible for scheduling the host process. It is entirely up to the wasmCloud consumer to figure out how and where and when they would like to start the host process. Once the process is running, we do indeed provide a way to manipulate the contents of the host (through the lattice control interface API).
One possible drawback that comes to mind: are we just re-inventing Kubernetes, but for wasm? While Kubernetes schedules far more resource types than wasmCloud deployments do (deployments only schedule two), a devil's advocate could suggest that we're reinventing the wheel here.
Rationale and Alternatives
The main rationale for this approach is that we want to combine our desire for declarative, simple, "self-managing" deployments with our desire to remain compatible with (rather than compete against) the existing ecosystem of tooling, and to keep this capability available to the entire CNCF ecosystem without lock-in. The following is an itemization of some alternative approaches that were considered.
Tight Coupling with Kubernetes
One alternative to building our own lattice controller would be to simply bundle up all of the state observation, reconciliation, and deployment storage logic and stuff it inside a Kubernetes operator. While this approach might leverage more of the existing functionality of Kubernetes as a platform, it would also prevent this capability from being used by anything other than Kubernetes. Does our desire to enable first- and third-party tooling to manage declarative deployments offset the "isn't this just k8s for wasm" argument? We feel that we can accomplish much more by creating the lattice controller and then exposing the controller's API to thin veneer tooling like a Kubernetes operator than by embedding all of the functionality inside the black box of a "fat operator".
> Multiple deployments will co-exist within a single lattice
YES! We identified that we would need this if we wanted to run a wasmCloud operator in each Kubernetes cluster. I was hoping that this would be included.
> Deployment Spreads
How should this interact with external services like HTTP load balancers? To simplify my life, I would like to be able to:
- create a bunch of wasmCloud pods tagged `HOST_INTENT=httpserver`
- dump a load balancer in front of them with a healthcheck
- say to the lattice controller "please put exactly one HTTP server capability provider on each host that has been tagged with `HOST_INTENT=httpserver`"
- not have Kubernetes kill my pods at random because of failed HTTP health checks, because they're not running the capability provider.
> If the reason that a provider is solely tagged with a given deployment is that it was previously claimed by that deployment, when instance reduction is required, the provider will be untagged rather than terminated. In other words, a previously unmanaged provider that was claimed to satisfy a deployment will be unclaimed, not terminated, when the deployment is done with it.
I haven't quite wrapped my head around overlapping resources yet (I hadn't even considered it as a possibility when thinking through the requirements for a `wash ctl apply`).
As a worked example, what happens here?
- Deployment A.1 => 10 providers plz => 10 providers claimed by A
- Deployment B.1 => 10 providers plz => 10 providers claimed by A and tagged by B
- Deployment A.2 => 0 providers plz => ??? 10 providers claimed by nobody and tagged by B
- Deployment B.2 => 0 providers plz => ??? 10 zombie providers?
> Deployments explicitly do not contain link definitions. Link definitions are entities created by operations for runtime actor configuration, and that configuration persists regardless of the number of instances of entities like actors and providers.
Link definitions are the thing that killed us repeatedly when building https://github.com/redbadger/wasmcloud-k8s-demo. I recognise that things are improved in -otp, but it feels like we will still need a declarative way to specify links.
Could the resource tagging strategy that's being proposed for capability providers also be applied to links?
I also have an impression in my head that links can be constrained to capability providers with a given set of tags, but I suspect I might have dreamed that one. I think you want to be able to specify these two types of link:
- make a link between these actors that I control and these capability providers that I control
- make a link between all actors of this type and all capability providers of this type (would probably need some kind of reference counting, so that it gets removed when the last controller asking for it goes away?)
I also realise that it's the things you don't build that make a product successful, so I'm happy for this bit to be dumped into the river of dreams if it helps to get the rest out of the door more quickly.
- that looks exactly like what we're intending to support and is precisely the scenario I had in mind when using host labels in lattice auctions!
- essentially, providers are started when none match a deployment spec. If the spread definition for providers requires more be started, more will be started. If the lattice essentially runs out of room for new providers, then (and only then) will the lattice controller look for reusable providers to claim. Claims are relinquished on providers as the provider instance count drops or the spread specification changes. Only providers that were started by the controller for a given deployment are terminated by that deployment.
> but it feels like we will still need a declarative way to specify links.
Link definitions are already declarative... they're actually one of the few things you can do with the control interface that is (lattice) global and declarative. For now, until we find a need to change it, the link definition is defined (keyed) as "apply this set of configuration values between all actor instances with public key X and provider instances with contract ID y and link name z".
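A rough Rust sketch of that keying, purely for illustration (the actual control interface types may differ):

```rust
use std::collections::HashMap;

// Illustrative model of how a link definition is keyed and what it carries;
// the real control interface types may differ.
struct LinkDefinition {
    /// "actor instances with public key X"
    actor_public_key: String,
    /// "provider instances with contract ID y", e.g. "wasmcloud:httpserver"
    contract_id: String,
    /// "... and link name z", e.g. "default"
    link_name: String,
    /// The configuration values applied across that pairing, lattice-wide.
    values: HashMap<String, String>,
}
```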
I think your comment implies that we want to enhance the `spread` configuration so that you can easily specify not just a ratio by which instances are pulled from a fixed instance count, but a way to define a spread that dictates that you want `1 per matching host`, regardless of the number of instances that means. This doesn't make much sense for actors, but, for server-class capability providers, makes total sense.
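One speculative way to model that enhancement alongside the existing ratio-based spread (the names here are hypothetical, and this is not part of the RFC):

```rust
// Speculative extension of the spread idea discussed above; not part of the RFC.
enum SpreadTarget {
    /// A fraction of a fixed total instance count (the spread as proposed).
    Ratio(f64),
    /// Exactly this many instances on every host matching the rule's
    /// constraints, however many such hosts exist (useful for
    /// server-class capability providers).
    PerMatchingHost(u32),
}
```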
> not have Kubernetes kill my pods at random because of failed HTTP health checks, because they're not running the capability provider.
The OTP version has `/livez` and `/readyz` endpoints, both of which reflect the liveness and readiness of the host, and so will not get your hosts/pods killed because they're not running the HTTP provider.
I am considering using the Open Application Model as a means of declaring application deployments to the lattice controller. Would love some feedback on whether folks think this is a suitable use case for OAM or not. OAM is supported by Microsoft and Alibaba Cloud.
This is implemented by https://github.com/wasmCloud/wadm, and further iteration will be tracked there.