Support Declarative Deployments in Lattice Mode
autodidaddict opened this issue · 1 comment
This issue cannot be completed until the following pre-requisites have been met:
- #54 - Trapping signals for graceful shutdown (and emitting appropriate events)
- #49 - Emitting events on the lattice for actor, host, and capability activities
- #62 - Support for the waSCC host to be given an arbitrary set of annotations at startup. These annotations will be discoverable via inventory probes and used to resolve affinity/anti-affinity in future features.
- #63 - Create a control plane protocol used by lattice that supports a schedule auction workflow for capability providers and actors, allowing them to be deployed via `latticectl` or via API by interacting with the appropriate lattice subjects/topics
- #66 - Create the `vinculum`, a process that is responsible for managing and comparing desired state (fed by `apply` of different entity types, e.g. `Deployment` and `BindingSet`) against observed state and taking corrective action. This is an autonomous agent that only works on a lattice.
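To make the vinculum's job concrete, here is a minimal sketch of the desired-vs-observed comparison it would perform. All names and types here are illustrative assumptions, not the actual lattice API; the real vinculum would operate over lattice control-plane subjects rather than in-memory maps.

```rust
use std::collections::HashMap;

/// Hypothetical state shape: a map from entity id to the number of
/// instances that should be (desired) or are (observed) running.
type LatticeState = HashMap<String, u32>;

/// Corrective actions the vinculum could emit to close the gap.
#[derive(Debug, PartialEq)]
enum Action {
    Start { id: String, count: u32 },
    Stop { id: String, count: u32 },
}

/// Compare desired state against observed state and produce the
/// actions needed to converge the two. Illustrative sketch only.
fn reconcile(desired: &LatticeState, observed: &LatticeState) -> Vec<Action> {
    let mut actions = Vec::new();
    for (id, want) in desired {
        let have = observed.get(id).copied().unwrap_or(0);
        if *want > have {
            actions.push(Action::Start { id: id.clone(), count: want - have });
        } else if have > *want {
            actions.push(Action::Stop { id: id.clone(), count: have - *want });
        }
    }
    // Anything observed but no longer desired should be stopped entirely.
    for (id, have) in observed {
        if !desired.contains_key(id) {
            actions.push(Action::Stop { id: id.clone(), count: *have });
        }
    }
    actions
}
```

The key property is that the function is a pure diff: applying the same desired state repeatedly produces no further actions once observed state has converged.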
With support for declarative deployments, any waSCC host in the lattice should be able to start up empty, without any capability providers or actors loaded. These hosts would then await scheduling instructions from the lattice to coordinate the deployment and activation of capability providers and actors from a central repository (a gantry instance).
Deployment Workflow
A developer/operator may start with a collection of unused host resources, which could include any of the following in fresh-from-start (no "user" processes running) mode:
- Cloud VMs
- Kubernetes Nodes
- Developer Laptop(s)
- Raspberry Pi
- Multiple hosts spanning multiple Clouds (AWS, Azure, DO, Google, etc)
- Constrained Devices that can still run the waSCC host (today that means `aarch64-unknown-linux-gnu`, but we will hopefully add more in the future)
The user can then start a NATS process (leaf node or isolated) and then, through whatever tooling they choose, start a waSCC host process on all of their infrastructure hosts, configured to point at the NATS server that carries the lattice. At this point, they may issue the following command:
```
latticectl apply mydeployment.yaml
```
This should look eerily familiar to Kubernetes developers and, if we do our jobs right, it should be significantly easier and impose little to no friction on the users. The following is a sample deployment file:
```yaml
kind: Deployment
metadata:
  name: amazing-deployment
  labels:
    app: amazesauce-the-sequel
spec:
  actors:
    - id: MADJKE....DASKJ2334Z
  providers:
    - id: wascc:messaging
      binding_name: default
      vendor_id: nats
      affinity_rules: []
    - id: iot:ledcontroller
      binding_name: default
      vendor_id: wascc
      affinity_rules: ["leds=12"]
```
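The `affinity_rules` above would be evaluated against the arbitrary annotations each host announces at startup (see the annotation prerequisite). A minimal sketch of that matching, under the assumption that rules use a simple `key=value` syntax as in the `leds=12` example:

```rust
use std::collections::HashMap;

/// A host's annotations, as announced at startup and discoverable via
/// inventory probes. The shape here is an illustrative assumption.
type Annotations = HashMap<String, String>;

/// A host is eligible for a provider only if every `key=value`
/// affinity rule matches one of the host's annotations.
fn host_matches(rules: &[&str], annotations: &Annotations) -> bool {
    rules.iter().all(|rule| match rule.split_once('=') {
        Some((key, value)) => annotations.get(key).map(String::as_str) == Some(value),
        None => false, // malformed rule: match nothing
    })
}
```

An empty rule list matches every host, which is why the `wascc:messaging` provider above can be scheduled anywhere while `iot:ledcontroller` is confined to hosts annotated `leds=12`.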
Note that bindings do not have to be created at the same time as the deployment. Bindings and deployments are durable, first-class entities within a lattice as of the completion of this feature. Here's what a binding set might look like:
```yaml
kind: BindingSet
metadata:
  name: amazing-bindings
  labels:
    app: amazesauce-the-sequel
spec:
  bindings:
    - capability_id: wascc:messaging
      binding: default
      actor: MADJKE...DASKJ2334Z
      vendor_id: nats
      values:
        SUBSCRIPTION: foo.bar.baz
        VALUE4: ${ENV_VAL4:FOURTYTWO}
```
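The `${ENV_VAL4:FOURTYTWO}` value suggests environment-variable interpolation with a fallback default. A sketch of that expansion, assuming the `${VAR:default}` syntax shown in the sample (the exact syntax and where expansion happens are open design questions, not settled API):

```rust
use std::env;

/// Expand a `${VAR:default}` placeholder in a binding-set value.
/// If VAR is set in the host environment its value is used;
/// otherwise the default after the `:` is used. Any other string
/// passes through unchanged. Assumed semantics, for illustration.
fn expand(value: &str) -> String {
    if let Some(inner) = value.strip_prefix("${").and_then(|s| s.strip_suffix('}')) {
        let (var, default) = match inner.split_once(':') {
            Some((v, d)) => (v, d),
            None => (inner, ""),
        };
        env::var(var).unwrap_or_else(|_| default.to_string())
    } else {
        value.to_string()
    }
}
```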
Once these deployments and binding sets are `apply`'d, the process responsible for maintaining observed and desired state will update the desired state accordingly. Because they are `apply` operations, all of these declarations are idempotent: applying a `BindingSet` or `Deployment` with the same name will overwrite any previously existing entity of the same name, update desired state accordingly, and then allow the `vinculum` to take whatever actions are necessary to close the gap between desired and observed state.
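The overwrite-by-name semantics that makes `apply` idempotent can be sketched as a store keyed by entity kind and name. The store and its field names below are illustrative assumptions, not the actual lattice state model:

```rust
use std::collections::HashMap;

/// Hypothetical desired-state store keyed by (kind, name).
/// Applying a Deployment or BindingSet with the same name replaces
/// the previous entry rather than adding a duplicate, which is what
/// makes `apply` idempotent.
#[derive(Default)]
struct DesiredState {
    entities: HashMap<(String, String), String>, // (kind, name) -> spec
}

impl DesiredState {
    fn apply(&mut self, kind: &str, name: &str, spec: &str) {
        self.entities
            .insert((kind.to_string(), name.to_string()), spec.to_string());
    }
}
```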
The lattice CLI should be updated to add functions for querying the binding sets and deployments within a lattice:
```
latticectl get deployments
latticectl get bindings/bindingsets
```
These commands would display not only the desired state but also the observed state, so developers and operators can watch the gap being closed while work is in progress, much as they would watch the number of running pods after creating a deployment in Kubernetes.
Note that the bulk of the work for this issue may be done in the pre-requisites and in other repositories like Gantry and lattice-client, but this issue should serve as the main set of high-level acceptance criteria for this feature, since GitHub doesn't support the notion of an epic.
TODO: evaluate the use of OAM (the Open Application Model) as a standardized, open way of declaring actor deployments, capability provider deployments, and bindings (configurations) between actors and providers.