Sample Kubernetes controller for Flink resources. The documentation here currently only covers deployment with a Custom Resource Definition (CRD) targeting Flink's application mode with native Kubernetes integration. The main use cases are:
- Streaming Flink applications that desire savepoint management during redeployment, e.g. after upgrades or scaling.
- Batch Flink applications that may desire different levels of parallelism each time they run.
Everything here is a proof of concept and has been tested against Kubernetes 1.21; the included unit tests are mostly just sanity checks.
Note: the project can be opened in IntelliJ by opening the root build.gradle as a Project.
It currently builds with Java 17 as configured here and here.
For a basic build, simply run:

```shell
./gradlew build
```
The YAML template is here, with the corresponding POJO here. Note that the embedded Kubernetes resources use the schema from the official OpenAPI spec, see also the flattening helper.
Note that the CRD is not in Helm's special crds folder.
The chart with the controller must be installed before any consumers of the CRD,
so it cannot be a sub-chart of any other chart that intends to use the CRD.
The flow is as follows:
- A handler receives resource updates from Kubernetes and delegates to a reconciler. One handler instance will manage either one or all namespaces; more on this in the multi-tenancy section below.
- The reconciler manages phasers for the different resources and basic cleanup.
- The phaser decides which phases to execute and in which order.
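The handler → reconciler → phaser flow above could be sketched roughly as follows. All names here (ResourceHandler, Reconciler, Phaser, Phase) are illustrative placeholders, not the project's actual classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class FlowSketch {
    // The phaser decides which phases to execute and in which order.
    interface Phase {
        String name();
        void execute(String resource);
    }

    // Small helper to build a named phase from a lambda.
    static Phase phase(String name, Consumer<String> action) {
        return new Phase() {
            public String name() { return name; }
            public void execute(String resource) { action.accept(resource); }
        };
    }

    static class Phaser {
        // Runs the given phases in order and returns the names of the executed phases.
        List<String> run(String resource, List<Phase> phases) {
            List<String> executed = new ArrayList<>();
            for (Phase p : phases) {
                p.execute(resource);
                executed.add(p.name());
            }
            return executed;
        }
    }

    // The reconciler manages phasers for the different resources and basic cleanup.
    static class Reconciler {
        private final Phaser phaser = new Phaser();
        List<String> reconcile(String resource) {
            // A real reconciler would pick phases based on the resource's current state.
            return phaser.run(resource, List.of(
                phase("validate", r -> System.out.println("validating " + r)),
                phase("deploy", r -> System.out.println("deploying " + r))));
        }
    }

    // The handler receives resource updates from Kubernetes and delegates to a reconciler.
    static class ResourceHandler {
        private final Reconciler reconciler = new Reconciler();
        List<String> onUpdate(String resource) {
            return reconciler.reconcile(resource);
        }
    }

    public static void main(String[] args) {
        System.out.println("executed: " + new ResourceHandler().onUpdate("flink-app-1"));
    }
}
```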
High availability of the controller(s) is supported and the default number of replicas is 2.
The actual deployment is done by delegating to the classes from flink-kubernetes. However, they are shadowed in the flink-kubernetes-shadow module so that a different version of Fabric8's Kubernetes Client can be used for the controller.
Note that the controller communicates with Flink's job manager via the official client (to stop/cancel jobs or fetch job status), and it assumes that, if TLS is enabled for REST communication in the Flink cluster, the controller gets Flink's CA certificate injected in a trust store, and that Flink will trust the controller's certificate if needed; see getDescriptorWithTlsIfNeeded in FlorkUtils and this sample script.
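As a rough illustration of the trust-store assumption, the sketch below builds an SSLContext from an injected trust store file. The path and password are hypothetical and only for illustration; they are not the project's actual configuration:

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;
import java.io.FileInputStream;
import java.security.KeyStore;

public class TlsSketch {
    // Builds an SSLContext that trusts the CA certificates found in the given
    // trust store, e.g. one injected into the controller's pod at a known path.
    static SSLContext contextFromTrustStore(String path, char[] password) throws Exception {
        KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
        try (FileInputStream in = new FileInputStream(path)) {
            trustStore.load(in, password);
        }
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, tmf.getTrustManagers(), null);
        return ctx;
    }
}
```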
A class for resource validation is available. The sample Helm chart takes this into account.
A very simple multi-tenancy model with Kubernetes Namespace granularity is supported. During installation, the user can specify different tenants, each of which can manage one or more namespaces. Each tenant gets its own Kubernetes Deployment with a corresponding ReplicaSet (2 replicas each by default), each of which only processes resources from the namespace(s) it manages. Alternatively, a single tenant can manage all namespaces with the special namespace name *.
These tenants and their namespaces are specified via Helm values under deployment.tenants, for example:
```yaml
deployment:
  tenants:
    - id: first
      namespaces:
        - foo
    - id: second
      namespaces:
        - bar
        - baz
```
All templates from the sample Helm chart take this into account. In particular, the RBAC definition will bind different Service Accounts to specific namespaces, so they can only manage resources there.
A caveat: the validation step described above can be performed by any controller instance regardless of tenant. It is not possible to constrain validation to specific instances, because that would allow the creation of resources that skip validation.
To deploy a Flink cluster, three resources are prepared locally by the controller: flink-conf.yaml and two pod templates, for the job and task manager(s); see prepareConfFilesFromSpec here.
Before writing the corresponding files to (local, ephemeral) disk, the code searches for decorators using Java's ServiceLoader facilities. Thus, a consumer could take a base container image and extend it with custom logic simply by adding jars with implementations of the decorator interfaces.
The decorators get an instance of florkConf which, since the CRD specifies x-kubernetes-preserve-unknown-fields: true for it, may contain custom properties and/or nested objects.
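The ServiceLoader-based decorator lookup can be illustrated with a small sketch. The ConfDecorator interface and applyDecorators helper below are hypothetical names for illustration, not the project's actual decorator interfaces:

```java
import java.util.Map;
import java.util.ServiceLoader;

public class DecoratorSketch {
    // A consumer would implement an interface like this and register the
    // implementation under META-INF/services/ in a jar added to the image.
    public interface ConfDecorator {
        Map<String, String> decorate(Map<String, String> flinkConf);
    }

    // Before the configuration files are written to disk, every decorator
    // discovered on the classpath gets a chance to adjust the configuration.
    static Map<String, String> applyDecorators(Map<String, String> conf) {
        for (ConfDecorator d : ServiceLoader.load(ConfDecorator.class)) {
            conf = d.decorate(conf);
        }
        return conf;
    }
}
```

With no implementations registered, the configuration passes through unchanged; each jar that registers an implementation adds one decoration step.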
Some helper classes for IoC are included in the flork-controller-ioc module.
For example, the REST controller for validation endpoints can be found here, and the Kubernetes controller here.
A sample module with bindings for Spring Boot is available in the microservice module.
A corresponding Dockerfile is also provided.
Docker's context can be prepared with:

```shell
./gradlew build setUpDockerContext
```

It will be set up in build/docker/. The image can then be built with something like:

```shell
docker build ... -f docker/Dockerfile build/docker
```
For example, if you have a minikube instance with the registry addon, you could use these commands:

```shell
docker build -t "$(minikube ip):5000/test/itom-flork:latest" -f docker/Dockerfile build/docker
docker push "$(minikube ip):5000/test/itom-flork:latest"
```
In addition to what has been mentioned above, the Helm chart includes some sample certificates for the webhook endpoints, but they are only valid if you install in a namespace called flork.
You could install with a command like:

```shell
helm -n flork install flork helm/flork --set images.florkService.imageTag=latest
```
If, for whatever reason, you build newer images of the controller, change the tag to something like latest.1 and adjust the helm call accordingly.
See the kubernetes folder.
If you're using minikube with the registry addon as mentioned above, push the Flink image to the local registry as well; from Kubernetes' point of view in minikube, the local registry is accessible at localhost:5000.
- Ingress.
- Java Operator SDK; see here.