vmware-archive/kubeless

Allow arbitrary Kubernetes resources to be deployed with a Function

andrascz opened this issue · 12 comments

Is this a BUG REPORT or FEATURE REQUEST?: feature request

What happened: To my knowledge there is currently no way to define a PodDisruptionBudget for a Function object.

What you expected to happen: Be able to define a PodDisruptionBudget for a Function either in the Function object or in the ConfigMap.
Provide a method to define auxiliary Kubernetes resources for a Function, either with a default or based on an arbitrary number of defined base profiles, which could be overridden per Function.
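For context, today the closest workaround is to create the PDB by hand next to the Function. A minimal sketch, assuming the function's pods carry a `function: <name>` label (verify the labels Kubeless actually puts on your pods):

```yaml
# Standalone PodDisruptionBudget for a function's pods.
# The selector label is an assumption; adjust to match your cluster.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: foo-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      function: foo
```

The drawback is exactly what this issue describes: the PDB lives outside the Function's lifecycle and has to be managed separately.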

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:
A potential way of doing this is deploying functions as Helm charts. This would allow Kubeless to decouple itself from the Kubernetes API with regard to Function resource deployments; see #1197 for the benefits of doing that.
According to the Helm docs, integrating with Helm should not be a big deal.

Environment:

  • Kubernetes version (use kubectl version):
  • Kubeless version (use kubeless version):
  • Cloud provider or physical cluster:

Going further along this idea, would it be even more beneficial/flexible if someone could also define other resources, like an Istio VirtualService or DestinationRule, for every function?
So basically defining a Function manifest template in the ConfigMap which would include required resources like the Service/Deployment and arbitrary ones like an HPA, PDB, Istio VirtualService, DestinationRule, etc.,
and this manifest template could be overridden from the Function resource. Some templating could be done too, like in Helm charts.
Integrating helm to implement this could be a potential option.


Using a Helm chart is a good way.
I use a Helm chart to deploy the function with an Istio Gateway and VirtualService :)

I use a Helm chart to deploy the function with an Istio Gateway and VirtualService :)

Sure, that is also a way of doing this. The question is: which is the better UX?

  1. leaving the creation of a Helm chart to the user: this seems the leaner approach and is possible as of now, but it requires Helm skills.
  2. providing an interface for that through the Function resource and the ConfigMap (or maybe a ConfigMap separate from kubeless-config; there are many ways to skin this cat), and using Helm to deploy resources instead of doing it directly: using Helm charts via a Helm client library instead of homebrew deployment code could simplify function deployment from the Kubeless point of view, and detach Kubeless from the Kubernetes API; see #1197 on why that could matter.
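For option 1, a user-maintained chart can be quite small: a Chart.yaml plus templated manifests that bundle the Function with its auxiliary resources. A sketch, shown as one YAML stream for brevity (in a real chart these are separate files; all names and values here are hypothetical):

```yaml
# Chart.yaml -- a hypothetical per-function chart
apiVersion: v2
name: foo-function
version: 0.1.0
---
# templates/function.yaml -- the Function resource itself, templated
apiVersion: kubeless.io/v1beta1
kind: Function
metadata:
  name: {{ .Release.Name }}
spec:
  runtime: {{ .Values.runtime }}
  handler: {{ .Values.handler }}
---
# templates/pdb.yaml -- an auxiliary resource bundled with the function
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: {{ .Release.Name }}-pdb
spec:
  minAvailable: {{ .Values.pdb.minAvailable }}
  selector:
    matchLabels:
      function: {{ .Release.Name }}
```

This keeps the Function and its operational resources in one release, at the cost of requiring Helm skills from the user.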

On a side note, it would also be great if Kubeless itself could be installed via a Helm chart. For now I have written a script which converts the released manifest YAML into a chart. Ksonnet is dead anyway.

It seems the Helm integration would not be too hard: https://helm.sh/docs/topics/advanced/#go-sdk

I updated the description of this Issue to reflect the further ideas mentioned here. I will raise separate issues for Helm chart deployment of Kubeless and other related things.

My suggestion would be to implement controllers per target feature. Let's say, if you want to be able to extend Kubeless and be able to create PodDisruptionBudgets, you just need to create a K8s controller, listen for Function objects, and if a function contains a specific key, let's say podDisruptionBudget, create the specific Kubernetes resources.

This is the way Kubeless trigger controllers are implemented for example (with the difference that Triggers have their own CRDs).

Sure, trigger controllers are doing what you are suggesting. But HPA is implemented inside the function controller, so it seems this is not so clear cut.
I do feel, though, that the triggers which just create other resources (HTTP and CronJob) could be moved into the Helm chart of a function. The deploying-functions-as-Helm-charts solution would be more flexible. Let's say the CronJob or the Ingress resource gains some new features in a later Kubernetes release; then those triggers would need to be updated to provide that feature. Sure, Kubeless can tell its users to implement a new controller and create the resources as needed utilizing those new features, but I think it would be way simpler for Kubeless to just be opaque about this.
I think even the kubeless CLI could remain functional, as the already available autoscale and HTTP/CronJob trigger commands could just change some Helm chart configuration to enable/disable those features and apply the diff to the Helm release.

I started using Kubernetes a year ago, so maybe I have a different point of view, am not really understanding the Kubernetes way, or have a very special use case. But for me a Function is a nano application: sure, the developer only needs to provide the code, but the operational concerns do not go away with that. The widespread use of Helm and other tools that handle resources together instead of one by one seems to back this.
Providing a way to bundle the function code together with that infrastructural/operational environment in the function controller seems more efficient to me than every organization which uses Kubeless replicating all the boilerplate of watching Functions and adding/updating/removing resources independently of the resources the function controller operates on.

I think there would be multiple benefits of this proposal:

  1. providing a way to support the wide range of infrastructure contexts in which a Function runs, like different service meshes
  2. less dependency on the Kubernetes API, which means less/no effort to support features released in the future
  3. coherent handling of a Function and its related resources.

I am not insisting on Helm; that is an implementation detail for me. I proposed it because I have worked with Helm and it seems able to do the job. I have never worked with Kustomize, but if this is achievable with it or with some other way/tool, I am all for it.

Sure, trigger controllers are doing what you are suggesting. But HPA is implemented inside the function controller, so it seems this is not so clear cut.

I agree, having HPA within the function-controller is not the cleanest, mostly because it doesn't scale. It should be possible to add new entities (like podDisruptionBudget) without modifying the function controller.

I do feel, though, that the triggers which just create other resources (HTTP and CronJob) could be moved into the Helm chart of a function.

I think I am not following what you are proposing. Maybe you can give an example? Are you suggesting that the user provides a Helm chart rather than a function, or that Kubeless generates a Helm chart?

Let's say the CronJob or the Ingress resource gains some new features in a later Kubernetes release; then those triggers would need to be updated to provide that feature.

Not necessarily. Depending on the implementation of the trigger, you can allow arbitrary input that is just forwarded to the target object (e.g. function.spec.ingress.spec --> ingress.spec).

  1. providing a way to support the wide range of infrastructure context in which a Function runs like different service meshes

I don't understand what you mean here :) If people need to define their own manifest with all the infra, Kubeless would not be helping that much.

  2. less dependency on the Kubernetes API which means less/no effort to support features released in the future

That's not necessarily a benefit. Kubeless is meant to be Kubernetes native, so anything that strays far from that statement is not aligned with the product.

  3. coherent handling of a Function and its related resources.

This is something that you can already get with Kubernetes resources, for example using OwnerReferences.
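To make the OwnerReferences point concrete: an auxiliary resource that names the Function as its owner is garbage-collected when the Function is deleted. A sketch, with illustrative names and a placeholder UID (the uid must be the live Function object's actual UID):

```yaml
# A PodDisruptionBudget owned by a Function. When the Function is
# deleted, the Kubernetes garbage collector removes this PDB too.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: foo-pdb
  ownerReferences:
    - apiVersion: kubeless.io/v1beta1
      kind: Function
      name: foo
      uid: 1234abcd-...   # placeholder; must match the Function's UID
spec:
  minAvailable: 1
  selector:
    matchLabels:
      function: foo
```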

Thanks for the responses, I am already learning new things here, like OwnerReferences.

It should be possible to add new entities (like podDisruptionBudget) without modifying the function controller.

HTTPTrigger and CronJobTrigger just indirectly create Ingress or CronJob resources. I do not see any difference between an HPA, a PDB, or an Ingress/CronJob resource: all of them are resources defined by the creator of the Function.
The same way you can define a PDB or HPA should allow you to define Ingress or CronJob resources. That would eliminate the need for the HTTP or CronJob triggers. For example, if someone uses Istio, then a VirtualService is the resource you need to expose a function to the outside world via HTTP. Kubeless should be agnostic about this, imho. It is easier to let the user define the resource type needed for this than to try to support all the different ways a Function could be exposed via HTTP in different environments.
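To make the Istio case concrete, exposing a function via HTTP would mean something like the following. Hostnames and the gateway name are illustrative; the destination is assumed to be the Service Kubeless creates for the function, which listens on port 8080:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: foo
spec:
  hosts:
    - foo.example.com
  gateways:
    - public-gateway        # hypothetical, pre-existing Istio Gateway
  http:
    - route:
        - destination:
            host: foo.default.svc.cluster.local
            port:
              number: 8080  # Kubeless function services listen on 8080
```

Neither the HTTPTrigger nor an Ingress resource can express this; only a user-supplied resource can.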

Maybe you can give an example?

Kubeless is a FaaS solution. Its goal is to hide all the operational complexities of implementing an RPC. Hiding does not mean they are not there. Currently Kubeless has an opinion about this operational complexity in the form of the deployed resources: Service, Deployment, etc. My proposal is about this opinion. I think it is fine to have a minimal and simplistic opinion to make it easy to try Kubeless out and to cover the simple use cases.
Regardless, the operational complexity is still there, and in my opinion Kubeless should allow more advanced users to tap into it. Small projects typically involve only developers who just want to have some functions run or to implement a serverless application. Somewhat bigger projects typically have someone who cares about functions, but separate people who concern themselves with operating those functions. Giving them a head start via Kubeless is nice, but those operational people will want more; whether they are traditional Ops teams or DevOps does not matter. They have a service mesh, they have their special way of enabling HTTP access to Services, and they want other extra stuff bound to a function. Sure, writing an extra Kubernetes resource controller which watches Functions and implements this is a possible way of solving the problem, but those people might not be developers.
There are two common FaaS architectures (1, 2), but there are other architectures as well:

  1. hybrid architecture: application components use functions internally, so both the inputs and the functions are defined by the developer.
  2. public serverless application: the developer defines the functions and the user defines the inputs.
  3. user-extended architecture: when a web application wants to give its users the option to extend its functionality, it might allow them to define functions which are called at different points of the business logic. In this case the functions are defined by the user and the inputs are defined by the developer.
  4. public FaaS (AWS Lambda, Azure Functions, Google Cloud Functions, etc.): in this case both inputs and functions are provided by the user.

Obviously Kubeless is not a public FaaS, where the only concerns of developers are operational ones (security, SLA and others), but I think in the other three architectures there are still operational concerns where developers might want some configuration possibilities.

Currently Kubeless allows its users to provide arbitrary function code, its dependencies, and the runtime where it can be executed. But there are limited possibilities to define the infrastructure in which it should be operated: one can add triggers, customize the Deployment, and provide some scaling, and that is it.

I am not sure about the implementation, but this proposal is mainly about adding that capability, which allows those users to address these concerns by bundling arbitrary resources with a Function. As I am using Helm for the same purpose for application deployments to a Kubernetes cluster, and a Function really is a nano application, it seems to me that Helm charts could be an implementation candidate. I do not know much about or have much experience with Kustomize, but that could also be a solution.

Are you suggesting that the user provides a Helm chart rather than a function or that Kubeless generate a Helm chart?

The user provides a Function as it is now, plus a separate resource collection (let's call it a bundle) that the function should be deployed with, the same way it provides the runtime for a function today. Side note: the runtimes could be custom resources too, instead of part of the ConfigMap.
From the Function object those bundle resources could be overridden, like what a user can do with the Deployment at the moment.
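A rough sketch of what such a bundle could look like. Everything here is hypothetical; no Bundle resource or `spec.bundle` field exists in Kubeless today:

```yaml
# Hypothetical Bundle resource: a named set of templated auxiliary
# resources that a Function can opt into, analogous to a runtime.
apiVersion: kubeless.io/v1alpha1   # hypothetical API group/version
kind: Bundle
metadata:
  name: pdb-defaults
spec:
  resources:
    - apiVersion: policy/v1beta1
      kind: PodDisruptionBudget
      metadata:
        name: "{{ .Function.Name }}-pdb"
      spec:
        minAvailable: 1
---
# A Function referencing the bundle; its values could be overridden
# from the Function, like the Deployment spec can be today.
apiVersion: kubeless.io/v1beta1
kind: Function
metadata:
  name: foo
spec:
  bundle: pdb-defaults   # hypothetical field
```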

Not necessarily. Depending on the implementation of the trigger, you can allow arbitrary input that is just forwarded to the target object (e.g. function.spec.ingress.spec --> ingress.spec).

In the case of Deployments this is not necessarily forwarded, see #1197. Kubeless serializes that spec into an internal object, so if the API moves forward, the new features are not supported until Kubeless updates its internals to support them.

I don't understand what you mean here :) If people need to define their own manifest with all the infra, Kubeless would not be helping that much.

See my explanation above about operational concerns and different FaaS architectures.

That's not necessary a benefit. Kubeless is meant to be Kubernetes native, so anything that goes far from that statement is not aligned with the product.

Although I do not view it as making Kubeless less Kubernetes native, more effort is then needed to keep Kubeless supporting the latest Kubernetes APIs/resources.

P.S.: I renamed the issue again to better reflect the gist of what I am trying to discuss here.

I have just found two projects which might be of interest in relation to this discussion:

Thanks for the detailed thoughts @andrascz! Will try my best to reply them.

The same way you can define a PDB or HPA should allow to be able to define Ingress or CronJob resources. That would eliminate the need for the HTTP or CronJob triggers. For example if someone uses Istio, then VirtualService is the resource you need to expose a function to the outside world via HTTP. Kubeless should be agnostic about this imho. It is easier to let the user define the resource type needed for this than trying to support all the different ways a Function could be exposed via HTTP in different environments.

One of the goals of Kubeless is to abstract away the unneeded complexity of Kubernetes to deploy functions and apply the serverless paradigm to this ecosystem. That's why we split between triggers and functions. Apart from that, we take an opinionated approach so that, for example, the user doesn't need to know how to define a Kubernetes CronJob and just needs to create a CronJobTrigger giving a crontab.
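For reference, the opinionated trigger looks roughly like this. Field names follow the Kubeless v1beta1 API as I recall it; double-check them against your Kubeless version:

```yaml
# A CronJobTrigger: the user supplies only a crontab and the target
# function; Kubeless generates the underlying Kubernetes CronJob.
apiVersion: kubeless.io/v1beta1
kind: CronJobTrigger
metadata:
  name: foo-every-ten-minutes
spec:
  function-name: foo
  schedule: "*/10 * * * *"
```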

Having said that, I am okay giving the user the opportunity to extend to whatever is not covered by the current triggers: e.g. Istio VirtualService.

I generally agree with your reasoning, so the only thing I would discuss is the implementation details. I still struggle to see how the Helm chart packaging would help extend the supported resources. As far as I can tell, you are suggesting that people provide a Helm chart and Kubeless would install it, but probably it's not exactly that.

In any case, we are talking about two different things.

  1. The Kubeless API. This is the function YAML (custom resource). What I would do here is allow arbitrary keys under the function spec. Let's say:

apiVersion: kubeless.io/v1beta1
kind: Function
metadata:
  name: foo
spec:
  ...
  virtualService:
    foo: bar

  2. The controller. This is the actor that receives API requests and acts on them. Typically, this is a Kubernetes controller in the k8s world. For the example above, I would have a controller that listens for kubeless.io/v1beta1/functions and checks the key virtualService, then acts in consequence; it may generate one or more resources based on that info.

This is as extensible as people want, but if you want to add a new feature (e.g. backupPlan) you would need to create a controller for it. The good thing about this approach is that you can still hide complexity from the user and define your own API.

If you just want to have a generic controller that installs whatever is in the function spec, you can do that as well; in this case, the YAML would look something like:

apiVersion: kubeless.io/v1beta1
kind: Function
metadata:
  name: foo
spec:
  ...
  extraResources:
    - apiVersion: foo.bar/v1
      kind: Foo
      metadata:
      ...
    - apiVersion: foo.bar/v1
      kind: Bar
      metadata:
      ...

And then have the controller go over these extraResources and simply install them in the cluster (even though the benefit of doing that vs. creating the resources on your own is minimal). Also, this would install resources in the cluster on behalf of the user, so it could easily lead to a privilege-escalation security issue. I would much rather implement the first suggestion, in which you control what the user can install.

Hope this clarifies how I see this feature working.